

# BIT-SLICE MICROPROCESSOR DESIGN 

John Mick<br>Engineering Manager Systems and Applications<br>Digital Bipolar Products<br>Advanced Micro Devices<br>James Brick<br>Manager of Systems and Applications<br>AM2900 Family<br>Advanced Micro Devices

## McGraw-Hill Book Company

New York, St. Louis, San Francisco, Auckland, Bogotá, Hamburg, Johannesburg, London, Madrid, Mexico, Montreal, New Delhi, Panama

Advanced Micro Devices reserves the right to make changes in its products without notice in order to improve design or performance characteristics. The authors and the company assume no responsibility for the use of any circuits described herein.

## Library of Congress Cataloging in Publication Data

Mick, John.
Bit-slice microprocessor design.
Includes index.

1. Bit slice microprocessors-Design and construction. I. Brick, James, joint author. II. Title.
TK7895.M5M44 621.3819'535 80-10610
ISBN 0-07-041781-4

Copyright © 1980 by McGraw-Hill, Inc. All rights reserved.
Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.


Note to the Reader: Advanced Micro Devices cannot assume responsibility for use of any circuitry described other than circuitry entirely embodied in an Advanced Micro Devices product.

## CONTENTS

- Preface
- Acknowledgments
- Chapter I: Computer Architecture
- Chapter II: Microprogrammed Design
- Chapter III: The Data Path
- Chapter IV: The Data Path, Part II
- Chapter V: Program Control Unit
- Chapter VI: Interrupt
- Chapter VII: Direct Memory Access
- Chapter VIII: HEX-29
- Chapter IX: Super Sixteen
- Index

## Preface

New integrated circuits are usually accompanied by a wealth of theory and data sheets. Shortly thereafter follow the application notes. The introduction of microprogrammable LSI parts, such as the Am2901 and subsequent ICs in the family, adhered to this pattern. We thought this was adequate in light of the previously successful introduction of fixed-instruction-set MOS microprocessors, which were more complex.

However, bit-slice microprocessor design proved more formidable than first realized. One reason was the intimate relationship between parts. These designs required the designer to pick and choose parts: How many slices are needed to do the job? Which microprogram sequencer and/or controller to select? Is a carry lookahead generator needed? And on, and on, and. . . . All these devices had to play together; no single device was complete by itself.

For this added up-front design effort, the user got blazing speed and the utmost flexibility. The latter proved the second hindrance to easy designing. Users now had to design the instruction set as well as the hardware and applications programs. They no longer had the luxury of a fixed instruction set. On the other hand, they could eliminate unnecessary instructions, easily modify or add instructions at a later date, or emulate the existing instruction set of a slower CPU.

Complicating matters was the fact that the 2900 family did not spring whole into the world. Parts were introduced and redesigned over a period of years as engineering and processing resources could be brought to bear. This evolutionary process still goes on.

To alleviate matters, Advanced Micro Devices announced a nine-part course in microprogrammable microprocessing, each part to stand alone but to build logically upon the preceding part. And, because engineering talent is our most important resource, this course was to unfold over a 22-month period.

Since completion of the course, there has been no diminishing in demand for information on the material covered. In fact, the market for bipolar microprogrammable LSI parts doubled in each of the previous two years and showed no signs of slowing. So, as our copies of individual course materials dwindled, we thought it only natural to bring them all together under one cover. This book is the result.

We think the extraordinary time and effort was well worth it.

## Acknowledgments

The authors wish to thank members of Advanced Micro Devices' bipolar applications department for their contributions to various chapters in this book. In particular we would like to thank Steve Cheng, Vernon Coleman, Mike Economidis, Jerry Gray, Jack Hong, Mike Miller, Warren Miller, Bob Schopmeyer, and Moshe Shavit.

We would also like to thank Mike Simmons and Lee McDonald of Monterey, CA, for allowing us to use their HEX-29 microprogrammable microcomputer in Chapter VIII.




## Chapter I <br> Computer Architecture

## PREFACE

In this introductory Chapter we intend to:
1). develop a common terminology for future chapters.
2). introduce several stored-program-computer design topics.
3). define some of the computer architect's problems (which will be solved in the subsequent chapters).
In order to achieve these goals, we will start with computer basics. It should be stressed that approaches and solutions can be chosen which are different from the ones described in this and the subsequent chapters. However, the general ideas described will be appropriate to gain familiarity with the microprogrammable bit-slice devices in order to use them in any design configuration.

## BACK TO THE BASICS. . .

A STORED-PROGRAM-COMPUTER is defined as a machine capable of manipulating data according to predefined rules (instructions), where the program (collection of instructions) and data are stored in its memory (Fig. 1). Without some means of communication with the external world, the program and the data cannot be loaded into the memory nor can the results be read out. Therefore, an input/output device is required as shown in Fig. 2.


Figure 1. Basic Definition of a Stored-Program-Computer.


Figure 2. I/O Added to the Basic Stored-Program Computer.

The memory is usually organized in words, each containing $N$ bits of information. A unique address is allocated for each word which defines its position relative to other words. The Central Processor Unit (CPU) usually reads or writes one word at a time by addressing the memory and then, when the memory is ready, reading the contents of the word or writing new contents into that word. To perform this operation, two registers are usually used: the Memory Address Register (MAR), which contains the address, and the Memory Data Register (MDR), which contains the data (Fig. 3).
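The read and write sequence just described can be sketched in a few lines of Python. This is only an illustrative model: the class name `Memory`, its method names, and the 4-bit address and 8-bit word sizes in the example are assumptions made for the sketch, not part of any Am2900 device.

```python
class Memory:
    """Word-organized memory that is accessed only through the MAR and MDR."""

    def __init__(self, address_bits, word_bits):
        self.word_mask = (1 << word_bits) - 1    # words are word_bits wide
        self.words = [0] * (1 << address_bits)   # one word per address
        self.mar = 0  # Memory Address Register
        self.mdr = 0  # Memory Data Register

    def read(self):
        """Copy the word addressed by the MAR into the MDR."""
        self.mdr = self.words[self.mar]
        return self.mdr

    def write(self):
        """Copy the MDR, truncated to the word width, into the addressed word."""
        self.words[self.mar] = self.mdr & self.word_mask


mem = Memory(address_bits=4, word_bits=8)
mem.mar, mem.mdr = 3, 0x1A5   # 0x1A5 does not fit in an 8-bit word
mem.write()
mem.mdr = 0                   # clear the MDR to show that read() reloads it
print(hex(mem.read()))        # 0xa5
```

The CPU never touches `words` directly in this model; every access goes through the two registers, which is the point of the figure.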


Figure 3. MAR and MDR Depicted for a Stored-Program Computer.

Since accessing a memory (reading from it or writing into it) is usually a relatively slow procedure, it is advantageous to have a few memory locations inside the CPU which can be read from or written into very fast. These locations are usually called Accumulators or Working Registers. Having these fast access registers inside the CPU (Fig. 4) enables many operations to be carried out without referring to the memory (through the MAR and the MDR) and therefore these operations are executed faster.

The unit which actually performs the data manipulation is called the Arithmetic \& Logic Unit (ALU). It has two inputs for operands and one output for the result. It usually operates on all the bits of a word in parallel. The ALU can perform all or part of the following operations:

| Arithmetic | Logical |
| :--- | :--- |
| Add | OR |
| Complement | AND |
| Subtract | XOR |
| Increment | NAND |
| Decrement | NOR |
|  | XNOR |
|  | Complement |

In some architectures, one of the operands must always be in a special register (accumulator) and the result of the ALU operation is always transferred to this register. In a more general CPU, any two of the internal registers can contain the operands and the result of the ALU operation can be transferred to any one of them.

Another very useful feature of a CPU is the ability to shift the contents of a register or the output of the ALU one or more bits in either direction as shown in Fig. 5.
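As a rough illustration of the ALU function table and the shifter, the following Python sketch models both for a 16-bit word. The word width, the function names used as dictionary keys, and the convention that a positive shift count means "left" are assumptions chosen for the example.

```python
WORD_BITS = 16                      # assumed word width
MASK = (1 << WORD_BITS) - 1

# Each function takes two operands; single-operand functions ignore b.
ALU_FUNCTIONS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "INC": lambda a, b: a + 1,
    "DEC": lambda a, b: a - 1,
    "COMPLEMENT": lambda a, b: ~a,
    "OR": lambda a, b: a | b,
    "AND": lambda a, b: a & b,
    "XOR": lambda a, b: a ^ b,
    "NAND": lambda a, b: ~(a & b),
    "NOR": lambda a, b: ~(a | b),
    "XNOR": lambda a, b: ~(a ^ b),
}


def alu(function, a, b=0):
    """Return (result, carry) for one operation on WORD_BITS-wide operands.

    The carry flag is only meaningful for ADD and INC in this sketch.
    """
    raw = ALU_FUNCTIONS[function](a & MASK, b & MASK)
    carry = 1 if raw > MASK else 0
    return raw & MASK, carry


def shift(value, places):
    """Shift left for positive places, right for negative, filling with zeros."""
    if places >= 0:
        return (value << places) & MASK
    return (value & MASK) >> -places


print(alu("ADD", 0xFFFF, 1))             # (0, 1)
print(hex(alu("NAND", 0xFFFF, 0x00FF)[0]))  # 0xff00
print(hex(shift(0x8001, -1)))            # 0x4000
```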


Figure 4. CPU with Internal High Speed Registers.


Figure 5. ALU and Shifter Added to the CPU Design.

We now have the elements to do any data manipulation required but we still need a unit which can properly set the MAR in order to find the next instruction of the program in the memory and to find its associated data. This unit is called the Program Control Unit (PCU) and its role is to load the MAR with the correct address in order to find the next instruction or data item or to point to a memory location where a data word should be written.
Often, the program steps (instructions, data) are written in the memory in consecutive locations, starting at address zero or at any other predefined address. The PCU can simply be incremented after each memory access thereby pointing to the address of the next instruction or data item. This counter-type PCU has very little flexibility. Sometimes we wish to change the "normal" flow of the instructions, particularly if we want to enable our computer to "make decisions" according to conditions prevailing at the current execution point. For example, we may want to execute one of two different sequences of instructions depending upon the result of the last operation performed. This is accomplished by loading the MAR with a new value (the address of the next instruction to be executed) rather than incrementing it. This operation is called a BRANCH or JUMP and can be unconditional (which allows execution of a non-contiguous string of instructions) or conditional (depending, for example, on whether the last operation's result was zero or not, was negative or positive, true or false, etc.).

Even more flexibility can be achieved by using a stack (a group of temporary internal or external memory locations) to store vital data. A stack pointer is used to address the memory location currently at the top of the stack. Indirect and relative addressing and other sophisticated addressing modes (all of which can be handled by the PCU) will be discussed later. Meanwhile, Fig. 5 shows the PCU as a part of the CPU.
Executing an instruction in our computer now requires the following steps:
a). The PCU loads the address of the next instruction into the MAR and signals to the memory that a Read is requested. Incidentally, the PCU may be as simple as a Program Counter whose width equals the address width. The memory loads the MDR with the contents of the location addressed.
b). The CPU decodes the instruction: i.e., (assuming operands are in internal registers) selects the proper registers to feed the ALU, selects the proper function to be performed by the ALU, sets up the shifter to displace the result, if required, and selects the register in which the result should be stored.
c). The ALU performs the function desired.
d). The result is loaded into the destination register.
e). The result is also examined to determine whether a BRANCH is to be performed.
f). The PCU calculates the address of the next instruction (usually called a "FETCH").
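Steps a) through f) can be followed in a small Python sketch. The instruction format used here, a tuple `(op, source_a, source_b, destination, branch_target)`, is purely hypothetical, as are the `HALT` operation and the "branch if the result is non-zero" condition; they are chosen only to make the cycle visible.

```python
def run(program, registers, max_steps=100):
    """Execute register-to-register instructions and return the registers."""
    pc = 0  # the PCU, modelled here as a plain program counter
    for _ in range(max_steps):
        mar = pc                                # a) PCU loads the MAR
        mdr = program[mar]                      #    memory loads the MDR
        op, src_a, src_b, dest, target = mdr    # b) decode the instruction
        if op == "HALT":
            break
        a, b = registers[src_a], registers[src_b]
        # c) the ALU performs the function (anything but ADD is treated as SUB)
        result = (a + b if op == "ADD" else a - b) & 0xFFFF
        registers[dest] = result                # d) load the destination
        # e) examine the result: branch if a target is given and result != 0
        take_branch = target is not None and result != 0
        pc = target if take_branch else pc + 1  # f) address of next instruction
    return registers


# Add R3 into R2 three times, using R0 as a loop counter and R1 as constant 1.
program = [
    ("ADD", 2, 3, 2, None),   # R2 = R2 + R3
    ("SUB", 0, 1, 0, 0),      # R0 = R0 - R1; branch to address 0 if not zero
    ("HALT", 0, 0, 0, None),
]
final = run(program, [3, 1, 0, 5])
print(final)  # [0, 1, 15, 5]
```

The conditional branch in the second instruction is what lets a three-word program perform the addition three times.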

This procedure becomes more complicated if the operands are not stored in the internal registers or if the result is not to be stored in one of them. Let's take an example instruction using relative addressing:
"Take the first operand from the location specified by the sum of the word after this instruction (immediate) and the contents of register R1; take the second operand from the location specified by the sum of the second word after this instruction and the contents of R2; add the two operands and place the result in the location specified by the sum of the third word after this instruction and the contents of register R3. Then execute the instruction located at the address, which is the sum of the fourth word after this instruction and the contents of register R4 if there is a carry resulting from the addition. Otherwise continue sequentially".

The steps required to execute this instruction are as follows:
a). The PCU loads the address of the next instruction to the MAR, signalling to the memory that a Read is requested. The memory loads the MDR with the contents of the location addressed.
b). The CPU decodes the instruction, i.e., initiates the following steps.
c). The PCU is incremented and the next word is read from the memory.
d). Register R1 and the MDR are selected as source registers, MAR is the destination register.
e). The ALU performs "ADD" and the result is placed in the MAR.
f). The first operand is fetched from the memory and placed, for example, in R5.
g). The PCU is incremented and the next word is read from the memory.
h). Register R2 and the MDR are selected again as source registers and MAR as the destination.
i). The ALU performs "ADD" and the result is placed in MAR.
j). The second operand is fetched from the memory and is placed, for example, in R6.
k). The PCU is incremented, the next word is read from the memory.
l). Register R3 and the MDR are selected as source registers, the MAR as destination.
m). The ALU performs "ADD" and the result is placed in the MAR, which now points to the location where the sum of the operands should be stored.
n). Registers R5 and R6 are selected as sources (they contain the operands), MDR is now the destination.
o). The ALU performs "ADD" and the result is placed in MDR.
p). A memory write cycle takes place and the contents of the MDR is stored at the desired address.
q). The carry is examined to determine the next step to be performed. Assume there is no carry.
r). The PCU is incremented twice (in order to skip the fifth word of the present instruction). It now points to the address of the next instruction.
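The sequence a) through r) can be traced with a short Python sketch that records one label per step. The 16-bit word width, the op code value `0x7000`, the memory layout, and the function name are assumptions for illustration. The branch path taken when a carry occurs is not enumerated in the steps above, so its handling here is only a guess at a reasonable continuation.

```python
WORD_MASK = 0xFFFF  # assumed 16-bit words


def execute_relative_add(memory, regs, pc):
    """Run the instruction whose first word is at memory[pc].

    Returns (next_pc, trace), where trace holds one label per step.
    """
    trace = ["a: fetch first word", "b: decode"]

    # Steps c-f and g-j: form each operand address, then fetch the operand.
    for base, temp, labels in (("R1", "R5", "cdef"), ("R2", "R6", "ghij")):
        pc += 1
        mdr = memory[pc]
        trace.append(labels[0] + ": read displacement")
        trace.append(labels[1] + ": select " + base + " and MDR")
        mar = (regs[base] + mdr) & WORD_MASK
        trace.append(labels[2] + ": ADD, result to MAR")
        regs[temp] = memory[mar]
        trace.append(labels[3] + ": fetch operand into " + temp)

    # Steps k-m: form the address where the sum is to be stored.
    pc += 1
    mdr = memory[pc]
    trace.append("k: read displacement")
    trace.append("l: select R3 and MDR")
    mar = (regs["R3"] + mdr) & WORD_MASK
    trace.append("m: ADD, result to MAR")

    # Steps n-p: add the operands and write the sum to memory.
    trace.append("n: select R5 and R6")
    total = regs["R5"] + regs["R6"]
    mdr = total & WORD_MASK
    trace.append("o: ADD, result to MDR")
    memory[mar] = mdr
    trace.append("p: memory write")

    # Step q: examine the carry to choose the next instruction.
    trace.append("q: examine carry")
    if total > WORD_MASK:
        # Assumed branch path: R4 plus the fourth word gives the new address.
        pc = (regs["R4"] + memory[pc + 1]) & WORD_MASK
        trace.append("branch: R4 plus fourth word to PCU")
    else:
        pc += 2
        trace.append("r: increment PCU twice")
    return pc, trace


memory = [0] * 64
memory[10:15] = [0x7000, 2, 3, 4, 5]   # op code word plus four displacements
memory[22], memory[33] = 100, 23       # the two operands
regs = {"R1": 20, "R2": 30, "R3": 40, "R4": 50, "R5": 0, "R6": 0}
next_pc, trace = execute_relative_add(memory, regs, pc=10)
print(next_pc, memory[44], len(trace))  # 15 123 18
```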
As can be seen, 18 steps were used to perform a single addition using this complex relative addressing scheme. Obviously, our CPU needs some kind of "coordinator" which can:
1). Decode an instruction fetched from the memory.
2). Initiate the proper cycle of steps to be performed.
3). Set up the various controls for each step.
4). Execute the steps in an orderly sequence.
5). Make decisions according to the state of various signals (conditions).

We will call this coordinator the Computer Control Unit (CCU) and it is depicted in Fig. 6. Our CPU is now complete (more or less) and we will go into more detail later.

## THE MEMORY

Let's now discuss the memory. The information stored in the memory is organized in words, where each word consists of $N$ bits. $N$ may be as small as 8 for very simple processors or as large as 64 in more powerful machines. The most common memory width for minicomputers is 16 bits. The number $N$ is called the width of the memory, and the number of bits in the MDR is obviously also $N$, equal to the width of the memory.

The depth of a memory is the number of words it contains. With a MAR having $k$ bits, $2^{k}$ consecutive memory locations can be addressed. The addresses start from zero and range through $2^{k}-1$.

The read access time of a memory directly accessible by the CPU is the time needed from stable address at the memory until the data is properly stored in the MDR. This access time depends on the type of memory used and can be as low as a few tens of nanoseconds and as large as several microseconds. Using high speed memory improves the performance of the computer as less time is wasted waiting for the memory to respond. In general, faster memories are costly, take more PC board area and use more power, which results in more heat. A 32-bit-wide, 2K (2048) word memory with 50 nanosecond access time may need 10 amps from the +5 V power supply and may require a board area of 10" × 6". Yet this is a very small memory space.

It is usually not justified to have very large high-speed memories. Not all the programs and associated data need to reside in this memory at once. We may have the current program (or only a part of it) in the memory while other programs or data files can reside elsewhere and be brought into memory during the appropriate part of the program when needed.


Figure 6. A Computer Control Unit (CCU) Included in a CPU.

This "elsewhere" may be a magnetic tape, cassette, disk, diskette, etc. and we will call it Bulk Memory. The distinctive characteristics of Bulk Memory are:
1). very large capacity
2). non-volatile (retains the information when not in use)
3). not randomly accessible
4). long access time
5). inexpensive (per bit)

Usually, Bulk Memory devices are serially accessible, i.e., the access time for the first word is large, but then consecutive words can be accessed relatively fast.
In a later chapter, Direct Memory Access (DMA), the most efficient process of communication between the main and the bulk memory, will be discussed in detail.

## THE EXTERNAL WORLD

In any useful machine, some means of communicating with the external world is needed. It may be a keyboard, a CRT, a card reader, a paper tape punch or, in a process controller, reading sensors or positioning actuators. The common denominator of almost all of the input/output devices is that they are much slower than the CPU and therefore a timing problem arises; the CPU must know when the I/O device is ready for data transfer. Usually, a signal is sent by the device to the CPU in order to draw its attention. The CPU now can do one of two things:
1). Test this signal periodically and when it is present, jump to a program which handles the data transfer. This type of operation is called "Polling". This technique has two major drawbacks: First, appreciable computer time is spent performing these periodic tests where most of them will fail (no "Ready" signal present). Second, the recognition by the computer CPU of the appearance of a signal is delayed until the CPU arrives at this device in its polling sequence.
Imagine what will happen if there are a large number of I/O devices. Long latency times (delays) will occur if many I/O devices are busy simultaneously.
2). Include some hardware in the CPU which can sense the presence of a "Ready" signal and interrupt the normal flow of the instructions and force the computer to "Jump" to the I/O service program whenever there is a request. It can even send the CPU to different programs according to the I/O device whose "Ready" flag was detected and even establish priority among the different devices if more than one device would like to have the CPU's attention at the same time. Moreover, under program control, this circuitry can ignore some or all of the signals if the computer CPU must not be interrupted at that time. Obviously by paying the price of very little hardware, we gain enormously in computer performance. We will call this hardware the "Interrupt Controller" and will discuss it thoroughly later.
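The behavior expected of such interrupt hardware (latching requests, masking under program control, and resolving priority) can be sketched as follows. The class and method names are hypothetical, and the rule that a lower device number means higher priority is an assumption made for the example.

```python
class InterruptController:
    """Latches "Ready" requests and hands the CPU one device at a time."""

    def __init__(self, device_count):
        self.pending = [False] * device_count
        self.enabled = [True] * device_count   # mask set under program control

    def request(self, device):
        """Called when a device raises its "Ready" signal."""
        self.pending[device] = True

    def set_enabled(self, device, enabled):
        """Allow or ignore interrupts from one device."""
        self.enabled[device] = enabled

    def acknowledge(self):
        """Return the highest-priority enabled pending device, or None.

        Lower device numbers are assumed to have higher priority.
        """
        for device, is_pending in enumerate(self.pending):
            if is_pending and self.enabled[device]:
                self.pending[device] = False
                return device
        return None


controller = InterruptController(device_count=4)
controller.set_enabled(2, False)      # the program masks device 2
for device in (3, 2, 1):
    controller.request(device)
served = [controller.acknowledge() for _ in range(3)]
print(served)                         # [1, 3, None]
controller.set_enabled(2, True)       # the masked request is still latched
print(controller.acknowledge())       # 2
```

In contrast with polling, the CPU calls `acknowledge` only when a request is actually present, so no time is spent testing idle devices.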
Our computer is now depicted in Fig. 7. We have included the ALU, the internal register file and the shift circuit in one block, which we call the "Arithmetic Processor Unit."

In the following pages and in the subsequent chapters, we will deal in more detail with each area of the machine.


Figure 7. The Stored-Program-Computer with DMA and Interrupt Control Added.

## A WORD ABOUT THE INSTRUCTION SET

The internal architecture of the CPU depends to some extent on the instruction set the computer is to execute. If the instruction set is large, some of the instructions usually are more complicated and the computer is more powerful, faster and more efficient. On the other hand, the internal circuitry is also more complicated. Some examples of these tradeoffs are as follows.

## ALU Processing Capability:

Although with three basic functions (add, complement, and OR/AND) all the arithmetic and logic operations can be performed, most processors are built to perform subtract, NAND, XOR, etc. This is perhaps the most outstanding example of how performance and speed can be gained with little penalty on the complexity of the machine. With the added features an XOR operation can be performed in one instruction instead of 5.
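One possible five-operation decomposition of XOR is sketched below in Python. It assumes that both AND and OR are available alongside complement; if only one of the two were provided, more operations would be needed. The 16-bit word width and the function names are likewise assumptions.

```python
MASK = 0xFFFF  # assumed 16-bit words


def complement(a):
    return ~a & MASK


def and_(a, b):
    return a & b & MASK


def or_(a, b):
    return (a | b) & MASK


def xor_from_primitives(a, b):
    """Build XOR as (a AND NOT b) OR (NOT a AND b) in five primitive steps."""
    not_b = complement(b)       # operation 1
    left = and_(a, not_b)       # operation 2
    not_a = complement(a)       # operation 3
    right = and_(not_a, b)      # operation 4
    return or_(left, right)     # operation 5


print(hex(xor_from_primitives(0x0FF0, 0x00FF)))  # 0xf0f
```

An ALU with a built-in XOR function replaces all five steps, and the temporary registers they occupy, with a single operation.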

## Data Movement:

Let us assume 4 different computers whose data movement capabilities are described below:

Machine A). A word can be read from the memory and loaded into Register A only. The contents of Register A can be written into the memory, or can be moved into any other register. The contents of any register can be copied into Register A.
Machine B). The contents of any register can be copied into any other register or it can be written into the memory. A word read from the memory can be loaded into any register.
Machine C). As B above but with the added capability to read from one location in memory, to write that word into another location in memory.
Machine D). As C above and also the memory-to-memory operation can be performed on consecutive addresses repetitively. The number of word transfers (or upper and lower address limits) are specified by the instruction.

Machine A has very limited data movement capability. In order to perform an operation on two operands residing in the memory, we have to:
1). Bring the first operand from the memory into Register A.
2). Copy it into another register.
3). Bring the second operand into Register A.
4). Perform the operation required (result in A).
5). Store the contents of Register A into the memory.

If consecutive operations are required with several partial results, the drawbacks of machine A become more annoying, especially if the number of internal registers is small.

Moving a data block from one location in the memory to another location can be performed by one instruction in computer D, but requires the transfer of each word first to an internal register then to the new memory location in machines A and B (two instructions for each word transferred).

Obviously the decoding, multiplexing and sequencing of the computers grow in complexity as we proceed from machine A to machine D. We trade the complexity of hardware versus the software (programming), speed and performance.
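The software side of this trade can be put into numbers with a small sketch that counts the instructions needed for a block move on each machine. The counts for machines A, B and D follow the text; the count for machine C (one memory-to-memory instruction per word) is inferred from its description rather than stated, and the function name is hypothetical.

```python
def block_move_instruction_count(machine, words):
    """Instructions needed to move `words` words from memory to memory."""
    if machine in ("A", "B"):
        return 2 * words    # load into a register, then store, for each word
    if machine == "C":
        return words        # inferred: one memory-to-memory move per word
    if machine == "D":
        return 1            # one repetitive memory-to-memory instruction
    raise ValueError("unknown machine: " + machine)


counts = {m: block_move_instruction_count(m, 256) for m in "ABCD"}
print(counts)  # {'A': 512, 'B': 512, 'C': 256, 'D': 1}
```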

## Addressing:

The operands for an operation can be found in several ways:

- The operand is an explicit part of the instruction (Immediate)
- The address of the operand is an explicit part of the instruction. (Direct)
- The address of the operand is in an internal register; the register itself is specified by the instruction. (RR)
- The address of the operand is the sum of the contents of an internal register (specified by the instruction) and a number (called the displacement) which is an explicit part of the instruction. (RX)
- The contents of an internal register are added to a number found in an address specified by the instruction. The sum is the address of the operand. (Indirect)
- The contents of an internal register are added to a number which is an explicit part of the instruction. The sum points to the location where the address of the operand is written. (Indirect)
- The contents of an internal register are added to a number which can be found at the location explicitly specified by the instruction. The sum thus formed points to a location where the address of the operand is written.
- Etc.

Many other schemes can be formed by combining the above operations or by chaining them. In every case an "Effective Address" must be found by calculations and/or memory references. Again, we can gain performance by using more sophisticated addressing schemes but we will pay for it by adding complexity to our machine, especially in its control portion.
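The effective-address calculations listed above can be compared side by side in a short Python sketch. The mode names are invented for the example (the text labels two different schemes "Indirect"), and addresses are not wrapped to any particular width. Immediate operands are omitted because they need no address at all.

```python
def effective_address(mode, memory, regs, reg=None, value=None):
    """Return the operand address for one addressing mode.

    `value` is the number carried explicitly in the instruction and
    `reg` names the internal register the instruction specifies.
    """
    if mode == "direct":                 # address is in the instruction
        return value
    if mode == "register":               # RR: address is in a register
        return regs[reg]
    if mode == "indexed":                # RX: register plus displacement
        return regs[reg] + value
    if mode == "indirect_then_index":    # register plus number found in memory
        return regs[reg] + memory[value]
    if mode == "index_then_indirect":    # sum points at the operand address
        return memory[regs[reg] + value]
    if mode == "double_indirect":        # both of the above combined
        return memory[regs[reg] + memory[value]]
    raise ValueError("unknown mode: " + mode)


memory = [0] * 32
memory[4] = 10     # number stored at address 4
memory[12] = 25    # operand address stored at address 12
regs = {"R1": 2}

print(effective_address("indexed", memory, regs, reg="R1", value=10))             # 12
print(effective_address("indirect_then_index", memory, regs, reg="R1", value=4))  # 12
print(effective_address("index_then_indirect", memory, regs, reg="R1", value=10)) # 25
print(effective_address("double_indirect", memory, regs, reg="R1", value=4))      # 25
```

Each additional memory reference in a mode costs one more memory cycle, which is the price paid for the added flexibility.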

## TIMING, SEQUENCING, CONTROLLING

In the previous paragraphs we have shown that we can gain performance in our computer by having a more complicated instruction set but more complex hardware is required, usually in the CCU. We have also shown an example for an "Add" operation which required 18 precisely controlled steps. Even if we assume that some of them can be performed simultaneously, we will need a multiphase clock to control these steps - something like that shown in Fig. 8. We can now load an instruction register at the beginning of an instruction with the first word of the instruction (the OP CODE) as is shown in Fig. 9. Using the outputs of the Instruction Register ($\mathrm{IR}_{0}$ to $\mathrm{IR}_{n-1}$), the different phases of the clock and the various condition inputs to the CCU, we can now try to write the logical equations which should satisfy all of the steps of all the instructions of our instruction set. We would then use Karnaugh maps or other techniques to reduce these equations and finally realize them using AND, OR, INVERT gates and flip-flops. Simple, isn't it? Imagine the complexity of a sophisticated computer and the debugging process it needs!
The question posed immediately is: Isn't there a more organized and more easily understandable way to do that? Or, perhaps, can we have some processor do the job for us? Can't we have some kind of "micro-machine" which can take care of all the timing, sequencing and controlling jobs of our computer - a computer inside the computer? With the advent of the Am2900 family - new Bipolar LSI devices - the answer is: Yes, we can!


Figure 8. An 8-Phase Clock.


Figure 9. The Instruction Register Bits.


Figure 10. The Micromachine.

## THE MICRO-MACHINE

What we need is essentially a machine which can execute a number of well defined sequences. But, remember that this is exactly the purpose of a stored program computer. The only difference between our micro-machine and a general purpose computer is that in the general purpose computer the program to be executed is changed from task to task, while in our micro-machine it is fixed. This allows the use of PROM for its memory instead of the RAM needed in the general purpose (GP) computer. Our Computer Control Unit (CCU) using this micro-machine may now look like Figure 10.

Basically, a microprogrammed machine is one in which a coherent sequence of microinstructions is used to execute various commands required by the machine. If the machine is a computer, each sequence of microinstructions can be made to execute a machine instruction. All of the little elemental tasks performed by the machine in executing the machine instruction are called microinstructions. The storage area for these microinstructions is usually called the microprogram memory.

A microinstruction usually has two primary parts. These are: (1) the definition and control of all elemental micro-operations to be carried out and (2) the definition and control of the address of the next microinstruction to be executed.
The definition of the various micro-operations to be carried out usually includes such things as ALU source operand selection, ALU function, ALU destination, carry control, shift control, interrupt control, data-in and data-out control, and so forth. The definition of the next microinstruction function usually includes identifying the source selection of the next microinstruction address and, in some cases, supplying the actual value of that microinstruction address.

Microprogrammed machines are usually distinguished from non-microprogrammed machines in the following manner. Older, non-microprogrammed machines implemented the control function by using combinations of gates and flip-flops connected in a somewhat random fashion in order to generate the required timing and control signals for the machine. Microprogrammed machines, on the other hand, are normally considered highly ordered and more organized with regard to the control function field. In its simplest definition, a microprogram control unit consists of the microprogram memory and the structure required to determine the address of the next microinstruction.

The OP-CODE (type of instruction to be executed by the computer) is loaded into the Instruction Register and the Instruction Decoder decodes it. Actually, it generates the microaddress where the first step of the execution sequence for that instruction resides in the microprogram memory. The Am2910 sequencer then generates the microaddress of the next microinstruction. The microprogram data supplies the control signals we need to control all the parts of the computer (and there are a lot of them), including the sequencer itself. When all the steps of a machine instruction are executed, the microprogram will cause the reading (fetch) of the next machine instruction from the computer main memory. Typically, the Computer Control Unit is used to fetch instructions and decode them using a PROM for mapping the op code to the initial address of the sequence of microinstructions used to execute this particular instruction. It will also fetch all of the operands needed by the machine instruction and deliver them to the ALU for processing. An example of the flow of a typical Computer Control Unit is shown in Figure 11.

Figure 11. Computer Control Function Flow Diagram.
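The flow from op code to microinstruction sequence can be illustrated with a toy model in Python. Everything here is hypothetical: the op code values, the mapping PROM contents, the three next-address selections (`CONTINUE`, `MAP`, `JUMP`) and the control descriptions are invented for the sketch and do not represent the Am2910 instruction set.

```python
# Mapping PROM: op code -> microaddress of the first execution step.
MAP_PROM = {0x01: 2, 0x02: 4}

# Microprogram memory: (control signals, next-address select, branch address).
MICROPROGRAM = [
    ("fetch: read instruction, increment PC", "CONTINUE", None),  # 0
    ("decode: load instruction register", "MAP", None),           # 1
    ("ADD: operands to ALU", "CONTINUE", None),                   # 2
    ("ADD: ALU result to destination", "JUMP", 0),                # 3
    ("LOAD: address to MAR", "CONTINUE", None),                   # 4
    ("LOAD: read memory", "CONTINUE", None),                      # 5
    ("LOAD: MDR to destination", "JUMP", 0),                      # 6
]


def run_instruction(opcode):
    """Return the control words issued from one fetch up to the next fetch."""
    address = 0
    issued = []
    while True:
        control, select, target = MICROPROGRAM[address]
        issued.append(control)
        if select == "CONTINUE":
            address += 1                  # next sequential microinstruction
        elif select == "MAP":
            address = MAP_PROM[opcode]    # start of this instruction's routine
        else:                             # "JUMP"
            address = target
        if address == 0:                  # back at the fetch microinstruction
            return issued


for step in run_instruction(0x02):
    print(step)
```

Note that the fetch and decode microinstructions at addresses 0 and 1 are shared by every machine instruction; only the routine selected by the mapping PROM differs.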

Assume the OP-CODE of the machine instruction that we fetch is 8 bits wide. This allows us to execute up to 256 different instructions. Assume also that an average of 6 steps are needed to execute these instructions. Even if separate microprogram memory locations are used for every step of every instruction, the depth of this microprogram memory is only 1-1/2 K (K = 1024). But in that case, the sequencer can almost be replaced by a simple counter. Usually we would like to share some micro-routines among different instructions. With very little effort, we can shrink the depth of the microprogram memory of Figure 10 to less than 1/2 K. Of course the sequencer will be a little more sophisticated; it will perform conditional Branch and microsubroutine CALLs; but we still don't need the complicated addressing schemes for microprogram control as were described earlier as a part of the machine instruction set.
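As a check on the unshared case, using only the assumptions stated above (an 8-bit op code and an average of 6 microinstructions per machine instruction):

$$
2^{8} \times 6 = 256 \times 6 = 1536 = 1.5 \times 1024
$$

that is, a microprogram memory depth of 1-1/2 K.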

On the other hand, the width of our microprogram memory may be large - maybe 60 to 100 bits. This will depend on the number of control lines needed in our computer. This is of no great disadvantage since the price of PROM devices is dropping constantly. In a future chapter we will discuss techniques to reduce the depth and width of the microprogram memory to save cost.

It is important to understand the distinction between machine level instructions and microprogram instructions. Figure 12 shows a typical machine instruction for a 16-bit minicomputer that has an 8-bit opcode to identify one of 256 instructions; a 4-bit source register specification to identify one of 16 source registers and a 4-bit destination register specification to identify one of 16 destination registers. The microprogram instruction of Figure 12 may contain from 32 to 128 bits in a typical design; or even more bits in a very fast, highly parallel microcoded machine. This microinstruction word usually will contain fields for the ALU source operand, ALU function, ALU destination, status load enable, shift multiplexer control, bus cycle control, etc. These fields are used to control the various devices within the machine so that its execution is as desired on each clock cycle. This is more straightforward than using combinatorial logic and yields a more organized design.

Figure 12. The machine instruction is 16 bits and consists of an op code, source register and destination register specification. The microprogram instruction defines all the elemental signals to control the various pieces of the machine.

Let us now compare the depth-over-width ( $\mathrm{d} / \mathrm{w}$ ) ratio of the computer's main memory to that of our microprogram memory.

In the Am9080A type microprocessor, the data field is 8 bits and the address field is 16 bits, allowing direct addressing of 64 K locations. The ratio $\mathrm{d} / \mathrm{w}$ is 8 K . In some minicomputers, the data width is $16-32$ bits and the addressing capability is $64-128 \mathrm{~K}$. The $\mathrm{d} / \mathrm{w}$ ratio is about the same. In larger computers with $32-64$ bit data width, we find $256-512 \mathrm{~K}$ deep memories or even deeper ones. The $\mathrm{d} / \mathrm{w}$ ratio again is 8 K at least.

On the other hand, the $\mathrm{d} / \mathrm{w}$ ratio in microprogram memories is seldom greater than a few tens. Even if we assume that it is 2K deep and only 64 bits wide, we arrive at a $\mathrm{d} / \mathrm{w}$ ratio of only 32; usually it will be around 10. It is much easier to control a machine with a $\mathrm{d} / \mathrm{w}$ ratio of 10 to 20 than to control one with $\mathrm{d} / \mathrm{w}=8 \mathrm{K}$.
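The ratios quoted above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the function name `depth_over_width` is ours, and we assume that "K" means 1024 words.

```python
K = 1024  # assumption: "K" in the text means 1024 words


def depth_over_width(depth_words: int, width_bits: int) -> float:
    """Return the depth-over-width (d/w) ratio of a memory."""
    return depth_words / width_bits


# Main-memory examples quoted in the text
main_8bit = depth_over_width(64 * K, 8)      # Am9080A type: 64K deep, 8 bits wide
main_32bit = depth_over_width(256 * K, 32)   # larger computer: 256K deep, 32 bits wide

# Microprogram-memory example quoted in the text
micro = depth_over_width(2 * K, 64)          # 2K deep, 64 bits wide

print(main_8bit / K, main_32bit / K, micro)  # prints: 8.0 8.0 32.0
```

Both main-memory cases come out at 8K, while the microprogram memory comes out at 32, which is the contrast the text draws.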

## ONE MORE WORD

We have suggested a replacement of the "random logic" realization of the CCU by a micro-machine. We call this a "Microprogrammed Architecture". Perhaps the biggest advantage of this type of architecture is the ease of structuring the control sequence. We allocate a bit or a group of bits in the microprogram memory to control a certain function (e.g., ALU source register selection, ALU function, ALU destination selection, condition selection, next address calculation selection, MDR destination selection, MAR source selection, etc.) and for each microstep we write the appropriate state for these bits (LOW or HIGH) into this memory field. Later we will see that automated and sophisticated tools are available to perform this microprogram writing. One such tool is AMDASM™, available on System/29. But this is not the only advantage of the microprogrammed architecture.

As nobody is perfect, some "bugs" may inadvertently slip into the design. In a random logic architecture, we will have to redesign and usually rebuild the whole computer. On the other hand, in a microprogrammed machine it is usually sufficient to change a couple of bits in the microprogram to rectify the problem. This is even easier if a RAM instead of a PROM is used during the development and debugging phases. Of course, we must be able to load this memory with the microprogram by some external means. Again, a powerful tool is available: AMD's System/29™.

Finally, let's face the reality: The marketing guys usually change their requirements (i.e., the instruction set) when you are $80 \%$ through your logic design. Now you have to start over from scratch. Not so! Change some microcode, perhaps very little hardware too and here you are! It is even more convenient when only additions to the existing instruction set are considered. Just add a few lines to your microprogram to comply with those new ideas! A mere few minutes using System 29 - That's flexibility! Incidentally, don't tell the marketing guys how easy it is or you will NEVER get the product out!!

## SUMMARY

The block diagram of Figure 13 shows a typical 16-bit minicomputer architecture. Also identified on this block diagram are various Am2900 family elements that might be used in each of these blocks. Such a design might use either 4-Am2901A's or 4-Am2903's for the data path ALU. An Am2910 could be used as the microprogram sequencer for control of up to 4K words of microprogram memory. Also shown on the block diagram are the Am9130 and Am9140 MOS Static RAM's which are potential candidates for use in the computer's main memory.

The following chapters will discuss various blocks of Figure 13 in detail and give design examples for each section. Needless to say, the design engineer can appropriately tailor any design to meet his throughput requirements. Also, special algorithms can be executed by adding the appropriate hardware and microcode to the blocks described.


Figure 13. A Generalized Computer Architecture.



## CHAPTER II <br> MICROPROGRAMMED DESIGN

## INTRODUCTION

A microprogrammed machine is one in which a coherent sequence of microinstructions is used to execute various commands required by the machine. If the machine is a computer, each sequence of microinstructions can be made to execute a machine instruction. All of the little elemental tasks performed by the machine in executing the machine instruction are called microinstructions. The storage area for these microinstructions is usually called the microprogram memory. This technique was identified by Wilkes in the 1950's as a structured approach to the random control logic in a computer.

A microinstruction usually has two primary parts. These are: (1) the definition and control of all elemental micro-operations to be carried out and (2) the definition and control of the address of the next microinstruction to be executed.

The definition of the various micro-operations to be carried out usually includes such things as ALU source operand selection, ALU function, ALU destination, carry control, shift control, interrupt control, data-in and data-out control and so forth. The definition of the next microinstruction function usually includes identifying the source selection of the next microinstruction address, and in some cases, supplying the actual value of that microinstruction address.

Microprogrammed machines are usually distinguished from non-microprogrammed machines in the following manner. Older, non-microprogrammed machines implemented the control function by using combinations of gates and flip-flops connected in a somewhat random fashion in order to generate the required timing and control signals for the machine. Microprogrammed machines, on the other hand, are normally considered highly ordered and more organized with regard to the control function field. In its simplest definition, a microprogram control unit consists of the microprogram memory and the structure required to determine the address of the next microinstruction.

Microprogramming is normally selected by the design engineer as a control technique for finite state machines because it improves flexibility, performance, and LSI utilization. Several additional key features of microprogrammed designs are listed below:

- More structured organization
- Diagnostics can be implemented easily
- Design changes are simple
- Field updates are easy
- Adaptations are straightforward
- System definition can be expanded to include new features
- Documentation and Service are easier
- Design aids are available
- Cost and design time are reduced


## THE MICROPROGRAM MEMORY

The microprogram memory is simply an N word by M bit memory used to hold the various microinstructions. For an N word memory, the address locations are usually defined as location 0 through $\mathrm{N}-1$. For example, a 256-word microprogram memory will have address locations 0 through 255. Each word of the microprogram memory consists of $M$ bits. These $M$ bits are usually broken into various field definitions and the fields can consist of various numbers of bits. It is the definition of the various fields of a microprogram word that is usually referred to as FORMATTING.

An example of how microinstruction fields are defined in a typical machine microprogram memory word is as follows:

Field 1 - General purpose
Field 2 - Branch address
Field 3 - Next microinstruction address control
Field 4 - Condition code multiplexer control
Field 5 - Interrupt control
Field 6 - Fast clock/slow clock select
Field 7 - Carry control
Field 8 - ALU source operand control
Field 9 - ALU function control
Field 10 - ALU destination control
Field 11 - Shift multiplexer control
Field 12 - etc.
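
To make the idea of FORMATTING concrete, the sketch below packs a handful of the fields listed above into a single microword and splits it apart again. The field widths, the field order, and the names `FIELDS`, `pack` and `unpack` are our own assumptions for illustration; a real design chooses widths to match its own control lines.

```python
# Hypothetical field widths, chosen only for illustration.
FIELDS = [                      # (name, width in bits), most significant first
    ("branch_address", 8),
    ("next_address_control", 4),
    ("condition_select", 3),
    ("carry_control", 1),
    ("alu_source", 3),
    ("alu_function", 3),
    ("alu_destination", 3),
]
WORD_WIDTH = sum(width for _, width in FIELDS)   # M, the microword width


def pack(values: dict) -> int:
    """Assemble one M-bit microword from named field values (missing fields are 0)."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word: int) -> dict:
    """Split an M-bit microword back into its named fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


microword = pack({"branch_address": 0x2A, "alu_function": 5})
print(WORD_WIDTH, unpack(microword)["branch_address"], unpack(microword)["alu_function"])
```

With these assumed widths the word is 25 bits wide; adding fields simply widens the word, which is why microprogram memories tend to be wide rather than deep.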

## EXECUTING MICROINSTRUCTIONS

Once the microprogram format has been defined, it is necessary to execute sequences of these microinstructions if the machine is to perform any real function. In its simplest form, all that is required to sequence through a series of microinstructions is a microprogram address counter. The microprogram address counter simply increments by one on each clock cycle to select the address of the next microinstruction. For example, if the microprogram address counter contains address 23, the next clock cycle will increment the counter and it will select address 24. The counter will continue to increment on each clock cycle thereby selecting address 25, address 26, address 27, and so forth. If this were the only control available, the machine would not be very flexible and it would be able to execute only a fixed pattern of microinstructions.

The technique of continuing from one microinstruction to the next sequential microinstruction is usually referred to as CONTINUE. Thus, in microprogram control definition, we will use the CONTINUE (CONT) statement to mean simply incrementing to the next microinstruction.

## MICROPROGRAM JUMPING

If the microprogram control unit is to have the ability to select other than the next microinstruction, the control unit must be able to load a JUMP address. The load control of a counter can be a single bit field within the microprogram word format. Let us call this one-bit field the microprogram address counter load enable bit. When this bit is at logic 0, a load will be inhibited and when this bit is at logic 1, a load will be enabled. If the load is enabled, the JUMP address contained within the microprogram memory will be parallel loaded into the microprogram address counter. This results in the ability to perform an N-way branch. For example, if the branch address field is eight bits wide, a JUMP to any address in the memory space from word 0 through word 255 can be performed.

This simple branching control feature allows a microprogram memory controller to execute sequential microinstructions or perform a JUMP (JMP) to any address either before or after the address currently contained in the microprogram address counter.
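
The behavior of a counter with a load enable bit can be traced with a short simulation. This is a minimal sketch, not a model of any particular device: the function `run_counter`, the dictionary representation of the microprogram, and the addresses used are all our own assumptions.

```python
def run_counter(program, start=0, cycles=8):
    """Trace the microprogram address counter for a number of clock cycles.

    `program` maps an address to (load_enable, jump_address). Addresses that
    are not in the mapping are treated as CONTINUE (load_enable = 0).
    """
    address = start
    trace = []
    for _ in range(cycles):
        trace.append(address)
        load_enable, jump_address = program.get(address, (0, 0))
        if load_enable:            # logic 1: parallel-load the JUMP address
            address = jump_address
        else:                      # logic 0: CONTINUE, increment by one
            address = address + 1
    return trace


# Hypothetical microprogram: jump forward from 25 to 40, then back from 41 to 23.
program = {25: (1, 40), 41: (1, 23)}
print(run_counter(program, start=23, cycles=8))
# prints: [23, 24, 25, 40, 41, 23, 24, 25]
```

The trace shows both a forward and a backward JUMP, with CONTINUE everywhere else.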

## CONDITIONAL JUMPING

While the JUMP instruction has added some flexibility to the sequencing of microprogram instructions, the controller still lacks any decision-making capability. This decision-making capability is provided by the CONDITIONAL JUMP (COND JMP) instruction. Figure 1 shows a functional block diagram of a microprogram memory/address controller providing the capability to jump on either of two different conditions. In this example, the load select control is a two-bit field used to control a four-input multiplexer. When the two-bit field is equivalent to binary zero, the multiplexer selects the zero input which forces the load control inactive. Thus, the CONTINUE microprogram control instruction is executed. When the two-bit load select field contains binary one, the $D_{1}$ input of the multiplexer is selected. Now, the load control is a function of the Condition 1 input. If Condition 1 is logic 0, the microprogram address counter increments and if Condition 1 is logic 1, the jump address will be parallel loaded in the next clock cycle. This operation is defined as a CONDITIONAL JUMP. If the load select input contains binary 2, the $D_{2}$ input is selected and the same conditional function is performed with respect to the Condition 2 input. If the load select field contains binary 3, the $D_{3}$ input of the multiplexer is selected. Since the $D_{3}$ input is tied to logic HIGH, this forces the microprogram address counter to the load mode independent of anything else. Thus, the jump address is loaded into the microprogram address counter on the next clock cycle and an UNCONDITIONAL JUMP is executed. This load select control function definition is shown in Table 1.

Figure 1. A Two-Bit Control Field Can be Used to Select CONTINUE, BRANCH, or CONDITIONAL BRANCH.

TABLE 1. LOAD SELECT CONTROL FUNCTION.

| $S_{1}$ | $S_{0}$ | Function |
| :---: | :---: | :--- |
| 0 | 0 | Continue |
| 0 | 1 | Jump Condition 1 True |
| 1 | 0 | Jump Condition 2 True |
| 1 | 1 | Jump Unconditional |
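
Table 1 translates directly into a four-input multiplexer feeding the load control of the address counter. The sketch below is one way to express it; the function names `load_control` and `next_address` are ours and are not part of any AMD tool.

```python
def load_control(s1: int, s0: int, cond1: int, cond2: int) -> int:
    """Four-input multiplexer of Table 1: returns the counter load control."""
    # D0 = logic 0, D1 = Condition 1, D2 = Condition 2, D3 = logic HIGH
    inputs = (0, cond1, cond2, 1)
    return inputs[(s1 << 1) | s0]


def next_address(address, s1, s0, jump_address, cond1=0, cond2=0):
    """Address selected for the next clock cycle."""
    if load_control(s1, s0, cond1, cond2):
        return jump_address        # counter is parallel loaded
    return address + 1             # counter increments (CONTINUE)


print(next_address(10, 0, 0, 99, cond1=1, cond2=1))  # Continue: 11
print(next_address(10, 0, 1, 99, cond1=1))           # Jump, Condition 1 true: 99
print(next_address(10, 1, 0, 99, cond2=0))           # Condition 2 false: 11
print(next_address(10, 1, 1, 99))                    # Jump unconditional: 99
```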

## OVERLAPPING THE MICROPROGRAM INSTRUCTION FETCH

Now that a few basic microprogram address control instructions have been defined, let us examine the control instructions used in a microprogram control unit featuring the overlap fetching of the next microinstruction. This technique is also known as "pipelining". The block diagram for such a microprogram control unit is shown in Figure 2. The key difference when compared with previous microprogrammed architectures is the existence of the "pipeline register" at the output of the microprogram memory. By definition, the pipeline register (or microword register) contains the microinstruction currently being executed by the machine. Simultaneously, while this microinstruction is being executed, the address of the next microinstruction is applied to the microprogram memory and the contents of that memory word are being fetched and set up at the inputs to the pipeline register. This technique of pipelining can be used to improve the performance of the microprogram control unit. This results because the contents of the microprogram memory word required for the next cycle are being fetched on an overlapping basis with the actual execution of the current microprogram word. It should be realized that when the pipeline approach is used, the design engineer must be aware of the fact that some registers contain the results of the previous microinstruction executed, some registers contain the current microinstruction being executed, and some registers contain data for the next microinstruction to be executed.


Figure 2. Overlapping (or Pipelining) the Fetch of the Next Microinstruction.

Let us now compare the block diagram of Figure 2 with that shown in Figure 1. The major difference, of course, is the addition of the pipeline register at the output of the microprogram control memory. Also, notice the addition of the address multiplexer at the source of the microprogram memory address. This address multiplexer is used to select the microprogram counter register or the pipeline register as the source of the next address for the microprogram memory. The condition code multiplexer is used to control the address multiplexer in this address selection. By placing an incrementer at the output of the address multiplexer, it is possible to always generate the current microprogram address "plus one" at the input of the microprogram counter register.

In Figure 1, the microprogram address counter was described as a counter and could be a device such as the Am25LS161 counter. In the implementation as shown in Figure 2, the Am25LS161 counter is not appropriate. Instead, an incrementer and register are used to give the equivalent effect of a counter.

The key difference between using a true binary counter and the incrementer register described here is as follows. When the jump address from the pipeline register is selected by the multiplexer, the incrementer will combinatorially prepare that address plus one for entry into the microprogram counter register. This entry will occur on the LOW-to-HIGH transition of the clock. Thus, the microprogram counter register can always be made to contain address plus one, independent of the selection of the next microinstruction address. When the address multiplexer is switched so that the microprogram counter register is selected as the source of the microprogram memory address, the incrementer will again set up address plus one for entry into the microprogram counter register. Thus, when the address multiplexer selects the microprogram counter register, the address multiplexer, incrementer and microprogram counter register appear to operate as a normal binary counter.

The condition code multiplexer ($\mathrm{S}_{0}$, $\mathrm{S}_{1}$) operates in exactly the same fashion as described for the condition code multiplexer of Figure 1. That is, binary zero in the pipeline register (the current microinstruction being executed) forces an unconditional selection of the microprogram counter register via $\mathrm{D}_{0}$. Binary one or binary two in the next address select control bits of the pipeline register causes a conditional selection at the address multiplexer via $\mathrm{D}_{1}$ or $\mathrm{D}_{2}$. Thus, a CONDITIONAL JUMP can be executed. Binary three in the next address select portion of the pipeline register causes an UNCONDITIONAL JUMP instruction to be executed via $\mathrm{D}_{3}$.

When the overall machine timing is studied, it will be observed that the key difference between overlap fetching and non-overlap fetching involves the propagation delay of the microprogram memory. In the non-pipelined architecture, the microprogram memory propagation delay must be added to the propagation delay of all the other elements of the machine. In the overlap fetch architecture, the propagation delay associated with the next microprogram memory address fetch is a separate loop independent of the other portion of the machine.
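
The interaction of the pipeline register, address multiplexer, incrementer and microprogram counter register can be followed in a small behavioral model. This is only a sketch under stated assumptions: the class `MicroWord`, the function `simulate`, the reset state (a CONTINUE word in the pipeline register and zero in the counter register) and the sample microprogram are all ours.

```python
from dataclasses import dataclass


@dataclass
class MicroWord:
    select: int = 0      # next address select: 0 = continue, 1/2 = conditional, 3 = jump
    branch: int = 0      # branch address field


def simulate(memory, conditions, cycles):
    """Clock a Figure 2 style control unit and return the addresses fetched.

    memory     -- list of MicroWord, indexed by microprogram address
    conditions -- function(cycle) -> (cond1, cond2) seen during that cycle
    """
    pipeline = MicroWord()   # assumption: reset leaves a CONTINUE word here
    upc = 0                  # assumption: reset clears the microprogram counter register
    fetched = []
    for cycle in range(cycles):
        cond1, cond2 = conditions(cycle)
        take_branch = (0, cond1, cond2, 1)[pipeline.select]   # condition code multiplexer
        address = pipeline.branch if take_branch else upc     # address multiplexer
        fetched.append(address)
        # Both registers load on the same LOW-to-HIGH clock transition.
        upc = address + 1             # incrementer output: address plus one
        pipeline = memory[address]    # word fetched in this cycle executes in the next
    return fetched


memory = [MicroWord() for _ in range(8)]
memory[1] = MicroWord(select=3, branch=5)   # unconditional jump to 5
memory[6] = MicroWord(select=1, branch=2)   # jump to 2 if Condition 1 is true
print(simulate(memory, lambda cycle: (1, 0), cycles=7))
# prints: [0, 1, 5, 6, 2, 3, 4]
```

Note that the counter register is loaded with address plus one whether the address came from the counter register or from the branch field, which is the property that lets the incrementer and register stand in for a true counter.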

## SUBROUTINING IN MICROPROGRAMMING CONTROL

Thus far, we have examined the CONTINUE instruction as well as the CONDITIONAL and UNCONDITIONAL JUMP instructions for overlap fetch. Just as in the programming of minicomputers and microcomputers, the advantages of SUBROUTINING can be realized in microprogramming. The idea here, of course, is that the same block of microcode (or even a single microinstruction) can be shared by several microinstruction sequences. This results in an overall reduction in the total number of microprogram memory words required by the design. If we are to jump to a subroutine, what is required is the ability to store an address to which the subroutine should return when it has completed its execution. Examining the block diagram of Figure 3, we see the addition of a subroutine and loop (push/pop) stack (also called the file) and its associated stack pointer. The control signals required by the stack are an enable stack signal (FILE ENABLE $=\mathrm{FE}$) which will be used to tell the file whenever we wish to perform a push or a pop, and a push/pop control (PUP) used to control the direction of the stack pointer (push or pop).

In this architecture, the stack pointer always points to the address of the last microinstruction written on the stack. This allows the "next address multiplexer" to read the stack at any time via port F. When this selection is performed, the last word written on the stack will be the word applied to the microprogram memory. The condition code multiplexer of the previous example has also been replaced by a next address control unit. This next address control unit (Am29811A) can execute 16 different next address control functions where most of these functions are conditional. Thus, the device has four instruction inputs as well as one condition code test input which is connected to the condition code multiplexer. Note also that the next address control field of the microprogram word has been expanded to a four-bit field. Outputs from the Am29811A next address control block are used to control the stack pointer and the next address multiplexer of the Am2911. In addition, the device has outputs to control the three-state enable of the pipeline register and the three-state enable of the starting address decode PROM. Also, the architecture has a counter that can be used as a loop counter or event counter.

The 16 instructions associated with the Am29811A are listed in Table 2. As is easily seen by referring to Table 2, three of the instructions in this set are associated with subroutining in microprogram memory. The first instruction of this set is a simple conditional JUMP-TO-SUBROUTINE where the source of the subroutine address is in the pipeline register. The RETURN-FROM-SUBROUTINE instruction is also conditional and is used to return to the next microinstruction following the JUMP-TO-SUBROUTINE instruction. There is also a conditional JUMP-TO-ONE-OF-TWO-SUBROUTINES, where the subroutine address is either in the PIPELINE register or in the internal REGISTER in the Am2911. This instruction will be explained in more detail later.

## TYPICAL COMPUTER CONTROL UNIT ARCHITECTURE USING THE Am2911 AND Am29811A

The microprogram memory control unit block diagram of Figure 3 is easily implemented using the Am2911 and Am29811A. This architecture provides a structured state machine design capable of executing many highly sophisticated next address control instructions. The Am2911 contains a next address multiplexer that provides four different inputs from which the address of the next microinstruction can be selected. These are the direct input (D), the register input (R), the program counter (PC), and the file (F). The starting address decoder (mapping PROM) output and the pipeline register output are connected together at the D input to the Am2911 and are operated in the three-state mode.

TABLE 2. FUNCTIONAL DESCRIPTION OF Am29811A INSTRUCTION SET.

| MNEMONIC | INSTRUCTION INPUTS | FUNCTION | TEST INPUT | NEXT ADDR SOURCE | FILE | COUNTER | MAP-E | PL-E |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| JZ | L L L L | JUMP ZERO | X | D | HOLD | LL | H | L |
| CJS | L L L H | COND JSB PL | L | PC | HOLD | HOLD | H | L |
|  |  |  | H | D | PUSH | HOLD | H | L |
| JMAP | L L H L | JUMP MAP | X | D | HOLD | HOLD | L | H |
| CJP | L L H H | COND JUMP PL | L | PC | HOLD | HOLD | H | L |
|  |  |  | H | D | HOLD | HOLD | H | L |
| PUSH | L H L L | PUSH/COND LD CNTR | L | PC | PUSH | HOLD | H | L |
|  |  |  | H | PC | PUSH | LOAD | H | L |
| JSRP | L H L H | COND JSB R/PL | L | R | PUSH | HOLD | H | L |
|  |  |  | H | D | PUSH | HOLD | H | L |
| CJV | L H H L | COND JUMP VECTOR | L | PC | HOLD | HOLD | H | H |
|  |  |  | H | D | HOLD | HOLD | H | H |
| JRP | L H H H | COND JUMP R/PL | L | R | HOLD | HOLD | H | L |
|  |  |  | H | D | HOLD | HOLD | H | L |
| RFCT | H L L L | REPEAT LOOP, CNTR $\neq 0$ | L | F | HOLD | DEC | H | L |
|  |  |  | H | PC | POP | HOLD | H | L |
| RPCT | H L L H | REPEAT PL, CNTR $\neq 0$ | L | D | HOLD | DEC | H | L |
|  |  |  | H | PC | HOLD | HOLD | H | L |
| CRTN | H L H L | COND RTN | L | PC | HOLD | HOLD | H | L |
|  |  |  | H | F | POP | HOLD | H | L |
| CJPP | H L H H | COND JUMP PL \& POP | L | PC | HOLD | HOLD | H | L |
|  |  |  | H | D | POP | HOLD | H | L |
| LDCT | H H L L | LOAD CNTR \& CONTINUE | X | PC | HOLD | LOAD | H | L |
| LOOP | H H L H | TEST END LOOP | L | F | HOLD | HOLD | H | L |
|  |  |  | H | PC | POP | HOLD | H | L |
| CONT | H H H L | CONTINUE | X | PC | HOLD | HOLD | H | L |
| JP | H H H H | JUMP PL | X | D | HOLD | HOLD | H | L |

Figure 3. A Typical Computer Control Unit Using the Am2911 and Am29811A.

TABLE 3. PIN FUNCTIONS.

The architecture of Figure 3 shows an instruction register capable of being loaded with a machine instruction word from the data bus. The op code portion of the instruction is decoded using a mapping PROM to arrive at a starting address for the microinstruction sequence required to execute the machine instruction. When the microprogram memory address is to be the first microinstruction of the machine instruction sequence, the Am29811A next address control unit selects the multiplexer D input and enables the three-state output from the mapping PROM. When the current microinstruction being executed is selecting the next microinstruction address as a JUMP function, the JUMP address will be available at the multiplexer D input. This is accomplished by having the Am29811A select the next address multiplexer D input and also enabling the three-state output of the pipeline register branch address field. The register enable input to the Am2911 is connected to ground so that this register will always load the value at the Am2911 D input. The value at $D$ is clocked into the Am2911's register (R) at the end of the current microcycle, which makes the D value of this microcycle available as the R value of the next microcycle. Thus, by using the branch address field of two sequential microinstructions, a conditional JUMP-TO-ONE-OF-TWO-SUBROUTINES or a conditional JUMP-TO-ONE-OF-TWO-BRANCH-ADDRESSES can be executed by selecting either the D input or the R input of the next address multiplexer.

When sequencing through continuous microinstructions in microprogram memory, the program counter in the Am2911 is used. Here, the Am29811A simply selects the PC input of the next address multiplexer. In addition, most of these instructions enable the three-state outputs of the pipeline register associated with the branch address field, which allows the register within the Am2911 to be loaded.

The $4 \times 4$ stack in the Am2911 is used for looping and subroutining in microprogram operations. Up to four levels of subroutines or loops can be nested. Also, loops and subroutines can be intermixed as long as the four-word depth of the stack is not exceeded.
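
The subroutine linkage can be followed with a small address-level model of three of the instructions (CONT, CJS and CRTN). This is a sketch under our own assumptions: it tracks only the sequence of addresses, not pipeline timing; the names `step` and `run` and the sample microprogram are hypothetical; and exceeding the four-word depth is reported as an error purely for illustration.

```python
def step(instr, test, pc, d, stack):
    """Return (next_address, new_stack) for CONT, CJS or CRTN.

    instr -- "CONT", "CJS" or "CRTN"
    test  -- condition test input (True = H)
    pc    -- program counter value (address of the current microinstruction + 1)
    d     -- branch address field presented at the D input
    stack -- list of return addresses, last element is the top of the file
    """
    if instr == "CJS" and test:
        if len(stack) == 4:
            raise OverflowError("four-word stack depth exceeded")
        return d, stack + [pc]          # next address from D, PUSH the return address
    if instr == "CRTN" and test:
        return stack[-1], stack[:-1]    # next address from F, POP
    return pc, stack                    # CONT, or a failed test: next address from PC


def run(program, cycles):
    """Trace the addresses visited; addresses not in `program` are CONT."""
    address, stack, trace = 0, [], []
    for _ in range(cycles):
        trace.append(address)
        instr, d = program.get(address, ("CONT", 0))
        address, stack = step(instr, True, address + 1, d, stack)
    return trace


# Hypothetical microprogram: two callers share one subroutine at address 20.
program = {1: ("CJS", 20), 3: ("CJS", 20), 21: ("CRTN", 0)}
print(run(program, 9))
# prints: [0, 1, 20, 21, 2, 3, 20, 21, 4]
```

The same two words at addresses 20 and 21 serve both callers, which is the saving in microprogram memory that subroutining is meant to provide.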

## ARCHITECTURE OF THE Am2910

The Am2910 is a bipolar microprogram controller intended for use in high-speed microprocessor applications. It allows addressing of up to 4 K words of microprogram. A block diagram is shown in Figure 4.

The controller contains a four-input multiplexer that is used to select either the register/counter, direct input, microprogram counter, or stack as the source of the next microinstruction address.

The register/counter consists of 12 D-type, edge-triggered flip-flops, with a common clock enable. When its load control, $\overline{\mathrm{RLD}}$, is LOW, new data is loaded on a positive clock transition. A few instructions include load; in most systems, these instructions will be sufficient, simplifying the microcode. The output of the register/counter is available to the multiplexer as a source for the next microinstruction address. The direct input furnishes a source of data for loading the register/counter.


Figure 4. Am2910 Block Diagram.

The Am2910 contains a microprogram counter ($\mu \mathrm{PC}$) that is composed of a 12-bit incrementer followed by a 12-bit register. The $\mu \mathrm{PC}$ can be used in either of two ways. When the carry-in to the incrementer is HIGH, the microprogram register is loaded on the next clock cycle with the current $Y$ output word plus one $(\mathrm{Y}+1 \rightarrow \mu \mathrm{PC})$. Sequential microinstructions are thus executed. When the carry-in is LOW, the incrementer passes the Y output word unmodified so that $\mu \mathrm{PC}$ is reloaded with the same Y word on the next clock cycle ($\mathrm{Y} \rightarrow \mu \mathrm{PC}$). The same microinstruction is thus executed any number of times.

The third source for the multiplexer is the direct (D) inputs. This source is used for branching.

The fourth source available at the multiplexer input is a 5-word by 12-bit stack (file). The stack is used to provide return address linkage when executing microsubroutines or loops. The stack contains a built-in stack pointer (SP) which always points to the last file word written. This allows stack reference operations (looping) to be performed without a pop. The stack pointer operates as an up/down counter. During microinstructions 1, 4 and 5 (the instructions shown with PUSH in Table 4), the PUSH operation may be performed. This causes the stack pointer to increment and the file to be written with the required return linkage. On the cycle following the PUSH, the return data is at the new location pointed to by the stack pointer.

During five other microinstructions (8, 10, 11, 13 and 15 in Table 4), a POP operation may occur. This places the information at the top of the stack onto the $Y$ outputs. The stack pointer decrements at the next rising clock edge following a POP, effectively removing old information from the top of the stack.

The stack pointer linkage is such that any sequence of pushes, pops or stack references can be achieved. At RESET (Instruction 0), the depth of nesting becomes zero. For each PUSH, the nesting depth increases by one; for each POP, the depth decreases by one. The depth can grow to five. After a depth of five is reached, $\overline{\mathrm{FULL}}$ goes LOW. Any further PUSH onto a full stack overwrites information at the top of the stack, but leaves the stack pointer unchanged. This operation will usually destroy useful information and is normally avoided. A POP from an empty stack places non-meaningful data on the Y outputs, but is otherwise safe. The stack pointer remains at zero whenever a POP is attempted from a stack already empty.
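
The stack rules just described (depth of five, overwrite on a full stack, pointer held at zero on an empty stack) can be collected in a small behavioral model. This is a sketch only: the class name `Stack2910` and its methods are ours, and `None` merely stands in for data that the text calls undefined or non-meaningful.

```python
class Stack2910:
    """Behavioral sketch of the five-word stack described in the text."""

    DEPTH = 5

    def __init__(self):
        self.words = [None] * self.DEPTH   # contents undefined until written
        self.sp = 0                        # nesting depth; 0 means empty

    @property
    def full(self) -> bool:
        """True when the FULL warning would be given (the pin itself is active LOW)."""
        return self.sp == self.DEPTH

    def reset(self):
        self.sp = 0                        # instruction 0 makes the stack empty

    def push(self, value):
        if not self.full:
            self.sp += 1                   # pointer increments, then the word is written
        self.words[self.sp - 1] = value    # on a full stack this overwrites the top word

    def top(self):
        """Read the last word written without popping (a stack reference)."""
        return self.words[self.sp - 1] if self.sp else None

    def pop(self):
        value = self.top()                 # not meaningful when the stack is empty
        if self.sp:
            self.sp -= 1                   # pointer stays at zero on an empty stack
        return value


stack = Stack2910()
for return_address in (10, 20, 30, 40, 50):
    stack.push(return_address)
print(stack.full)        # True: depth of five reached
stack.push(60)           # overwrites 50, pointer unchanged
print(stack.pop(), stack.pop())
```

The last line prints `60 40`, showing that the sixth PUSH destroyed the return address 50 rather than extending the stack.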

The register/counter is operated during three microinstructions (8, 9, 15) as a 12-bit down counter, with result = zero available as a microinstruction branch test criterion. This provides efficient iteration of microinstructions. The register/counter is arranged such that if it is preloaded with a number N and then used as a loop termination counter, the sequence will be executed exactly $\mathrm{N}+1$ times. During instruction 15, a three-way branch under combined control of the loop counter and the condition code is available.
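
The $\mathrm{N}+1$ count follows from the order of events: the counter is tested for zero and only then decremented. A minimal sketch (the function name `loop_executions` is ours) makes the point:

```python
def loop_executions(n: int) -> int:
    """Count how often a loop body runs when the counter is preloaded with n."""
    counter = n
    executions = 0
    while True:
        executions += 1          # the loop body is executed on every pass
        if counter != 0:
            counter -= 1         # counter not zero: decrement and repeat
        else:
            break                # counter zero: hold and fall through
    return executions


print([loop_executions(n) for n in range(4)])   # prints: [1, 2, 3, 4]
```

To execute a sequence exactly ten times, for example, the counter would therefore be preloaded with nine.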

The device provides three-state $Y$ outputs. These can be particularly useful in designs requiring automatic checkout of the processor. The microprogram controller outputs can be forced into the high-impedance state, and pre-programmed sequences of microinstructions can be executed via external access to the address lines.

## OPERATION

Table 4 shows the result of each instruction in controlling the multiplexer which determines the Y outputs, and in controlling the three enable signals $\overline{\mathrm{PL}}, \overline{\mathrm{MAP}}$ and $\overline{\mathrm{VECT}}$. The effect on the $\mu \mathrm{PC}$, the register/counter, and the stack after the next positive-going clock edge is also shown. The multiplexer determines which internal source drives the $Y$ outputs. The value loaded into $\mu \mathrm{PC}$ is either identical to the $Y$ output, or else one greater, as determined by CI. For each instruction, one and only one of the three outputs $\overline{\mathrm{PL}}, \overline{\mathrm{MAP}}$ and $\overline{\mathrm{VECT}}$ is LOW. If these outputs control three-state enables for the primary source of microprogram jumps (usually part of a pipeline register), a PROM which maps the instruction to a microinstruction starting location, and an optional third source (often a vector from a DMA or interrupt source), respectively, the three-state sources can drive the $D$ inputs without further logic.

Several inputs, as shown in Table 4, can modify instruction execution. The combination $\overline{\mathrm{CC}}$ HIGH and $\overline{\mathrm{CCEN}}$ LOW is used as a test in 10 of the 16 instructions. $\overline{\mathrm{RLD}}$, when LOW, causes the $D$ input to be loaded into the register/counter, overriding any HOLD or DEC operation specified in the instruction. $\overline{\mathrm{OE}}$, normally LOW, may be forced HIGH to remove the Am2910 Y outputs from a three-state bus.

TABLE 4. Am2910 MICROINSTRUCTION SET.

| HEX $\mathrm{I}_{3}-\mathrm{I}_{0}$ | MNEMONIC | NAME | REG/CNTR CONTENTS | FAIL ($\overline{\mathrm{CCEN}}$ = LOW and $\overline{\mathrm{CC}}$ = HIGH): Y | FAIL: STACK | PASS ($\overline{\mathrm{CCEN}}$ = HIGH or $\overline{\mathrm{CC}}$ = LOW): Y | PASS: STACK | REG/CNTR | ENABLE |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0 | JZ | JUMP ZERO | X | 0 | CLEAR | 0 | CLEAR | HOLD | PL |
| 1 | CJS | COND JSB PL | X | PC | HOLD | D | PUSH | HOLD | PL |
| 2 | JMAP | JUMP MAP | X | D | HOLD | D | HOLD | HOLD | MAP |
| 3 | CJP | COND JUMP PL | X | PC | HOLD | D | HOLD | HOLD | PL |
| 4 | PUSH | PUSH/COND LD CNTR | X | PC | PUSH | PC | PUSH | Note 1 | PL |
| 5 | JSRP | COND JSB R/PL | X | R | PUSH | D | PUSH | HOLD | PL |
| 6 | CJV | COND JUMP VECTOR | X | PC | HOLD | D | HOLD | HOLD | VECT |
| 7 | JRP | COND JUMP R/PL | X | R | HOLD | D | HOLD | HOLD | PL |
| 8 | RFCT | REPEAT LOOP, CNTR $\neq 0$ | $\neq 0$ | F | HOLD | F | HOLD | DEC | PL |
|  |  |  | $=0$ | PC | POP | PC | POP | HOLD | PL |
| 9 | RPCT | REPEAT PL, CNTR $\neq 0$ | $\neq 0$ | D | HOLD | D | HOLD | DEC | PL |
|  |  |  | $=0$ | PC | HOLD | PC | HOLD | HOLD | PL |
| A | CRTN | COND RTN | X | PC | HOLD | F | POP | HOLD | PL |
| B | CJPP | COND JUMP PL \& POP | X | PC | HOLD | D | POP | HOLD | PL |
| C | LDCT | LD CNTR \& CONTINUE | X | PC | HOLD | PC | HOLD | LOAD | PL |
| D | LOOP | TEST END LOOP | X | F | HOLD | PC | POP | HOLD | PL |
| E | CONT | CONTINUE | X | PC | HOLD | PC | HOLD | HOLD | PL |
| F | TWB | THREE-WAY BRANCH | $\neq 0$ | F | HOLD | PC | POP | DEC | PL |
|  |  |  | $=0$ | D | POP | PC | POP | HOLD | PL |

Note 1: If $\overline{\mathrm{CCEN}}$ = LOW and $\overline{\mathrm{CC}}$ = HIGH, hold; else load. X = Don't care.
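
Table 4 can also be read as a lookup: given the instruction code, the two condition inputs, and whether the register/counter holds zero, it yields the Y source, the stack operation, the register/counter operation, and the enable output that is active. The Python sketch below is only a transcription of the table for study purposes. The names `ROWS`, `passed`, and `decode` are illustrative assumptions, and the $\overline{\mathrm{RLD}}$ override is deliberately left out.

```python
# Illustrative transcription of Table 4 (names are assumptions, not AMD's).
# Each row: mnemonic, (Y, stack) on FAIL, (Y, stack) on PASS, reg/cntr op, enable.
# Rows for 8, 9 and F are keyed by "counter is zero" (False / True).
ROWS = {
    0x0: ("JZ",   ("0",  "CLEAR"), ("0",  "CLEAR"), "HOLD",  "PL"),
    0x1: ("CJS",  ("PC", "HOLD"),  ("D",  "PUSH"),  "HOLD",  "PL"),
    0x2: ("JMAP", ("D",  "HOLD"),  ("D",  "HOLD"),  "HOLD",  "MAP"),
    0x3: ("CJP",  ("PC", "HOLD"),  ("D",  "HOLD"),  "HOLD",  "PL"),
    0x4: ("PUSH", ("PC", "PUSH"),  ("PC", "PUSH"),  "NOTE1", "PL"),
    0x5: ("JSRP", ("R",  "PUSH"),  ("D",  "PUSH"),  "HOLD",  "PL"),
    0x6: ("CJV",  ("PC", "HOLD"),  ("D",  "HOLD"),  "HOLD",  "VECT"),
    0x7: ("JRP",  ("R",  "HOLD"),  ("D",  "HOLD"),  "HOLD",  "PL"),
    0x8: {False: ("RFCT", ("F",  "HOLD"), ("F",  "HOLD"), "DEC",  "PL"),
          True:  ("RFCT", ("PC", "POP"),  ("PC", "POP"),  "HOLD", "PL")},
    0x9: {False: ("RPCT", ("D",  "HOLD"), ("D",  "HOLD"), "DEC",  "PL"),
          True:  ("RPCT", ("PC", "HOLD"), ("PC", "HOLD"), "HOLD", "PL")},
    0xA: ("CRTN", ("PC", "HOLD"),  ("F",  "POP"),   "HOLD",  "PL"),
    0xB: ("CJPP", ("PC", "HOLD"),  ("D",  "POP"),   "HOLD",  "PL"),
    0xC: ("LDCT", ("PC", "HOLD"),  ("PC", "HOLD"),  "LOAD",  "PL"),
    0xD: ("LOOP", ("F",  "HOLD"),  ("PC", "POP"),   "HOLD",  "PL"),
    0xE: ("CONT", ("PC", "HOLD"),  ("PC", "HOLD"),  "HOLD",  "PL"),
    0xF: {False: ("TWB", ("F", "HOLD"), ("PC", "POP"), "DEC",  "PL"),
          True:  ("TWB", ("D", "POP"),  ("PC", "POP"), "HOLD", "PL")},
}


def passed(cc_n, ccen_n):
    """PASS when CCEN-bar is HIGH or CC-bar is LOW (True models HIGH)."""
    return ccen_n or not cc_n


def decode(instr, cc_n, ccen_n, counter_is_zero):
    """Return (Y source, stack op, reg/cntr op, enable) for one microcycle."""
    row = ROWS[instr]
    if isinstance(row, dict):            # counter-dependent instructions
        row = row[bool(counter_is_zero)]
    _mnemonic, fail, pass_, reg_op, enable = row
    ok = passed(cc_n, ccen_n)
    y_source, stack_op = pass_ if ok else fail
    if reg_op == "NOTE1":                # Note 1: hold on FAIL, load on PASS
        reg_op = "LOAD" if ok else "HOLD"
    return y_source, stack_op, reg_op, enable
```

For example, `decode(0x1, cc_n=False, ccen_n=False, counter_is_zero=False)` describes a CJS whose test passed: the next address comes from D and the return address is pushed.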

The stack, a five-word last-in, first-out 12-bit memory, has a pointer which addresses the value presently on the top of the stack. Explicit control of the stack pointer occurs during instruction 0 (RESET), which makes the stack empty by resetting the SP to zero. After a RESET, and whenever else the stack is empty, the content of the top of stack is undefined until a PUSH occurs. Any POPs performed while the stack is empty put undefined data on the $F$ outputs and leave the stack pointer at zero. Any time the stack is full (five more PUSHes than POPs have occurred since the stack was last empty), the FULL warning output occurs. No additional PUSH should be attempted onto a full stack; if tried, information at the top of the stack will be overwritten and lost.
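
The stack rules in the preceding paragraph can be captured in a small behavioral model. This is a sketch under stated assumptions rather than a description of the internal circuitry: the class name `MicroStack` and its methods are hypothetical, and `None` stands in for "undefined" contents.

```python
class MicroStack:
    """Behavioral sketch of the five-word, 12-bit LIFO described above."""

    DEPTH = 5

    def __init__(self):
        self.words = [None] * self.DEPTH  # None models undefined contents
        self.sp = 0                       # 0 means the stack is empty

    @property
    def full(self):
        """Models the FULL warning: five more PUSHes than POPs."""
        return self.sp == self.DEPTH

    def clear(self):
        """Instruction 0 (RESET): empty the stack by zeroing the pointer."""
        self.sp = 0

    def top(self):
        """Value seen on F; undefined (None) while the stack is empty."""
        return self.words[self.sp - 1] if self.sp else None

    def push(self, address):
        """PUSH; on a full stack the top word is overwritten and lost."""
        if self.sp < self.DEPTH:
            self.sp += 1
        self.words[self.sp - 1] = address & 0xFFF

    def pop(self):
        """POP; on an empty stack the pointer stays at zero."""
        value = self.top()
        if self.sp > 0:
            self.sp -= 1
        return value
```

Pushing six addresses onto an empty `MicroStack` leaves it full with the sixth value on top and the fifth one lost, which is the overwrite hazard the text warns about.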

## THE Am2910 INSTRUCTION SET

The Am2910 provides 16 instructions which select the address of the next microinstruction to be executed. Four of the instructions are unconditional - their effect depends only on the instruction. Ten of the instructions have an effect which is partially controlled by an external, data-dependent condition. Three of the instructions have an effect which is partially controlled by the contents of the internal register/counter. The instruction set is shown in Table 4. In this discussion it is assumed that CI is tied HIGH.

In the ten conditional instructions, the result of the data-dependent test is applied to $\overline{\mathrm{CC}}$. If the $\overline{\mathrm{CC}}$ input is LOW, the test is considered to have been passed, and the action specified in the name occurs; otherwise, the test has failed and an alternate action (often simply the execution of the next sequential microinstruction) occurs. Testing of $\overline{\mathrm{CC}}$ may be disabled for a specific microinstruction by setting $\overline{\mathrm{CCEN}}$ HIGH, which unconditionally forces the action specified in the name; that is, it forces a pass. Other ways of using $\overline{\mathrm{CCEN}}$ include (1) tying it HIGH, which is useful if no microinstruction is data-dependent; (2) tying it LOW if data-dependent instructions are never forced unconditionally; or (3) tying it to the source of Am2910 instruction bit $\mathrm{I}_{0}$, which leaves instructions 4, 6 and 10 as data-dependent but makes the others unconditional. All of these tricks save one bit of microcode width.

The effect of three instructions depends on the contents of the register/counter. Unless the counter holds a value of zero, it is decremented; if it does hold zero, it is held and a different microprogram next address is selected. These instructions are useful for executing a microinstruction loop a known number of times. Instruction 15 is affected both by the external condition code and the internal register/counter.
Perhaps the best technique for understanding the Am2910 is to simply take each instruction and review its operation. In order to provide some feel for the actual execution of these instructions, Figure 5 is included and depicts examples of all 16 instructions.

The examples given in Figure 5 should be interpreted in the following manner: The intent is to show microprogram flow as various microprogram memory words are executed. For example, the CONTINUE instruction, instruction number 14, as shown in Figure 5, simply means that the contents of microprogram memory word 50 is executed, then the contents of word 51 is executed. This is followed by the contents of microprogram memory word 52 and the contents of microprogram memory word 53. The microprogram addresses used in the examples were arbitrarily chosen and have no meaning other than to show instruction flow. The exception to this is the first example, JUMP ZERO, which forces the microprogram location counter to address ZERO. Each dot refers to the time that the contents of the microprogram memory word is in the pipeline register. While no special symbology is used for the conditional instructions, the text to follow will explain what the conditional choices are in each example.

It might be appropriate at this time to mention that AMD has a microprogram assembler called AMDASM, which has the capability of using the Am2910 instructions in symbolic representation. AMDASM's Am2910 instruction symbolics (or mnemonics) are given in Figure 5 for each instruction and are also shown in Table 4.

Instruction 0, JZ (JUMP ZERO, or RESET), unconditionally specifies that the address of the next microinstruction is zero. Many designs use this feature for power-up sequences and provide the power-up firmware beginning at microprogram memory word location 0.

Instruction 1 is a CONDITIONAL JUMP-TO-SUBROUTINE via the address provided in the pipeline register. As shown in Figure 5, the machine might have executed words at addresses 50, 51 and 52. When the contents of address 52 is in the pipeline register, the next address control function is the CONDITIONAL JUMP-TO-SUBROUTINE. Here, if the test is passed, the next instruction executed will be the contents of microprogram memory location 90. If the test failed, the JUMP-TO-SUBROUTINE will not be executed; the contents of microprogram memory location 53 will be executed instead. Thus, the CONDITIONAL JUMP-TO-SUBROUTINE instruction at location 52 will cause the instruction either in location 90 or in location 53 to be executed next. If the TEST input is such that location 90 is selected, value 53 will be pushed onto the internal stack. This provides the return linkage for the machine when the subroutine beginning at location 90 is completed. In this example, the subroutine was completed at location 93 and a RETURN-FROM-SUBROUTINE would be found at location 93.

Instruction 2 is the JUMP MAP instruction. This is an unconditional instruction which causes the MAP output to be enabled so that the next microinstruction location is determined by the address supplied via the mapping PROMs. Normally the JUMP MAP instruction is used at the end of the instruction fetch sequence for the machine. In the example of Figure 5, microinstructions at locations 50, 51, 52 and 53 might have been the fetch sequence and at its completion at location 53, the jump map function would be contained in the pipeline register. This example shows the mapping PROM outputs to be 90; therefore, an unconditional jump to microprogram memory address 90 is performed.

Instruction 3, CONDITIONAL JUMP PIPELINE, derives its branch address from the pipeline register branch address value ($\mathrm{BR}_{0}-\mathrm{BR}_{11}$ in Figure 6). This instruction provides a technique for branching to various microprogram sequences depending upon the test condition inputs. Quite often, state machines are designed which simply execute tests on various inputs waiting for the condition to come true. When the true condition is reached, the machine then branches and executes a set of microinstructions to perform some function. This usually has the effect of resetting the input being tested until some point in the future. Figure 5 shows the conditional jump via the pipeline register address at location 52. When the contents of microprogram memory word 52 are in the pipeline register, the next address will be either location 53 or location 30 in this example. If the test is passed, the value currently in the pipeline register (30) will be selected. If the test fails, the next address selected will be contained in the microprogram counter which, in this example, is 53.

Figure 5. Am2910 Execution Examples.

Instruction 4 is the PUSH/CONDITIONAL LOAD COUNTER instruction and is used primarily for setting up loops in microprogram firmware. In Figure 5, when instruction 52 is in the pipeline register, a PUSH will be made onto the stack and the counter will be loaded based on the condition. When a PUSH occurs, the value pushed is always the next sequential instruction address. In this case, the address is 53. If the test fails, the counter is not loaded; if it is passed, the counter is loaded with the value contained in the pipeline register branch address field. Thus, a single microinstruction can be used to set up a loop to be executed a specific number of times. Instruction 8 will describe how to use the pushed value and the register/counter for looping.

Instruction 5 is a CONDITIONAL JUMP-TO-SUBROUTINE via the register/counter or the contents of the PIPELINE register. As shown in Figure 5, a PUSH is always performed and one of two subroutines executed. In this example, either the subroutine beginning at address 80 or the subroutine beginning at address 90 will be performed. A return-from-subroutine (instruction number 10) returns the microprogram flow to address 55. In order for this microinstruction control sequence to operate correctly, both the next address fields of instruction 53 and the next address fields of instruction 54 would have to contain the proper value. Let's assume that the branch address fields of instruction 53 contain the value 90 so that it will be in the Am2910 register/counter when the contents of address 54 are in the pipeline register. This requires that the instruction at address 53 load the register/counter. Now, during the execution of instruction 5 (at address 54), if the test failed, the contents of the register (value $=90$) will select the address of the next microinstruction. If the test input passes, the pipeline register contents (value $=80$) will determine the address of the next microinstruction. Therefore, this instruction provides the ability to select one of two subroutines to be executed based on a test condition.

Instruction 6 is a CONDITIONAL JUMP VECTOR instruction which provides the capability to take the branch address from a third source heretofore not discussed. In order for this instruction to be useful, the Am2910 output, VECT, is used to control a three-state control input of a register, buffer, or PROM containing the next microprogram address. This instruction provides one technique for performing interrupt type branching at the microprogram level. Since this instruction is conditional, a pass causes the next address to be taken from the vector source, while failure causes the next address to be taken from the microprogram counter. In the example of Figure 5, if the CONDITIONAL JUMP VECTOR instruction is contained at location 52, execution will continue at vector address 20 if the TEST input is HIGH and the microinstruction at address 53 will be executed if the TEST input is LOW.

Instruction 7 is a CONDITIONAL JUMP via the contents of the Am2910 REGISTER/COUNTER or the contents of the PIPELINE register. This instruction is very similar to instruction 5, the conditional jump-to-subroutine via R or PL. The major difference between instruction 5 and instruction 7 is that no push onto the stack is performed with 7. Figure 5 depicts this instruction as a branch to one of two locations depending on the test condition. The example assumes the pipeline register contains the value 70 when the contents of address 52 is being executed. As the contents of address 53 is clocked into the pipeline register, the value 70 is loaded into the register/counter in the Am2910. The value 80 is available when the contents of address 53 is in the pipeline register. Thus, control is transferred to either address 70 or address 80 depending on the test condition.

Instruction 8 is the REPEAT LOOP, COUNTER $\neq$ ZERO instruction. This microinstruction makes use of the decrementing capability of the register/counter. To be useful, some previous instruction, such as 4, must have loaded a count value into the register/counter. This instruction checks to see whether the register/counter contains a non-zero value. If so, the register/counter is decremented, and the address of the next microinstruction is taken from the top of the stack. If the register/counter contains zero, the loop exit condition is occurring; control falls through to the next sequential microinstruction by selecting $\mu \mathrm{PC}$; the stack is POP'd by decrementing the stack pointer, but the contents of the top of the stack are thrown away.

An example of the REPEAT LOOP, COUNTER $\neq$ ZERO instruction is shown in Figure 5. In this example, location 50 most likely would contain a PUSH/CONDITIONAL LOAD COUNTER instruction which would have caused address 51 to be PUSHed on the stack and the counter to be loaded with the proper value for looping the desired number of times.
In this example, since the loop test is made at the end of the instructions to be repeated (microaddress 54), the proper value to be loaded by the instruction at address 50 is one less than the desired number of passes through the loop. With a 12-bit register/counter, this method allows a loop to be executed from 1 to 4096 times.
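
The off-by-one relationship between the loaded count and the number of passes can be confirmed with a short trace of the Figure 5 example. The following is a sketch only: the function name is hypothetical, the loop body is assumed to occupy addresses 51 through 54 with the REPEAT LOOP instruction at 54, and the PUSH at address 50 is assumed to have loaded the counter.

```python
def passes_through_loop(loaded_count):
    """Count passes through the loop body at 51..54 for a given counter load."""
    counter = loaded_count & 0xFFF   # 12-bit register/counter
    stack = [51]                     # instruction 4 at address 50 pushed 51
    pc, passes = 51, 0
    while pc != 55:                  # 55 is the fall-through address
        if pc == 51:
            passes += 1              # entering the loop body once more
        if pc == 54:                 # REPEAT LOOP, COUNTER != ZERO
            if counter != 0:
                counter -= 1
                pc = stack[-1]       # branch to the address on top of stack
            else:
                stack.pop()          # loop exit: POP and fall through
                pc = 55
        else:
            pc += 1                  # CONTINUE
    return passes


print(passes_through_loop(0), passes_through_loop(3), passes_through_loop(4095))
# 1 4 4096
```

A loaded value of zero still gives one pass, because the test sits at the bottom of the loop.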
Single-microinstruction loops provide a highly efficient capability for executing a specific microinstruction a fixed number of times. Examples include fixed rotates, byte swap, fixed point multiply, and fixed point divide.

Instruction 9 is the REPEAT PIPELINE REGISTER, COUNTER $\neq$ ZERO instruction. This instruction is similar to instruction 8 except that the branch address now comes from the pipeline register rather than the file. In some cases, this instruction may be thought of as a one-word file extension; that is, by using this instruction, a loop with the counter can still be performed when subroutines are nested five deep. This instruction's operation is very similar to that of instruction 8. The differences are that on this instruction, a failed test condition (counter not yet zero) causes the source of the next microinstruction address to be the $D$ inputs; and, when the test condition is passed (counter equal to zero), this instruction does not perform a POP because the stack is not being used.

In the example of Figure 5, the REPEAT PIPELINE, COUNTER $\neq$ ZERO instruction is instruction 52 and is shown as a single microinstruction loop. The address in the pipeline register would be 52. Instruction 51 in this example could be the LOAD COUNTER AND CONTINUE instruction (number 12). While the example shows a single microinstruction loop, by simply changing the address in the pipeline register, multi-instruction loops can be performed in this manner for a fixed number of times as determined by the counter.

Instruction 10 is the conditional RETURN-FROM-SUBROUTINE instruction. As the name implies, this instruction is used to branch from the subroutine back to the next microinstruction address following the subroutine call. Since this instruction is conditional, the return is performed only if the test is passed. If the test is failed, the next sequential microinstruction is performed. The example in Figure 5 depicts the use of the conditional RETURN-FROM-SUBROUTINE instruction in both the conditional and the unconditional modes. This example first shows a jump-to-subroutine at instruction location 52 where control is transferred to location 90. At location 93, a conditional RETURN-FROM-SUBROUTINE instruction is performed. If the test is passed, the stack is accessed and the program will transfer to the next instruction at address 53. If the test is failed, the next microinstruction at address 94 will be executed. The program will continue to address 97 where the subroutine is complete. To perform an unconditional RETURN-FROM-SUBROUTINE, the conditional RETURN-FROM-SUBROUTINE instruction is executed unconditionally; the microinstruction at address 97 is programmed to force $\overline{\mathrm{CCEN}}$ HIGH, disabling the test, and the forced PASS causes an unconditional return.
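
The call and return flow of this example can be traced in a few lines. The sketch below uses the addresses quoted from Figure 5; the function name and its two boolean parameters are hypothetical, and every address not mentioned in the text is assumed to hold a CONTINUE.

```python
def trace_subroutine(call_test_passed, early_return_test_passed):
    """Return the sequence of microprogram addresses executed, ending at 53."""
    pc, stack, executed = 50, [], []
    while pc != 54:                       # stop once address 53 has executed
        executed.append(pc)
        if pc == 52 and call_test_passed:
            stack.append(53)              # CJS: push return address, jump to 90
            pc = 90
        elif pc == 93 and early_return_test_passed:
            pc = stack.pop()              # CRTN with test passed
        elif pc == 97:
            pc = stack.pop()              # CRTN forced by CCEN-bar HIGH
        else:
            pc += 1                       # CONTINUE, or a failed conditional
    return executed


print(trace_subroutine(True, True))
# [50, 51, 52, 90, 91, 92, 93, 53]
print(trace_subroutine(True, False))
# [50, 51, 52, 90, 91, 92, 93, 94, 95, 96, 97, 53]
```

In both cases the stack is empty again when address 53 is reached, which is the stack maintenance the text relies on.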

Instruction 11 is the CONDITIONAL JUMP PIPELINE register address and POP stack instruction. This instruction provides another technique for loop termination and stack maintenance.

The example in Figure 5 shows a loop being performed from address 55 back to address 51. The instructions at locations 52, 53 and 54 are all conditional JUMP and POP instructions. At address 52, if the TEST input is passed, a branch will be made to address 70 and the stack will be properly maintained via a POP. Should the test fail, the instruction at location 53 (the next sequential instruction) will be executed. Likewise, at address 53, either the instruction at 90 or 54 will be subsequently executed, respective to the test being passed or failed. The instruction at 54 follows the same rules, going to either 80 or 55. An instruction sequence as described here, using the CONDITIONAL JUMP PIPELINE and POP instruction, is very useful when several inputs are being tested and the microprogram is looping waiting for any of the inputs being tested to occur before proceeding to another sequence of instructions. This provides the powerful jump-table programming technique at the firmware level.

Instruction 12 is the LOAD COUNTER AND CONTINUE instruction, which simply enables the counter to be loaded with the value at its parallel inputs. These inputs are normally connected to the pipeline branch address field which (in the architecture being described here) serves to supply either a branch address or a counter value depending upon the microinstruction being executed. There are altogether three ways of loading the counter: the explicit load by this instruction 12; the conditional load included as part of instruction 4; and the use of the $\overline{\mathrm{RLD}}$ input along with any instruction. The use of $\overline{\mathrm{RLD}}$ with any instruction overrides any counting or decrementation specified in the instruction, calling for a load instead. Its use provides additional microinstruction power, at the expense of one bit of microinstruction width. Instruction 12 is exactly equivalent to the combination of instruction 14 and $\overline{\mathrm{RLD}}$ LOW. Its purpose is to provide a simple capability to load the register/counter in those implementations which do not provide microprogrammed control for $\overline{\mathrm{RLD}}$.

Instruction 13 is the TEST END-OF-LOOP instruction, which provides the capability of conditionally exiting a loop at the bottom; that is, this is a conditional instruction that will cause the microprogram to loop, via the file, if the test is failed; otherwise it continues to the next sequential instruction. The example in Figure 5 shows the TEST END-OF-LOOP microinstruction at address 56. If the test fails, the microprogram will branch to address 52. Address 52 is on the stack because a PUSH instruction had been executed at address 51. If the test is passed at instruction 56, the loop is terminated and the next sequential microinstruction at address 57 is executed; the stack is also POP'd, thus accomplishing the required stack maintenance.

Instruction 14 is the CONTINUE instruction, which simply causes the microprogram counter to increment so that the next sequential microinstruction is executed. This is the simplest microinstruction of all and should be the default instruction which the firmware requests whenever there is nothing better to do.

Instruction 15, THREE-WAY BRANCH, is the most complex. It provides for testing of both a data-dependent condition and the counter during one microinstruction and provides for selecting among one of three microinstruction addresses as the next microinstruction to be performed. Like instruction 8, a previous instruction will have loaded a count into the register/counter while pushing a microbranch address onto the stack. Instruction 15 performs a decrement-and-branch-until-zero function similar to instruction 8. The next address is taken from the top of the stack until the count reaches zero; then the next address comes from the pipeline register. The above action continues as long as the test condition fails. If at any execution of instruction 15 the test condition is passed, no branch is taken; the microprogram counter register furnishes the next address. When the loop is ended, either by the count becoming zero, or by passing the conditional test, the stack is POP'd by decrementing the stack pointer, since interest in the value contained at the top of the stack is then complete.
The application of instruction 15 can enhance performance of a variety of machine-level instructions. For instance, (1) a memory search instruction to be terminated either by finding a desired memory content or by reaching the search limit; (2) variable-field-length arithmetic terminated early upon finding that the content of the portion of the field still unprocessed is all zeroes; (3) key search in a disc controller processing variable length records; (4) normalization of a floating point number.

As one example, consider the case of a memory search instruction. As shown in Figure 5, the instruction at microprogram address 63 can be Instruction 4 (PUSH), which will push the value 64 onto the microprogram stack and load the number N, which is one less than the number of memory locations to be searched before giving up. Location 64 contains a microinstruction which fetches the next operand from the memory area to be searched and compares it with the search key. Location 65 contains a microinstruction which tests the result of the comparison and also is a THREE-WAY BRANCH for microprogram control. If no match is found, the test fails and the microprogram goes back to location 64 for the next operand address. When the count becomes zero, the microprogram branches to location 72, which does whatever is necessary if no match is found. If a match occurs on any execution of the THREE-WAY BRANCH at location 65, control falls through to location 66 which handles this case. Whether the instruction ends by finding a match or not, the stack will have been POP'd once, removing the value 64 from the top of the stack.
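
The memory search flow can be modeled directly from the description above. This is a sketch under stated assumptions: `memory_search` and its return convention are hypothetical, the memory area is represented as a Python list, and the microinstructions at addresses 64 and 65 are folded into one loop iteration.

```python
def memory_search(memory, key, n):
    """Model the search loop; n is one less than the locations to search.

    Returns (exit_address, index): 66 with the matching index when the key
    is found, or 72 with None when the count is exhausted first.
    """
    counter = n & 0xFFF
    stack = [64]                  # instruction 4 at address 63 pushed 64
    index = 0
    while True:
        # Address 64: fetch the next operand and compare with the key.
        match = memory[index] == key
        # Address 65: THREE-WAY BRANCH.
        if match:                 # test passed: fall through to 66, POP
            stack.pop()
            return 66, index
        if counter != 0:          # test failed, count not zero: loop via stack
            counter -= 1
            index += 1
            continue
        stack.pop()               # test failed, count zero: branch via D, POP
        return 72, None


table = [5, 7, 9, 11]
print(memory_search(table, 9, len(table) - 1))   # (66, 2)
print(memory_search(table, 4, len(table) - 1))   # (72, None)
```

Loading N as one less than the search length means the last location is compared while the counter already holds zero, so a failed final comparison exits to address 72 without reading past the search area.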

## Am29811A Instruction Set Difference

The Am29811A instruction set is identical to the Am2910 except for instruction number 15. In the Am29811A, instruction number 15 is an unconditional JUMP PIPELINE REGISTER instruction. This provides the ability to unconditionally branch to any address contained in the branch address field of the microprogram. Thus, an unconditional N-way branch can be performed. Use of this instruction as opposed to a forced conditional jump pipeline instruction simply allows the condition code multiplexer select field to be shared (formatted) with other functions.

## TYPICAL COMPUTER CONTROL UNIT ARCHITECTURE USING THE Am2910

The microprogram memory control unit block diagram of Figure 6 is easily implemented using the Am2910. This architecture provides a structured state machine design capable of executing many highly sophisticated next address control instructions.
The architecture of Figure 6 shows an instruction register capable of being loaded with a machine instruction word from the data bus. The op code portion of the instruction is decoded using a mapping PROM to arrive at a starting address for the microinstruction sequence required to execute the machine instruction. When the microprogram memory address is to be the first microinstruction of the machine instruction sequence, the Am2910 next address control selects the multiplexer D input and enables the three-state output from the mapping PROM. When the current microinstruction being executed is selecting the next microinstruction address as a JUMP function, the JUMP address will be available at the multiplexer D input. This is accomplished by having the Am2910 select the next address multiplexer D input and also enabling the three-state output of the pipeline register branch address field. The register enable input to the Am2910 can be grounded so that this register will load the value at the Am2910 D input. The value at D is clocked into the Am2910's register (R) at the end of the current microcycle, which makes the $D$ value of this microcycle available as the $R$ value of the next microcycle. Thus, by using the branch address field of two sequential microinstructions, a conditional JUMP-TO-ONE-OF-TWO-SUBROUTINES or a conditional JUMP-TO-ONE-OF-TWO-BRANCH-ADDRESSES can be executed by either selecting the $D$ input or the $R$ input of the next address multiplexer.
When sequencing through continuous microinstructions in microprogram memory, the program counter in the Am2910 is used. Here, the control logic simply selects the PC input of the next address multiplexer. In addition, most of these instructions enable the three-state outputs of the pipeline register associated with the branch address field, which allows the register within the Am2910 to be loaded. The $5 \times 12$ stack in the Am2910 is used for looping and subroutining in microprogram operations. Up to five levels of subroutines or loops can be nested. Also, loops and subroutines can be intermixed as long as the five word depth of the stack is not exceeded.

## CCU TIMING

The minimum clock cycle that can be used in a CCU design is usually determined by the component delays along the longest "pipeline-register-clock to logic to pipeline-register-clock" path. At the beginning of any given clock cycle, data available at the output of the microprogram memory, counter status, and any other data and/or status fields, are latched into their associated pipeline registers. At this point, all delay paths begin. Visual inspection will not always point out the longest signal delay path.


Figure 6. A Typical Computer Control Unit Using the Am2910.

The obviously long paths are a good place to start, but each definable path should be calculated on a component by component basis until the truly longest logic signal path is found.
Referring to Figure 6, a number of potentially long paths can be identified. These include the instruction register to pipeline register time, the pipeline register to pipeline register time via the condition code multiplexer, and the status to pipeline register time. In order to demonstrate the technique for calculating the AC performance of the Am2910 state machine design, the timing diagrams of Figure 7 are presented. Here, a number of propagation delay paths are evaluated such that the reader can learn the technique for performing these computations.
All of the propagation delays have been calculated using typical propagation delays because at the time of this writing, the characterization of the Am2910 has not been completed. When the final data sheet is published, the user need only select the appropriate worst case specifications and he can compute the desired maximum propagation delays for his design. Also, by looking at the typical propagation delay numbers, the designer will be able to evaluate the design margin in the system after he has completed all of the worst case calculations. These typical propagation delays represent the expected values if a system were set up on the bench and actual measurements were taken at 5 V and $25^{\circ} \mathrm{C}$ operating temperature.

While Figure 6 and Figure 7 deal with the Am2910 microprogram sequencer, it is also instructive to evaluate the AC performance of a typical computer control unit using the Am2911 and Am29811A. Figure 3 shows such a connection and will be used as the basis for performing the propagation delay path calculations. The calculations for the various propagation delay paths are demonstrated in Figure 8 and are intended to show the
technique for computing these delays. As before, the typical propagation delays have been used in the computation for comparison purposes. The user can derive the maximum numbers at $25^{\circ} \mathrm{C}$ and 5 V , commercial temperature range and power supply variations or military temperature range and power supply variations as required for his design.
When Figure 7 and Figure 8 are reviewed in detail, the reader will recognize that the longest propagation delay paths in the case of the Am2910 as well as the Am2911 and Am29811A involve the three-state enables on the map PROM or the pipeline register for the branch address. If absolute maximum speed is desired, these paths can be eliminated by using one of several techniques. One technique is to simply allocate one or more bits in the pipeline register to control the three-state enables of the various devices connected to the $D$ input of the Am2910. For the example of Figure 6, one bit would be sufficient and the pipeline register could be implemented using an Am74S175 register. This would allow the true and complement outputs to be used to drive the pipeline register branch address output enable and the mapping PROM output enable. Thus, these longest paths would be eliminated and an improvement of about 30 ns would be achieved. A second technique for eliminating these propagation delay paths would be to use a four input NAND gate and a four input NOR gate to encode the equivalent function of the $\overline{M A P}$ enable and the $\overline{P L}$ enable. This technique is demonstrated in Figure 9. Again, an Am74S175 register would be used as the pipeline register to provide the instruction inputs to the Am2910 sequencer. This would allow instruction 2 to be decoded to provide the MAP enable signal and "NOT INSTRUCTION 2 " to be decoded as the pipeline enable signal. This technique can be applied as well to the computer control unit of Figure 3 to accomplish the same longest path elimination.
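
The second technique amounts to decoding instruction 2 outside the Am2910. Since Figure 9 is not reproduced here, the gate input assignment in the following sketch is an assumption: it uses the true and complement outputs of the instruction register, and it follows the text in treating every instruction other than 2 as enabling the pipeline branch address field. A design that uses instruction 6 would still need separate treatment of the vector source. `True` models a HIGH level, and both enables are active LOW.

```python
def output_enables(i3, i2, i1, i0):
    """Return (MAP_n, PL_n) for the instruction bits I3..I0."""
    # Four-input NAND over (not I3, not I2, I1, not I0):
    # LOW only for instruction 2, which enables the mapping PROM.
    map_n = not ((not i3) and (not i2) and i1 and (not i0))
    # Four-input NOR over (I3, I2, not I1, I0):
    # HIGH only for instruction 2, so the pipeline field is enabled otherwise.
    pl_n = not (i3 or i2 or (not i1) or i0)
    return map_n, pl_n


for code in range(16):
    bits = [bool((code >> k) & 1) for k in (3, 2, 1, 0)]
    map_n, pl_n = output_enables(*bits)
    print(f"{code:X}: MAP_n={'H' if map_n else 'L'} PL_n={'H' if pl_n else 'L'}")
```

The two outputs are always complementary, so exactly one source drives the $D$ inputs in every microcycle.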


Figure 7. Propagation Delay Calculations on the Am2910 Microprogram Sequencer.
c) CONDITIONAL JUMP

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 |
| :--- | :--- | :---: | :---: |
| S - REG | CP to Q | 9 | 9 |
| 2922 | D to Y | 13 | 13 |
| 2910 | CC to Y | 21 | - |
| PROM | ADDR to OUT | 30 | - |
| 2922 | SET-UP R | 5 | - |
| 2910 | SET-UP PC | - | 46 |
| TOTAL-ns |  | 78 | 68 |

d)

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| S - REG | CP to Q | 9 | 9 | 9 |
| 2910 | I to MAP | 27 | 27 | 27 |
| MAP-PROM | OE to OUT | 18 | 18 | 18 |
| 2910 | D to Y | 14 | - | - |
| PROM | ADDR to OUT | 30 | - | - |
| 2922 | SET-UP R | 5 | - | - |
| 2910 | SET-UP PC | - | 34 | - |
| 2910 | SET-UP R | - | - | 9 |
| TOTAL-ns |  | 103 | 88 | 63 |
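
Each total in these tables is simply the sum of the component delays along one path, and the largest total in a panel sets the typical minimum microcycle for that case. The sketch below recomputes the totals for panels c) and d) from the tabulated typical values; the dictionary layout and function name are illustrative only.

```python
# Typical delays in ns, copied from Figure 7 panels c) and d).
PANEL_C = {
    "PATH 1": [("S-REG CP to Q", 9), ("2922 D to Y", 13), ("2910 CC to Y", 21),
               ("PROM ADDR to OUT", 30), ("2922 SET-UP R", 5)],
    "PATH 2": [("S-REG CP to Q", 9), ("2922 D to Y", 13), ("2910 SET-UP PC", 46)],
}
PANEL_D = {
    "PATH 1": [("S-REG CP to Q", 9), ("2910 I to MAP", 27),
               ("MAP-PROM OE to OUT", 18), ("2910 D to Y", 14),
               ("PROM ADDR to OUT", 30), ("2922 SET-UP R", 5)],
    "PATH 2": [("S-REG CP to Q", 9), ("2910 I to MAP", 27),
               ("MAP-PROM OE to OUT", 18), ("2910 SET-UP PC", 34)],
    "PATH 3": [("S-REG CP to Q", 9), ("2910 I to MAP", 27),
               ("MAP-PROM OE to OUT", 18), ("2910 SET-UP R", 9)],
}


def path_totals(panel):
    """Sum the component delays along each path of one panel."""
    return {name: sum(ns for _, ns in steps) for name, steps in panel.items()}


for label, panel in (("c", PANEL_C), ("d", PANEL_D)):
    totals = path_totals(panel)
    print(label, totals, "longest:", max(totals.values()), "ns")
# c {'PATH 1': 78, 'PATH 2': 68} longest: 78 ns
# d {'PATH 1': 103, 'PATH 2': 88, 'PATH 3': 63} longest: 103 ns
```

Substituting worst case figures from the final data sheet for the typical values gives the maximum delays by the same arithmetic.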



Figure 8. Propagation Delay Calculations for the Am2911 and Am29811A Design.




Figure 9. Using NAND and NOR Gates to Improve Am2910 Speed.

In order to compare the performance of the Am2910 with the Am2911 and Am29811A, Table 5 is presented. Here the propagation delays for the Am2911 and Am29811A are for a 12-bit wide microprogram sequencer configuration. If a wider configuration is used, only one additional carry input to carry output delay must be added to the appropriate paths of these calculations. A 12-bit wide Am2911/29811A configuration has been evaluated so that an "apples to apples" comparison can be made.
As is shown in Table 5, a number of combinations are possible for the longest AC propagation delay paths for these microprogram sequencers. First, the continue instruction can be executed the fastest of any of the microprogram instructions if the continues are sequential. That is, from the second continue on, the typical microcycle can be 61 ns for the Am2910 or 64 ns for the Am2911/Am29811A. To achieve this speed, the various signals throughout the architecture must be stable, so that the only paths that enter into the propagation delay calculation are the clock-to-output of the microprogram counter, the microprogram memory and the pipeline register setup.
The second group of instructions shown in Table 5 shows some examples of instruction execution and jumping. These examples assume that the $\overline{\mathrm{MAP}}$ and $\overline{\mathrm{OE}}$ outputs are not used as described earlier. These calculations apply to several of the instructions but not to all of them. For the Am2910 sequencer, all of the propagation delays are around 80 to 85 ns, while for the Am2911/Am29811A combination, the propagation delays range from about 80 ns to 100 ns, depending on the instruction. It should be noted that certain other instructions, such as push and conditional load counter, should be evaluated to determine the speed at which they can be executed.

The last two instructions shown in Table 5 are for jumps where the output enable of the field supplying the address to the $D$ inputs of the microprogram sequencers is controlled by either the Am2910 or the Am29811A. Notice that for the Am2910 configuration, the jump map instruction represents the longest propagation delay path and is 103 ns typical. For the Am2911/Am29811A combination, the jump map instruction also represents the longest propagation delay path and is 109 ns typical.
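
The comparison just described can be reproduced mechanically. The following is a minimal sketch, assuming the typical delays of Table 5 are simply copied into a dictionary; the names `TABLE_5`, `longest_path` and `max_clock_mhz` are illustrative and do not come from the original text. The clock rate it prints assumes one fixed clock period that must accommodate every listed instruction, and it is based on typical rather than worst-case figures.

```python
# Typical longest-path delays in ns, copied from Table 5.
TABLE_5 = {
    "Continue":            {"Am2910": 61,  "Am2911/Am29811A": 64},
    "Instruction Execute": {"Am2910": 84,  "Am2911/Am29811A": 88},
    "Jump Map (no OE)":    {"Am2910": 83,  "Am2911/Am29811A": 78},
    "Jump PL (no OE)":     {"Am2910": 78,  "Am2911/Am29811A": 101},
    "Jump Map (via OE)":   {"Am2910": 103, "Am2911/Am29811A": 109},
    "Jump PL (via OE)":    {"Am2910": 98,  "Am2911/Am29811A": 104},
}


def longest_path(sequencer):
    """Return (instruction, delay_ns) for the slowest listed instruction."""
    instruction = max(TABLE_5, key=lambda name: TABLE_5[name][sequencer])
    return instruction, TABLE_5[instruction][sequencer]


def max_clock_mhz(delay_ns):
    """Clock rate whose period equals the given delay (assumes a fixed-period clock)."""
    return 1000.0 / delay_ns


for seq in ("Am2910", "Am2911/Am29811A"):
    name, delay = longest_path(seq)
    print(f"{seq}: {name}, {delay} ns, about {max_clock_mhz(delay):.1f} MHz")
```

The same structure extends to any design: list the delay of every path, then let the largest entry set the microcycle.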

It is not the purpose of this exercise to show every possible propagation delay path, but rather to show the reader the technique for computing propagation delays so that any design can be evaluated and the worst case path derived. Even here, not all of the worst case numbers shown in Table 5 have been derived in Figures 7 and 8. This was done intentionally and is left as an exercise for the student.

If the Am2909 or Am2911 and the Am29811A are combined into microprogram sequencers of either 8 bits or 16 bits in width, the calculations need only be modified slightly to determine the microcycle times. Obviously, if two Am2911s are used, the worst case propagation delay paths do not change. However, if four Am2911s are used, the carry path will become the longer propagation delay path in several of the computations. This may be offset, however, since larger microprogram PROMs may be used if 64K of microcode is actually being addressed, or high power buffers may be placed between the Am2911 outputs and the microprogram memory to provide sufficient drive for such a large microprogram store.
In addition, the Am2909 and Am2911 may be used without the Am29811A where the user wishes to generate a special purpose instruction set or very high speed control of the internal multiplexer and push/pop stack. In some designs, as much as 25 to 30 ns, typical, can be removed from the longest propagation delay paths of the design by using high speed Schottky SSI. While this has not been the typical case, some designers have used it to provide a performance improvement not achievable with a standard Schottky condition code multiplexer and the Am29811A next address control unit.

## APPLICATIONS

It should be understood that the microprogram state machine built using either the Am2910 or the Am2911/29811A represents a general purpose state machine controller. Applications for this type of microprogrammed control include uses in minicomputers, communications, instrumentation, controllers and peripherals as well as special purpose processors. Typically, the microprogrammed approach provides a more structured organization to the design and allows the design engineer the greatest flexibility in implementation.
It is important to understand that microprogrammed machines need not be part of a typical minicomputer type structure. That is, a general purpose minicomputer usually has a machine instruction set that is totally different from its microprogram instruction control. As such, it is essential that the designer new to computer design and microprogram design understand the difference between a machine instruction and a microprogram instruction. This differentiation is shown in Figure 10, where a typical 16-bit machine level instruction is compared with a typical microprogram instruction. The machine level instruction usually consists of 16 bits and, in this example, these bits are used to provide the op code, source register definition and destination register definition. The microprogram instruction, on the other hand, usually consists of anywhere from 32 to 128 bits in a typical minicomputer type design. Here, the bits are used to control the elemental functions of a machine such as the Am2910 instruction control and condition code multiplexer, the Am2903 source, ALU function and destination control and so forth. For purposes of this explanation, let us assume that the machine level instruction is available to the machine programmer while the microprogram

TABLE 5. SUMMARY OF LONGEST AC PATHS FOR MICROPROGRAM SEQUENCERS.

| Instruction | Am2910 (ns) | Am2911 / Am29811A (ns) | Comments |
| :---: | :---: | :---: | :---: |
| Continue | 61 | 64 | The fastest instruction. Assumes sequential continues! |
| Instruction Execute | 84 | 88 | If the $\overline{\mathrm{MAP}}$ and $\overline{\mathrm{PL}}$ outputs are not used. |
| Jump Map (no $\overline{\mathrm{OE}}$) | 83 | 78 | If the $\overline{\mathrm{MAP}}$ and $\overline{\mathrm{PL}}$ outputs are not used. |
| Jump PL (no $\overline{\mathrm{OE}}$) | 78 | 101 | If the $\overline{\mathrm{MAP}}$ and $\overline{\mathrm{PL}}$ outputs are not used. |
| Jump Map (via $\overline{\mathrm{OE}}$) | 103 | 109 | If the $\overline{\mathrm{MAP}}$ and $\overline{\mathrm{PL}}$ outputs are used. |
| Jump PL (via $\overline{\mathrm{OE}}$) | 98 | 104 | If the $\overline{\mathrm{MAP}}$ and $\overline{\mathrm{PL}}$ outputs are used. |



Figure 10. Understanding Machine and Microprogram Instructions.
instruction is not available to the machine programmer at the assembly language level. Let it suffice to say that this assumption is not necessarily valid in machines being designed today.
Perhaps one of the most typical applications of the microprogrammed computer control unit state machine design is as the controller for a minicomputer. Here, the function of the microprogrammed controller is to fetch and execute machine level instructions. The flow required to perform this function is depicted in Figure 11 which should be representative for all general purpose type machines. Figure 11 shows that after initialization, the computer control unit simply fetches machine instructions, decodes these instructions and then fetches the required operands such that the original instruction can be executed. This cycle of fetching and executing instructions is performed without end. Such things as hardware halts or resets are ignored and should be assumed to only cause re-initialization.
Once the flow of a typical computer control unit is understood, it is possible to evaluate a number of architectures using the Am2910 or Am2911/Am29811A such that the flow diagram of Figure 11 can be implemented.

## STATE MACHINE ARCHITECTURES

After a machine instruction is fetched from memory, it is normally placed in the machine instruction register as described in Figure 6. Then the op code portion of the instruction is decoded so that a sequence of microinstructions in the microprogram memory can be selected for execution. Each microinstruction is fetched and its contents placed in the pipeline register as shown in Figure 6 for execution.

While the architecture of Figure 6 is recommended and has been used throughout the preceding portion of this chapter, it should be understood that a number of architectures are possible using these microprogram sequencers. The normal flow in fetching microinstructions is to determine the address of the next microinstruction, fetch the contents at that address and set up this data at the input of the pipeline register such that it can be clocked into the pipeline register for execution. If we assume that a clock is being used to clock the pipeline register, the Am2910, the machine instruction register and the Am2903 microprocessor bit slices, it is possible to define a number of computer control unit designs where the relationship between the clock edges is different.
There seem to be a minimum of seven different architectures that can be defined based on placing registers in the appropriate signal paths and storing data on the low-to-high transition of the


Figure 11. Computer Control Flow Diagram.
clock. For purposes of this discussion, we will assume that all clocked devices operate using the same clock, such that changes occur on the LOW-to-HIGH transition of the clock. While it is possible to use multiphase clocks and tie different clock phases to different devices, that type of system operation will not be described here. In all cases, we will be talking about the flow of signals between LOW-to-HIGH transitions of the clock. Typically, a cycle is started by a clock edge at a device and the signals begin to flow from one device to the next until a set-up time to a clock edge results. Then, the next microinstruction is executed in exactly the same manner.

There are three different identifiable types of microinstruction sequences where only one register is in the signal flow loop. The first of these we shall call an Address-Based microinstruction cycle. It usually starts with the address of a microprogram memory word being stored in a register by the clock. This address has been determined by the previous microinstruction. The address then accesses the microprogram memory to fetch its contents, which are presented at its outputs to control the Arithmetic Logic Unit, and the results of the Arithmetic Logic Unit function may be used to determine the next address that will be stored in this microprogram address register. This is shown as Figure 12a.

The second type of microprogram architecture is called Instruction-Based. Here, the register is placed at the output of the microprogram memory as shown in Figure 12b. Again, the cycle consists of executing the microinstruction in the ALU, perhaps using the results of the operation to determine the address of the next microinstruction, and then fetching the contents of that microinstruction and setting this new data up at the input to the register.

The third basic architecture for microprogram control is called Data-Based. Here, a register is used to hold the status data from the ALU and this is the determining clock point for the cycle. The status register initiates the selection of the next address from which the microprogrammed data is fetched, and this microprogram instruction is used to execute a new function in the ALU, thereby setting up the results for the status register. This scheme is shown in Figure 12c. Note that this scheme requires an additional register at the output of the microprogram memory to hold a portion of the microprogram instruction for controlling the condition code multiplexer and Am2910 instruction set.

These primitive architectures for microprogrammed control demonstrate the three points at which a register can be placed to provide a start and an end for the microcycle. In a general sense, each of these three architectures is one level pipelined. This, however, is not the definition normally associated with pipelining of microprogram control.

If combinations of the above described architectures are implemented, an improvement in performance will be realized. In each of the three architectures thus described (address-based, instruction-based, and data-based), all of the signal paths are in series and must be traversed before a microcycle can be completed. They are quite easy to program, however, since all of the tasks are completed in the loop before proceeding to the next microinstruction. As stated earlier, these tend to be the slowest of the possible architectures for microprogram control. This disadvantage can be overcome by using a technique referred to as pipelining in microprogram control. In a pipeline architecture, we overlap the fetch of the next microinstruction with the execution of the current microinstruction. This is achieved by inserting additional registers in the overall path such that we can hold the signals step-by-step. There are three possible combinations of the above mentioned architectures that can be utilized in microprogram control. These are address-instruction-based, address-data-based, and instruction-data-based. While each of these represents two stages of pipelining, we normally refer to these as the pipelined architectures. These are shown in Figures 12d, 12e and 12f. It is the instruction-data-based architecture that is recommended for the Am2910 and provides the overall best trade-off in cost versus performance.

The last possible architecture using registers in the signal path is a combination of all three architectures. It is called address-instruction-data-based microprogram control and is shown in Figure 12g. Here, three stages of pipeline are involved and we normally refer to this as a two-level pipelined architecture. Needless to say, if no pipelining were involved at all, we would have a ring oscillator.
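
The count of seven architectures, and the reason pipelining shortens the microcycle, can be illustrated with a small model. This is only a sketch under stated assumptions: the loop is treated as three segments (memory fetch, ALU execute, next-address determination), a register may sit at any of the three points named above, and the delays in `EXAMPLE_DELAYS_NS` are hypothetical round numbers chosen for illustration, not data sheet values. Register set-up and clock-to-output times are ignored.

```python
from itertools import combinations

# The three points at which a register can be placed in the loop.
POINTS = ("address", "instruction", "data")

# Loop segment that follows each register point (an assumed simplification).
STAGE_AFTER = {
    "address": "fetch",          # microprogram memory access
    "instruction": "execute",    # ALU operation
    "data": "next_address",      # sequencer selects the next address
}

# Hypothetical segment delays in ns, for illustration only.
EXAMPLE_DELAYS_NS = {"fetch": 50, "execute": 60, "next_address": 30}


def architectures():
    """Every non-empty choice of register points: 3 + 3 + 1 = 7 architectures."""
    return [combo for r in range(1, len(POINTS) + 1)
            for combo in combinations(POINTS, r)]


def cycle_time(registers, delay_ns):
    """Longest register-to-register path once around the loop."""
    start = POINTS.index(registers[0])
    order = POINTS[start:] + POINTS[:start]   # begin the walk at a register
    worst = run = 0
    for point in order:
        if point in registers:                # a register ends the current path
            worst = max(worst, run)
            run = 0
        run += delay_ns[STAGE_AFTER[point]]
    return max(worst, run)


for arch in architectures():
    print("-".join(arch) + "-based:", cycle_time(arch, EXAMPLE_DELAYS_NS), "ns")
```

With these assumed delays, each single-register architecture needs the full 140 ns loop, while the instruction-data-based arrangement needs only 80 ns, because the next-address determination and memory fetch proceed in parallel with ALU execution.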


Shaded lines show the required signal flow to complete a microcycle: determine address, fetch instruction and execute.

Figure 12. Standard Microprogram Control Architectures.

The advantage of the instruction-data-based architecture is that the address and contents of the next microinstruction are being fetched while the current microinstruction in the pipeline register (Figure 6) is being executed. This allows a shorter microcycle since the microprogram memory fetch and ALU execution can be operated in parallel. The results of this type of operation are demonstrated in Figure 13, where we see a typical timing diagram of microprogram execution for the instruction-data-based architecture. It should be noted that when the computational aspects of a microinstruction are not completed in the same microcycle, they obviously cannot be used to determine the address of another microcycle until the computation has been completed and stored in the status register. Thus, this pipelined architecture offers significant speed improvement except in the case of certain conditional jumps. In other words, the conditional jump may not use the status register information of the immediately preceding microinstruction because the computation is just being performed. For this architecture, the conditional jump fetch must be executed on the cycle after the status register contains the proper execution results. This can be seen by studying Figure 13. In most microprogram designs this is not a disadvantage because other housekeeping and ALU operations can be performed while the address of the next microinstruction is being determined using the current contents of the status register. While it is not directly pertinent to the discussion at this time, let us point out that the Am2904 has been designed such that the machine architect can utilize both the instruction-data-based architecture and the instruction-based architecture if no housekeeping is required. Thus, the Am2910 and Am2904 can be used in a variable architecture cycle to achieve maximum performance for the machine.


Figure 13. Timing Diagram of Microprogram Execution.


Figure 14. Typical Am2910 Microprogram Control Unit.

## The Am2910 in Computer Control

A general state machine design using the Am2910 is shown in Figure 14. Here, all three output enables are used to advantage in order to control the mapping PROM, pipeline register and vector PROM in this design. This design is very straightforward and in fact is identical to that shown earlier.

One area that should not be overlooked is that of initializing the Am2910 at power up. One technique for accomplishing this is to use a pipeline register with a clear input to provide all LOWs to the instruction inputs of the Am2910. This will cause a reset of the stack in the Am2910 and force the outputs to word zero of the microcode, which can be used for the initialization routine. Typically, power up will result in the firing of a timer which can be connected to the clear input of the register. Figure 15 shows the technique for initializing the Am2910 using this method.
One advantage of the Am2909 when compared to either the Am2910 or Am2911 is the OR inputs to the microprogram address field. These OR inputs allow two, four, eight or 16-way branching for each device if proper control is used. This control can be accomplished using the Am29803A, 16-way branch control unit. A typical computer control unit using the Am2909, Am2911, Am29803A and Am29811A is shown in Figure 16. In this example, the least significant microprogram control sequencer is an Am2909 and the two more significant sequencers are Am2911s.


Figure 15. Initializing the Am2910.


Figure 16. A High Performance Microprogram Controller Using the Am2909, Am29811A and Am29803A.

## DETAILED DESCRIPTION OF THE Am2911 AND Am29811A IN A COMPUTER CONTROL UNIT

The detailed connection diagram of a straight-forward computer control unit is shown in Figure 17. This design features all of the next address control functions described previously and a few features have also been added.

Referring to Figure 17, the instruction register consists of two Am25LS377 Eight-Bit Registers with Clock Enable. These registers are designated as U1 and U2 and provide the ability to selectively load a 16-bit instruction. This particular design assumes that the instruction word consists of an eight-bit op code as well as eight bits of other data. Therefore, the op code is decoded using three 256-word by 4-bit PROMs. The Am29761 has been selected for this function and is shown in Figure 17 as U3, U4 and U5.
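
The op code decoding described here amounts to a table lookup: the eight-bit op code addresses all three PROMs at once, and their three four-bit outputs together form a 12-bit microprogram starting address. The sketch below models that lookup. The function name `map_opcode`, the assignment of PROMs to address nibbles (most significant first) and the PROM contents are all assumptions made for illustration; the actual wiring is defined by Figure 17.

```python
def map_opcode(opcode, proms):
    """Form a 12-bit start address from three 256 x 4 mapping PROMs.

    proms is a sequence of three 256-entry lists of 4-bit values, ordered
    most significant nibble first (an assumed ordering).
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("op code must fit in eight bits")
    address = 0
    for prom in proms:
        address = (address << 4) | (prom[opcode] & 0xF)
    return address


# Hypothetical PROM contents: every op code maps to address 0x000 except 0x3A.
prom_high, prom_mid, prom_low = [0] * 256, [0] * 256, [0] * 256
prom_high[0x3A], prom_mid[0x3A], prom_low[0x3A] = 0x1, 0xC, 0x4

start = map_opcode(0x3A, (prom_high, prom_mid, prom_low))
print(f"op code 0x3A starts at microprogram address 0x{start:03X}")
```
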
The basic control function for the microprogram memory is provided by the Am2911s. In this design, three Am2911s (U6, U7, and U8) are used so that up to 4K words of microprogram memory can be addressed. The microprogram memory can consist of PROMs, ROMs, or RAMs, depending on the particular design and the point of its development. This particular design shows the capability of a 64-bit microword; however, the actual number of bits used will vary from design to design.
The pipeline register associated with the computer control unit consists of five integrated circuits designated U16, U17, U18, U19 and U20.

One of the features of the architecture depicted in Figure 17 is the event counter shown as U9, U10 and U11. This event counter consists of three Am25LS163s connected as a 12-bit counter. The counter can be parallel loaded with a 12-bit word from pipeline registers U18, U19 and U20. The multiplexer and D-type flip-flop (U21 and U22) at the counter overflow output (U9) is present to improve system cycle time and will be described in detail later.

MULTIPLEXER SELECT

| $\mathbf{R}_{20}$ | $\mathbf{R}_{19}$ | $\mathbf{R}_{18}$ | $\mathbf{R}_{17}$ | SELECT |
| :---: | :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | 0 | TEST 0 |
| 0 | 0 | 0 | 1 | TEST 1 |
| 0 | 0 | 1 | 0 | TEST 2 |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| 1 | 1 | 1 | 1 | TEST 15 |

POLARITY CONTROL

| $\mathbf{R}_{16}$ | OUTPUT |
| :---: | :--- |
| 0 | COMPLEMENT OF TEST |
| 1 | TRUE TEST |

NEXT ADDRESS CONTROL

| $\mathbf{R}_{15}$ | $\mathbf{R}_{14}$ | $\mathbf{R}_{13}$ | $\mathbf{R}_{12}$ | FUNCTION |
| :---: | :---: | :---: | :---: | :---: |
| X | X | X | X | NEXT <br> INSTRUCTION |

MACHINE INSTRUCTION REGISTER

| $\mathbf{R}_{\mathbf{2 1}}$ | FUNCTION |
| :---: | :---: |
| 0 | LOAD |
| 1 | HOLD |

CONTROL VALUE

| $\mathbf{R}_{\mathbf{1 1}}-\mathbf{R}_{\mathbf{0}}$ | FUNCTION |
| :---: | :---: |
| $X X X \cdots X X X$ | VALUE |

JUMP ADDRESS

| $\mathrm{BR}_{11}-\mathrm{BR}_{\mathbf{0}}$ | FUNCTION |
| :---: | :---: |
| $X X X \cdots X X X$ | JUMP ADDRESS |



Figure 17. Computer Control Unit with Am2911.


This design also features a 16-input condition code multiplexer using two Am74S251s, which are designated U12 and U14. Condition code polarity control capability has been added to the design by using an Am74S158 Two-Input Multiplexer designated as U13. The W outputs and Y outputs from U12 and U14 have been connected together, but only one set of outputs will be enabled at a time via the three-state control signals designated $\mathrm{R}_{20}$ and $\overline{\mathrm{R}_{20}}$. Since the W output is inverting and the Y output is non-inverting, the two-input multiplexer, U13, can be used to select the test condition as either inverting or non-inverting. This allows the test input on the Am29811A Next Address Control Unit, U15, to execute conditional instructions on either the inverted or non-inverted polarity of the test signal. For example, a CONDITIONAL BRANCH may be performed on either carry set or carry reset. Likewise, the same CONDITIONAL BRANCH might be performed on either the sign bit as a logic one or the sign bit as a logic zero.

Note that the Am29811A Next Address Control Unit has eight outputs. Four outputs control the Am2911's $\mathrm{S}_{0}$, $\mathrm{S}_{1}$, PUP and $\overline{\mathrm{FE}}$ inputs. Two outputs control the three-state enables of the devices connected to the D inputs, i.e., a map enable ($\overline{\text{MAP E}}$) to select the mapping PROMs and a pipeline enable ($\overline{\text{PL E}}$) to enable the three-state Am2918 outputs which make up a 12-bit wide branch address field. The remaining two Am29811A outputs are for loading and enabling the Am25LS163 counters. CNT ENABLE from the Am29811A is active-LOW while the Am25LS163 counter requires an active-HIGH enable; therefore, CNT ENABLE from the Am29811A is passed through one section of the Two-Input Multiplexer (U13) for inversion. An alternative counter, the Am25LS169, has an active-LOW enable; therefore, this inversion through U13 is not required.

At this point, a discussion of the typical operation of this computer control unit is in order. First, bits 0-11 of the microprogram memory output word are connected to the pipeline register designated U18, U19 and U20. The Am2918 has been selected for this portion of the pipeline register because it has both continuous outputs and three-state outputs. The three-state outputs are connected to the D inputs of the Am2911 to provide a branch address whenever needed. These 12 bits are designated $\mathrm{BR}_{0}-\mathrm{BR}_{11}$. The $Q$ outputs of these same Am2918s are designated $R_{0}-R_{11}$ and are connected to the parallel load inputs of the Am25LS163 counters. Thus, the counter can be loaded with any value between 0 and 4,095. Many designs will take advantage of $R_{0}-R_{11}$ and use it as a general purpose field whenever the counter is not being loaded or a jump pipeline is not being performed. Using a microprogram memory field for more than one function (branch address and counter load value in this example) is called FORMATTING and will be covered in greater detail later.

The other two devices in the pipeline register shown in the architecture of Figure 17 are U16 and U17. First, U17 receives four bits (12, 13, 14 and 15) from the microprogram memory to provide a four-bit instruction field to the Am29811A. This four-bit field, designated $R_{12}-R_{15}$, provides the actual next address control instruction for the computer control unit. $\mathrm{R}_{16}$ is the polarity control bit for the test input and is connected to the select input of the Am74S158 Two-Input Multiplexer. When $\mathrm{R}_{16}$ is LOW, the signal at the Am29811A test input will be inverted, but when $\mathrm{R}_{16}$ is HIGH, the test input will be non-inverted.

The Am74S175 has been used as part of the pipeline register (U16) because it has both inverting and non-inverting outputs. Signals $R_{17}, R_{18}$ and $R_{19}$ are used to control the A, B and C inputs of the One-of-Eight Multiplexers (U12 and U14). Pipeline register outputs $\mathrm{R}_{20}$ and $\overline{\mathrm{R}_{20}}$ are used to enable either the U12 outputs or the U14 outputs such that a one-of-sixteen multiplexer function is implemented. In this design, the TEST 0 input of U14 is connected to ground. This provides a convenient path for converting any of the conditional instructions to non-conditional instructions. That is, any of the conditional instructions can be executed unconditionally by selecting the TEST 0 input, which is connected to ground, and forcing the polarity control to either the inverting or non-inverting condition. This allows the execution of unconditional JUMP, unconditional JUMP-TO-SUBROUTINE, and unconditional RETURN-FROM-SUBROUTINE instructions.
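
The behavior of the condition code multiplexer and polarity control can be summarized in a few lines. The following sketch models only the logical function described above (sixteen test inputs selected by $R_{17}$ through $R_{20}$, with $R_{16}$ choosing true or complemented polarity); it does not model the individual devices U12, U13 and U14, and the function names and sample test values are illustrative assumptions.

```python
def select_test(tests, r20, r19, r18, r17):
    """One-of-sixteen selection: R20 is the most significant select bit."""
    index = (r20 << 3) | (r19 << 2) | (r18 << 1) | r17
    return tests[index]


def sequencer_test_input(tests, select_bits, r16):
    """Value presented to the next address control unit's test input.

    R16 HIGH passes the selected test unchanged; R16 LOW complements it.
    """
    value = select_test(tests, *select_bits)
    return value if r16 else 1 - value


# Sixteen hypothetical test lines; TEST 0 is tied to ground as in the text.
tests = [0] * 16
tests[2] = 1   # suppose the condition on TEST 2 is currently true

print(sequencer_test_input(tests, (0, 0, 1, 0), r16=1))  # TEST 2, true polarity
print(sequencer_test_input(tests, (0, 0, 1, 0), r16=0))  # TEST 2, complemented
print(sequencer_test_input(tests, (0, 0, 0, 0), r16=0))  # grounded TEST 0, complemented
```

Because TEST 0 never changes, selecting it yields a constant at the test input for either setting of $R_{16}$, which is what makes a conditional instruction behave unconditionally.
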
Bit 21 from the microprogram memory utilizes a flip-flop in U17 as part of the pipeline register. This output, $\mathrm{R}_{21}$, is used as the enable input to the instruction register. Needless to say, other techniques for encoding this enable function in a formatted field could be provided.
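
The pipeline register fields described for Figure 17 can be collected into a single layout. The sketch below packs and unpacks the 22 bits $R_{0}$ through $R_{21}$ using the bit positions given in the text and in the Figure 17 tables; the field names, the helper functions and the sample values (including the next address code 3, which is an arbitrary placeholder rather than a documented Am29811A instruction) are assumptions for illustration.

```python
# Field layout taken from the text: name -> (least significant bit, width).
FIELDS = {
    "value":        (0, 12),   # R0-R11: branch address or counter load value
    "next_address": (12, 4),   # R12-R15: Am29811A instruction
    "polarity":     (16, 1),   # R16: 0 = complement of test, 1 = true test
    "test_select":  (17, 4),   # R17-R20: condition code multiplexer select
    "ir_hold":      (21, 1),   # R21: 0 = load instruction register, 1 = hold
}


def encode(**values):
    """Pack named field values into one integer microword fragment."""
    word = 0
    for name, (lsb, width) in FIELDS.items():
        value = values.get(name, 0)
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lsb
    return word


def decode(word):
    """Split a microword fragment back into its named fields."""
    return {name: (word >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


word = encode(value=0x0F8, next_address=3, polarity=1, test_select=2, ir_hold=1)
print(f"0x{word:06X}", decode(word))
```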

## A HIGH PERFORMANCE COMPUTER CONTROL UNIT USING THE Am2909 AND Am29803A

The high performance CCU (Figure 18) is of a similar basic design as the previously described CCU. The major differences are, referring to Figure 18, the addition of an extended enable control (U16), a vector input (U24 and U25), and an Am29803A 16-way Branch Control Unit (U23). These performance enhancements are more related to function than to actual circuit speed. The use of these enhancements by the microprogram provides greater flexibility in controlling a machine's environment, and can reduce the microinstruction count required to perform a particular task, which has the effect of increasing overall system throughput.
In describing this high performance CCU design, those sections which remain unchanged from the previous description (Figure 17), will not be covered again. This includes the mapping PROMs, sequencer, Am29811A, counter, condition test inputs and associated polarity control, and the pipeline register. The areas that will be covered are: extended enable control (U16), Vector inputs (U24 and U25), and the Am29803A 16-way Branch Control Unit (U23).

## Extended Enable Control

Extended enable control is accomplished via an Am74S139 dual two-to-four line decoder in conjunction with the Am29811A next address control unit. In Figure 17, PL E and MAP E of the Am29811A were connected directly to the components that they are to control (pipeline registers and mapping PROMs, respectively). Likewise, CNT LOAD and CNT ENABLE are connected directly to the counters that they control (with the exception that CNT ENABLE requires inversion when using Am25LS163 counters). In Figure 18, PL E, MAP E, CNT LOAD and CNT ENABLE go to the inputs of the Am74S139 two-to-four line decoder (U16). When either PL E or MAP E is LOW, then either $2\mathrm{Y}_{1}$ or $2\mathrm{Y}_{2}$ of U16 is LOW and either the pipeline branch address registers or the mapping PROMs are enabled. If both PL E and MAP E are HIGH, then output $2\mathrm{Y}_{3}$ of U16 is LOW, enabling the three-state outputs of U24 and U25, which are alternate microprogram starting address decoders (alternate mapping PROMs) and are called VECTOR INPUT in this design. Likewise, $\overline{\text{CNT LOAD}}$ and CNT ENABLE follow the same rules, enabling the counter to load or count via $1\mathrm{Y}_{1}$ and $1\mathrm{Y}_{2}$ of U16.
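
The enable decoding can be sketched as follows. One half of a two-to-four line decoder drives exactly one of its four active-LOW outputs LOW, chosen by its two select inputs. The text does not say which Am29811A signal drives which select input, so the wiring below (PL E on select A, MAP E on select B) is an assumption, as are the function names; Figure 18 is the authority for the actual connections.

```python
def decode_2_to_4(b, a):
    """One half of a 2-to-4 line decoder with active-LOW outputs (Y0, Y1, Y2, Y3)."""
    selected = (b << 1) | a
    return tuple(0 if n == selected else 1 for n in range(4))


def d_input_source(pl_e, map_e):
    """Which device drives the sequencer D inputs (assumed wiring: A = PL E, B = MAP E)."""
    y0, y1, y2, y3 = decode_2_to_4(b=map_e, a=pl_e)
    if y3 == 0:
        return "vector"      # both enables HIGH: vector input PROMs
    if y2 == 0:
        return "pipeline"    # PL E LOW: pipeline branch address register
    if y1 == 0:
        return "map"         # MAP E LOW: mapping PROMs
    return None              # both LOW: not described in the text


for pl_e, map_e in ((0, 1), (1, 0), (1, 1)):
    print(f"PL E={pl_e} MAP E={map_e} ->", d_input_source(pl_e, map_e))
```

The design point is that the case in which neither enable is asserted, which selected nothing in Figure 17, is put to use as a third D-input source without adding any microword bits.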

## Vector Input

The "Vector Input" provides the system designer with a powerful next starting address control. For example, one possible use might be as an interrupt vector. For instance, use the "Interrupt Request" output of an Am2914 Vectored Priority Interrupt Controller (or group of Am2914s) as an input to one of the conditional test inputs of the multiplexers (U12 or U14). Then connect the Am2914 Vector Out lines to the vector mapping PROMs (Vector Input, U24 and U25). The microprogram could then, at the appropriate time, test for a pending interrupt and, if one is present, jump in microprogram memory directly to the routine which handles the specific interrupt as requested via the Am2914 Vector Out lines. This routine will take the proper steps to preserve the status of the interrupt system, and then will service the interrupt. This is one of many possible uses for the Vector Input. Other possible uses include both hardware and software "TRAP" routines and so forth. As can be seen, the design presented here uses the Vector Enable line (output $2\mathrm{Y}_{3}$ of U16) to enable an alternate starting address input at the Am2911. This, however, does not preclude the use of other devices in place of mapping PROMs as the D-input vector source.

It should be understood that this does not accomplish a "microinterrupt" function in that it is not a random possibility. Instead a microprogrammed test is made and an alternate microroutine is performed. A true "microprogram interrupt" is one that could occur at any microinstruction. The Am2910 does not handle this case internally.

## Am29803A 16-Way Branch Control Unit

The Am29803A provides 16-way branch control when used in conjunction with the Am2909 bipolar microprogram sequencer, and is shown as U23 in Figure 18 with its pipeline register U22. The Am29803A has four TEST-inputs, four INSTRUCTION-inputs, four OR-outputs, and an enable control. The four OR-outputs connect directly to the Am2909 OR-inputs (U8 in Figure 18). The four INSTRUCTION-inputs to the Am29803A provide control over the TEST-inputs and OR-outputs, and are provided by the microprogram via the pipeline register U22 (Figure 18).
Basically, the INSTRUCTION-inputs $\left(\mathrm{I}_{0}-\mathrm{I}_{3}\right)$ provide sixteen instructions $\left(0-\mathrm{F}_{16}\right)$ which can select sixteen possible combinations of the TEST-inputs and provide a specific output on the OR-outputs depending upon the state of the inputs being tested. (The subscript 16 refers to base 16.) All possible combinations of INSTRUCTION-inputs, TEST-inputs and OR-outputs are shown in Figure 19.

Note that instruction zero does not test any inputs (a disable instruction). Instructions 1, 2, 4 and 8 test one input and can cause a branch to one of two words. Instructions 3, 5, 6, 9, 10 and 12 test two inputs and can jump to one of four words (a 4-word page). Instructions 7, 11, 13 and 14 test three inputs and can jump on an eight-word page. Instruction number 15 tests all four inputs and the result can jump to any word on a sixteen-word page.

## USING THE Am29803A

In the architecture of Figure 18, the Am29803A allows 2-way, 4-way, 8-way or 16-way branching as determined by selectable combinations of the TEST-inputs. Referring to Figure 19, the ZERO instruction (all instruction bits LOW) inhibits the testing of any TEST-inputs, thus providing LOW OR-outputs. Any single TEST-input selected ($\mathrm{T}_{0}$, $\mathrm{T}_{1}$, $\mathrm{T}_{2}$ or $\mathrm{T}_{3}$) will result in $\mathrm{OR}_{0}$ being HIGH or LOW in correspondence with the polarity of the selected TEST-input. Selecting any combination of two TEST-inputs results in the outputs $\mathrm{OR}_{0}$ and/or $\mathrm{OR}_{1}$ being HIGH or LOW, following a mapped one-to-one relationship; i.e., no matter which pair of TEST-inputs is selected, their HIGH/LOW condition is mapped to the $\mathrm{OR}_{0}$ and $\mathrm{OR}_{1}$ outputs. Likewise, selecting any three TEST-inputs will map their HIGH/LOW condition to the $\mathrm{OR}_{0}$, $\mathrm{OR}_{1}$ and $\mathrm{OR}_{2}$ outputs. Selecting all four TEST-inputs, of course, causes a one-to-one relationship to exist between the HIGH/LOW conditions of the TEST-inputs and the corresponding OR-outputs. Refer to Figure 19 to verify the relationships between INSTRUCTION-inputs, TEST-inputs, and OR-outputs. It is very important that the mapping relationship between these signals be completely understood.

When using the Am29803A TEST-OR capability as shown in Figure 18, the microprogrammer must position the applicable microcode within microprogram memory so that the low-order address bits are available for ORing. Sequencer instructions using the Am2909/2911 D-inputs (JRP, JSRP, JP and CJS in particular) are ideally suited for the Am29803A TEST-OR capability. The jump-to location, available via pipeline $\mathrm{BR}_{0}-\mathrm{BR}_{11}$ or the Am2909/2911 register, can contain the address of a branch table. A branch table is merely a sequential series of unconditional jump instructions. The particular jump instruction executed is determined by the low-order address bits; that is, the first jump instruction in a branch table must start at a location in microprogram memory whose low-order address bit (or bits) is zero. If a single Am29803A TEST-input is selected (2-way branching), then only the least significant bit in the beginning branch table address needs to be zero. Two Am29803A TEST-inputs selected (4-way branching) requires that the branch table start on an address with the low-order two bits equal to zero; 8-way branching requires three low-order zero bits, and 16-way branching requires four low-order zero address bits.

Understanding this branch control concept is really quite simple. The branch table is located in microprogram memory beginning at a location whose address has sufficient low-order zero bits to accommodate the number of selected Am29803A TEST-inputs. If, for instance, three TEST-inputs were selected, the first jump instruction in the branch table must be at an address whose low-order three bits are zero, such as address $0\mathrm{F8}_{16}$. The second jump instruction in the branch table would be at microprogram memory address $0\mathrm{F9}_{16}$, the third jump at location $0\mathrm{FA}_{16}$, the fourth at $0\mathrm{FB}_{16}$, and so on through all eight locations ($0\mathrm{F8}_{16}-0\mathrm{FF}_{16}$). Assume the following pipeline instruction (referring to Figure 18): (1) U22 selects three Am29803A TEST-inputs, (2) U18 instructs the Am29811A Next Address Controller to select the Am2909/2911 D-inputs, (3) U16 enables the pipeline branch address as the D source, and (4) U19, U20 and U21 supply the address $0\mathrm{F8}_{16}$ as the branch address. The Am29803A TEST-inputs will be ORed into the low-order three bit positions, thus providing a jump entry into the branch table indexed by the value of the OR bits. Each instruction in the branch table is usually a jump instruction, which allows the selection of a particular microcode routine determined by the value presented at the Am29803A TEST-inputs. Alternatively, each branch table entry can itself be the first instruction of the particular sequence. There are, of course, many other ways to use the Am29803A 16-way Branch Control Unit.
The microprogram memory address supplied via an Am2909 sequencer can be modified by the Am29803A 16-way Branch Control Unit. Remember, however, that the microcode associated with this address modification relies on certain address bits being zero, therefore this microcode is not arbitrarily relocatable. The above discussion describes using the D-input and branching to provide low-order zeroes to use the OR inputs. Through proper design, the Register, PC Counter, or File can be used equally well.
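
The TEST-to-OR mapping and the alignment rule for branch tables can be sketched in a few lines of Python. This is only an illustrative model, not a description of the Am29803A internals: the function names `or_outputs`, `table_is_aligned` and `branch_target` are hypothetical, and the instruction and TEST inputs are assumed to be packed into 4-bit integers with $\mathrm{I}_{0}$ and $\mathrm{T}_{0}$ as bit 0.

```python
def or_outputs(instruction: int, tests: int) -> int:
    """Pack the selected TEST bits into the low-order OR outputs.

    instruction: 4-bit select mask (bit n set selects TEST input T_n).
    tests:       4-bit value present on T3..T0.
    The lowest-numbered selected input drives OR0, the next drives OR1, etc.
    """
    result = 0
    position = 0
    for n in range(4):
        if instruction & (1 << n):
            result |= ((tests >> n) & 1) << position
            position += 1
    return result


def table_is_aligned(base: int, instruction: int) -> bool:
    """True if base has one low-order zero bit per selected TEST input."""
    selected = bin(instruction & 0xF).count("1")
    return base & ((1 << selected) - 1) == 0


def branch_target(base: int, instruction: int, tests: int) -> int:
    """OR the packed TEST bits into the low-order bits of the branch address."""
    if not table_is_aligned(base, instruction):
        raise ValueError("branch table base lacks low-order zero bits")
    return base | or_outputs(instruction, tests)
```

With three TEST-inputs selected and a branch table at $0\mathrm{F8}_{16}$, a call such as `branch_target(0x0F8, 0b0111, 0b101)` selects the sixth entry of the table, while a table base of `0x0F9` would be rejected as misaligned.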

## THE COMPLETE COMPUTER CONTROL UNIT USING THE Am2910

A detailed connection diagram for a straightforward computer control unit using the Am2910 is shown in Figure 20. This design utilizes the Am25LS377 as U1 and U2 to implement a 16-bit instruction register. The op code outputs from the instruction register drive three Am29761 PROMs to perform the op code decoding function. These are shown in the diagram of Figure 20 as U3, U4 and U5. The Am2910 sequencer (U6) is used to perform the basic microprogram sequencing function.

MULTIPLEXER SELECT

| $\mathbf{R}_{\mathbf{2 0}}$ | $\mathbf{R}_{19}$ | $\mathbf{R}_{\mathbf{1 8}}$ | $\mathbf{R}_{\mathbf{1 7}}$ | SELECT |
| :---: | :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | 0 | TEST 0 |
| 0 | 0 | 0 | 1 | TEST 1 |
| 0 | 0 | 1 | 0 | TEST 2 |
|  |  | $\bullet$ |  |  |
|  |  | $\bullet$ |  |  |
| 1 | 1 | 1 | 1 | TEST 15 |

POLARITY CONTROL

| $\mathbf{R}_{16}$ | OUTPUT |
| :---: | :--- |
| 0 | COMPLEMENT OF TEST |
| 1 | TRUE TEST |

NEXT ADDRESS CONTROL

| $\mathbf{R}_{15}$ | $\mathbf{R}_{14}$ | $\mathbf{R}_{13}$ | $\mathbf{R}_{12}$ | FUNCTION |
| :---: | :---: | :---: | :---: | :--- |
| $\times$ | $x$ | $x$ | $X$ | NEXT <br> INSTRUCTION |

MACHINE INSTRUCTION REGISTER

| $\mathbf{R}_{\mathbf{2 1}}$ | FUNCTION |
| :---: | :---: |
| 0 | LOAD |
| 1 | HOLD |

COUNTER VALUE

| $\mathbf{R}_{11}-\mathbf{R}_{\mathbf{0}}$ | FUNCTION |
| :---: | :---: |
| $X X X \cdots X X$ | VALUE |

JUMP ADDRESS

| BR $_{\mathbf{1 1}}-\mathrm{BR}_{\mathbf{0}}$ | FUNCTION |
| :---: | :---: |
| $X X X \cdots X X X$ | JUMP ADDRESS |

OR BRANCH CONTROL

| $\mathbf{R}_{25}$ | $\mathbf{R}_{24}$ | $\mathbf{R}_{23}$ | $\mathbf{R}_{22}$ | FUNCTION |
| :---: | :---: | :---: | :---: | :---: |
| x | x | x | x | TEST <br> INSTRUCTION |



Figure 18. High Performance Computer Control Unit with Am2909/2911.
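
The field assignments listed above for the pipeline register of Figure 18 can be summarized as a small decoder. The sketch below is an assumption-laden illustration: it supposes that bits $\mathrm{R}_{25}$ through $\mathrm{R}_{0}$ are packed into one integer with $\mathrm{R}_{0}$ as bit 0, and the function and key names are hypothetical. The separate jump address field ($\mathrm{BR}_{11}-\mathrm{BR}_{0}$) is not included.

```python
from typing import Dict


def decode_control_fields(word: int) -> Dict[str, int]:
    """Split a packed R25..R0 control word into the fields of Figure 18."""
    return {
        "or_branch_control": (word >> 22) & 0xF,     # R25-R22: Am29803A test instruction
        "ir_hold": (word >> 21) & 0x1,               # R21: 0 = LOAD, 1 = HOLD
        "mux_select": (word >> 17) & 0xF,            # R20-R17: TEST 0 .. TEST 15
        "polarity_true": (word >> 16) & 0x1,         # R16: 0 = complement, 1 = true test
        "next_address_control": (word >> 12) & 0xF,  # R15-R12: next address instruction
        "counter_value": word & 0xFFF,               # R11-R0: counter value
    }
```

For example, a word with $\mathrm{R}_{20}-\mathrm{R}_{17}$ = 0010 and $\mathrm{R}_{16}$ = 1 decodes to a true-polarity test of TEST 2.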


| Function | $\mathrm{I}_{3}$ | $\mathrm{I}_{2}$ | $\mathrm{I}_{1}$ | $\mathrm{I}_{0}$ | $\mathrm{OR}_{3}$ | $\mathrm{OR}_{2}$ | $\mathrm{OR}_{1}$ | $\mathrm{OR}_{0}$ |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| No Test | L | L | L | L | L | L | L | L |
| Test $\mathrm{T}_{0}$ | L | L | L | H | L | L | L | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{1}$ | L | L | H | L | L | L | L | $\mathrm{T}_{1}$ |
| Test $\mathrm{T}_{0}$ & $\mathrm{T}_{1}$ | L | L | H | H | L | L | $\mathrm{T}_{1}$ | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{2}$ | L | H | L | L | L | L | L | $\mathrm{T}_{2}$ |
| Test $\mathrm{T}_{0}$ & $\mathrm{T}_{2}$ | L | H | L | H | L | L | $\mathrm{T}_{2}$ | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{1}$ & $\mathrm{T}_{2}$ | L | H | H | L | L | L | $\mathrm{T}_{2}$ | $\mathrm{T}_{1}$ |
| Test $\mathrm{T}_{0}$, $\mathrm{T}_{1}$ & $\mathrm{T}_{2}$ | L | H | H | H | L | $\mathrm{T}_{2}$ | $\mathrm{T}_{1}$ | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{3}$ | H | L | L | L | L | L | L | $\mathrm{T}_{3}$ |
| Test $\mathrm{T}_{0}$ & $\mathrm{T}_{3}$ | H | L | L | H | L | L | $\mathrm{T}_{3}$ | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{1}$ & $\mathrm{T}_{3}$ | H | L | H | L | L | L | $\mathrm{T}_{3}$ | $\mathrm{T}_{1}$ |
| Test $\mathrm{T}_{0}$, $\mathrm{T}_{1}$ & $\mathrm{T}_{3}$ | H | L | H | H | L | $\mathrm{T}_{3}$ | $\mathrm{T}_{1}$ | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{2}$ & $\mathrm{T}_{3}$ | H | H | L | L | L | L | $\mathrm{T}_{3}$ | $\mathrm{T}_{2}$ |
| Test $\mathrm{T}_{0}$, $\mathrm{T}_{2}$ & $\mathrm{T}_{3}$ | H | H | L | H | L | $\mathrm{T}_{3}$ | $\mathrm{T}_{2}$ | $\mathrm{T}_{0}$ |
| Test $\mathrm{T}_{1}$, $\mathrm{T}_{2}$ & $\mathrm{T}_{3}$ | H | H | H | L | L | $\mathrm{T}_{3}$ | $\mathrm{T}_{2}$ | $\mathrm{T}_{1}$ |
| Test $\mathrm{T}_{0}$, $\mathrm{T}_{1}$, $\mathrm{T}_{2}$ & $\mathrm{T}_{3}$ | H | H | H | H | $\mathrm{T}_{3}$ | $\mathrm{T}_{2}$ | $\mathrm{T}_{1}$ | $\mathrm{T}_{0}$ |

Note: An OR-output entry of $\mathrm{T}_{n}$ means that the output follows the HIGH/LOW level of that TEST-input; L is a constant LOW. TEST-inputs not selected by the instruction are don't-cares (X).

A 16-input condition code multiplexer function is provided by using two Am2922s as U7 and U8. These devices allow one of sixteen inputs to be tested and the polarity of the test can also be determined. The pipeline register consists of U9, U10, U11, U12 and U13. These devices are edge-triggered D-type registers and have been selected to provide unique functions as required depending on their bit positions in the pipeline register. An Am74S175 was selected for U9 because both a true and complement output were desired to provide control to the condition code multiplexer three-state enables. An Am74S174 register was selected as U10 because it provides a clear input for initializing the Am2910 microprogram sequencer. Three Am2918s were selected for U11, U12 and U13 because they have a three-state output that can be used to provide the branch address field to the D inputs of the Am2910 and they also have a set of outputs that can be used to provide other control signals via this field when it does not contain a branch address. No specific devices are shown for the microprogram memory as the user should select the desired width and depth depending on his design.

## ANOTHER DESIGN EXAMPLE

The Am2909, Am2910, Am2911, Am29811A and Am29803A have been designed to operate in the microprogram sequencing section of any digital state machine. Typically, the examples shown are for performing the computer control unit function of a typical minicomputer class machine. The design engineer should not limit his thinking for the use of these devices simply to that of microprogram sequencing in a computer control unit. These devices can be successfully used in other areas of design such as memory control, DMA control, interrupt control and special purpose microprogrammed machine architectures. In order to provide an example of a design using these devices in something other than a typical computer control unit, a microprogrammed CRT controller is described in the following.
In order to provide some basis for the design of a CRT controller, the requirements of this controller must be spelled out. These are given as follows.
A) Character size: $5 \times 7$ dot matrix. The character field will be 7 dots by 10 horizontal lines, thereby providing ample space for the $5 \times 7$ character and the intervening space between characters and lines of characters.
B) 80 characters per line. A standard 80 character per line display will be utilized and there will be 18 character periods allowed for horizontal retrace time.
C) 24 lines of characters per frame. This provides a total of 240 visible lines per frame ( 24 lines of characters by 10 horizontal lines per character). There are a total of 24 lines provided for vertical retrace bringing the total number of lines per frame to 264.
D) Refresh rate 60 frames per second. Therefore, the horizontal line rate will be $264 \times 60=15,840 \mathrm{~Hz}$. As there are a total of $80+18=98$ character periods in a line, the character rate will be $98 \times 15.84=1,552.32 \mathrm{KHz}$, and the dot rate will be $7 \times 1.55232=10.86624 \mathrm{MHz}$. (Note: No interlace is used.)
E) It is assumed that there is a 2 K word deep $\times 8$-bit wide character RAM available to the host computer in which it can write the ASCII equivalent of the characters to be displayed. If scrolling is to be used, the host computer must also write the first visible character's address divided by $16_{10}$ into the Am25LS374 "First Address Register".
F) This CRT controller must generate an 11-bit character address that is used by the 2K word deep character RAM. It must also generate the required video enable signals and the horizontal and vertical blanking signals.
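
The rates in requirement D follow directly from the format constants in A through C. The short Python sketch below simply repeats that arithmetic so the figures can be rechecked for another format; the function name `crt_timing` and the dictionary keys are hypothetical.

```python
def crt_timing(chars_visible: int = 80, chars_retrace: int = 18,
               rows: int = 24, lines_per_row: int = 10,
               lines_retrace: int = 24, frames_per_second: int = 60,
               dots_per_char: int = 7) -> dict:
    """Derive line, character and dot rates (in Hz) from the display format."""
    lines_per_frame = rows * lines_per_row + lines_retrace
    line_rate_hz = lines_per_frame * frames_per_second
    char_periods_per_line = chars_visible + chars_retrace
    char_rate_hz = char_periods_per_line * line_rate_hz
    dot_rate_hz = dots_per_char * char_rate_hz
    return {
        "lines_per_frame": lines_per_frame,
        "line_rate_hz": line_rate_hz,
        "char_periods_per_line": char_periods_per_line,
        "char_rate_hz": char_rate_hz,
        "dot_rate_hz": dot_rate_hz,
    }
```

Called with its defaults, the function reproduces the 264 lines per frame, the 15,840 Hz line rate, the 1,552.32 KHz character rate and the 10.86624 MHz dot rate given above.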

## Principle of Operation

A detailed block diagram of the CRT controller is shown in Figure 21. The block diagram shows an interface to an SBC-80/10 data bus, address bus and control bus. The outputs of the CRT controller are connected to a CRT monitor on the block diagram. Otherwise the block diagram shows a straightforward use of the Am2910 and three Am2911s to implement the CRT control function using microprogrammed techniques. The SBC-80/10 was selected for this example since it is well known.

A logic diagram of the CRT controller is shown in Figure 22. Three Am29775 512-word x 8-bit registered PROMs are used to contain the 23-bit wide microprogram. While only a minimum number of words are used in the design as shown, many additional words can be used to add various options (as described later). The address for these Am29775 registered PROMs is provided by an Am2910 microprogram sequencer. Three Am2911 sequencers are used to generate the character address for the character RAM. The least significant Am2911 sequencer is connected as a divide-by-16 counter. This RAM address is compared with the desired last character address $(80 \times 24=1920)$ value using an Am25LS2521 8-bit equal-to detector. When the last address is detected, it can be sensed at the condition code multiplexer (Am25LS153) that is used to select the condition code for the Am2910 sequencer.
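
The way the 11-bit character address is divided between the three Am2911s, and how the last-address comparison operates on the more significant bits only, can be illustrated as follows. This is a behavioral sketch under stated assumptions: the names `split_address` and `at_last_address` are hypothetical, and the gating of the comparator during the tenth horizontal line is not modeled.

```python
LAST_ADDRESS = 80 * 24  # 1920 character positions per frame


def split_address(addr: int) -> tuple:
    """Return (more significant bits, low-order 4 bits) of a character address.

    The low-order 4 bits correspond to the least significant Am2911, which
    runs as a divide-by-16 counter; the remaining bits come from the other two.
    """
    return addr >> 4, addr & 0xF


def at_last_address(addr: int) -> bool:
    """Compare only the more significant bits against 1920 / 16 = 120."""
    high, _ = split_address(addr)
    return high == LAST_ADDRESS // 16
```

Because the low-order four bits are ignored, the comparison is true for any address whose upper bits equal 120, which is why the format works in units of 16 characters.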

The data derived from the 2 K word character RAM is decoded by a character generator (6061) in this design and the character output is parallel loaded into an Am25LS23 shift register. This shift register is used to provide the video signal from its $Q_{0}$ output to eventually drive the display via an Am74S240 buffer. The diagram of Figure 22 depicts an oscillator input source to supply the dot frequency. In this design, a 10.86624 MHz oscillator should be connected to this oscillator input point. This oscillator input signal is used to clock the shift register containing the individual dot bits (dot-on or dot-off) and also drives an Am25LS169 counter which divides this frequency by 7 to generate the character rate clock. This character rate clock is used throughout the controller to provide a timing signal for the state machine design.

An Am25LS168 decade counter is used to generate the line inputs for the character generator and to count 10 horizontal lines per character space. This counter is clocked by the horizontal blanking signal ( HB ) and its $\overline{\mathrm{RCO}}$ output is used as one of the condition code multiplexer inputs. The $\overline{\mathrm{RCO}}$ output can be tested to determine when 10 counts have been executed by the counter and it is also used to enable the last address comparator during the 10th horizontal line time.

When the host computer accesses the character RAM, the HOST-ACCESS line is pulled LOW. This removes the Am2911 outputs from the character RAM address bus. When this access occurs, improper data may be present at the shift register inputs. Thus, the character generator PROM output is disabled by the HOST-ACCESS signal during this time.

When power is applied to this CRT controller or whenever it is reset, the RESET line is driven LOW. This signal is inverted through an Am25LS240 and then disables a part of the pipeline register outputs as well as enabling one half of an Am25LS241. This Am25LS241 inserts LOWs onto the instruction (I) inputs of the Am2910 sequencer. Then, the next character rate clock will force the microprogram address outputs to zero and the microprogram for the CRT controller as shown in Figure 23 will be executed starting at address zero.


Figure 20. Computer Control Unit with Am2910.



Figure 21. CRT Controller Block Diagram.

## The Microprogram for the CRT Controller

Table 6 shows a complete description of the microprogrammed CRT controller microcode. Execution of these microinstructions is controlled by the Am2910 sequencer.
As can be seen in Table 6, several techniques were used in this short microprogram to provide the different counting requirements of this CRT controller. Although only one format ( 80 characters per line, 24 lines per frame) was shown here, the designer can easily configure his own format by simply changing some constants in the microprogram. As an exercise, the reader is encouraged to find a means to program the CRT controller for different formats. The host computer software could configure the controller format by using an additional register similar to the "First Address Register". This will be discussed in an appendix at the end of this chapter.
A complete wiring diagram for the microprogrammed CRT controller is shown in Figure 24. This can be used directly with the interface shown in Appendix A such that the CRT controller can be connected directly to an Am9080A based microprocessor system. Appendix A also depicts the use of a 2K word x 8-bit character RAM as described previously.

## CRT Controller Timing Considerations

As was discussed earlier, the character clock frequency for the CRT controller is $1,552.32 \mathrm{KHz}$. Thus, it is desirable to calculate the longest paths of the design to ensure that none exceeds this clock period of 644.1 ns. The timing diagrams of Figure 25 depict a number of different paths with the associated propagation delay calculations.

When all of the timing diagrams of Figure 25 are examined, it will be found that only three show propagation delay times of over 200 ns typical. Of these, the worst case is 318 ns as shown in Figure 25(i). Since the requirement of the design is to ensure that none exceeds 644.1 ns, we have more than a 2 to 1 margin in the design based on the typicals. Thus, we can see that the design will operate properly even over the full military temperature range and power supply variations based on this analysis.
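
The margin quoted above can be rechecked with a few lines of Python. The sketch assumes only the two figures given in the text (the 1,552.32 KHz character clock and the 318 ns worst-case typical path); the function names are hypothetical.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Clock period in nanoseconds for a given clock frequency in Hz."""
    return 1e9 / frequency_hz


def timing_margin(frequency_hz: float, worst_path_ns: float) -> float:
    """Ratio of the available clock period to the longest path delay."""
    return clock_period_ns(frequency_hz) / worst_path_ns


period = clock_period_ns(1_552_320)     # character clock period, roughly 644 ns
margin = timing_margin(1_552_320, 318)  # period divided by the 318 ns typical path
```

A ratio above 2 confirms the "more than 2 to 1" statement; the same two functions can be reused if the display format, and therefore the character clock, is changed.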


Figure 22. CRT Controller.


Figure 24. CRT Controller.


| ADDR (Hex) | Label | Am2910 I | Am2910 $\overline{\text{CCEN}}$ | MUX | Am2911 $\mathrm{S}_{1}$ | Am2911 $\mathrm{S}_{0}$ | Am2911 $\overline{\mathrm{FE}}$ | Am2911 $\overline{\text{ZERO}}$H | Am2911 $\overline{\text{ZERO}}$L | Am2911 $\mathrm{C}_{n}$ | HB | VB | NUM | Comments |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :--- |
| 0 | INIT | CJV | L | 3 | H | H | L | H | L | L | H | L | X | ;Load first address from Register to 2911's file |
| 1 |  | LDCT | X | X | L | L | H | H | L | L | H | L | $23_{10}$ | ;Load 2910's counter with number of rows/frame - 1 |
| 2 | MAIN | CONT | X | X | H | L | H | H | L | H | H | L | X | ;Address supplied by 2911's file |
| 3 |  | CJP | L | 1 | L | L | H | H | H | H | L | L | \$ | ;One row: $5 \times 16=80$ characters (addresses 3 through 7) |
| 4 |  | CJP | L | 1 | L | L | H | H | H | H | L | L | \$ |  |
| 5 |  | CJP | L | 1 | L | L | H | H | H | H | L | L | \$ |  |
| 6 |  | CJP | L | 1 | L | L | H | H | H | H | L | L | \$ |  |
| 7 |  | CJP | L | 1 | L | L | H | H | H | H | L | L | \$ |  |
| 8 |  | CJS | L | 0 | L | L | H | H | H | H | H | L | TENTH | ;If tenth (last) line of a row, jump to "TENTH" subroutine |
| 9 |  | CJS | L | 2 | L | L | H | H | H | H | H | L | LASTA | ;If last character, jump to "LASTA" subroutine |
| A |  | CJP | L | 1 | L | L | H | H | H | H | H | L | \$ | ;Wait until horizontal invisible counts done |
| B |  | CJP | H | X |  | L | H | H | X | X | H | L | MAIN | ;Then do the Main routine again |
| C | TENTH | RPCT | X | X | L | L | L | H | H | H | H | L | GOBACK | ;Push next addr on 2911's file; jump to "GOBACK" if not End of Frame |
| D |  | CJV | L | 3 | H | H | L | H | L | X | H | H | X | ;Load 2911's file from First Address Register |
| E |  | LDCT | X | X | L | L | H | H | X | X | H | H | $146_{10}$ | ;Load 2910's counter with number of invisible characters during Vert. retrace divided by 16, minus 1 |
| F |  | PUSH | L | 3 | L | L | H | H | H | H | H | H | X | ;Push next PC to 2910's file for double-nested loop |
| 10 |  | CJP | L | 1 | L | L | H | H | H | H | H | H | \$ | ;Wait for least significant 2911 to count 16 |
| 11 |  | RFCT | X | X | L | L | H | H | H | H | H | H | X | ;Decrement 2910's counter and jump one line back if $\neq 0$ |
| 12 |  | LDCT | X | X | L | L | H | H | H | H | H | H | $23_{10}$ | ;Load 2910's counter again with number of rows/frame - 1 |
| 13 |  | CRTN | H | X |  | L | H | H | H | H | H | H | X | ;Return from subroutine |
| 14 | GOBACK | CRTN | H | X | L | L | H | H | H | H | H | L | X | ;Return |
| 15 | LASTA | CRTN | H | X | X | X | L | L | H | H | H | L | X | ;Load zero to 2911's file and return |

Figure 23. Microprogram for the CRT Controller.
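
The double-nested vertical retrace loop at addresses E through 11 can be modeled roughly as follows. This sketch is an assumption, not a cycle-exact simulation: it treats each pass of the outer loop as 16 character periods (one wrap of the least significant Am2911) and models RFCT as "repeat while the counter is not zero, then fall through". The function name and parameters are hypothetical.

```python
def vertical_retrace_periods(counter_load: int = 146, inner_count: int = 16) -> int:
    """Count character periods spent in the vertical retrace loop.

    counter_load: value loaded into the Am2910 counter at address E.
    inner_count:  character periods per pass (address 10 dwell plus address 11).
    """
    counter = counter_load
    periods = 0
    while True:
        periods += inner_count  # one pass through addresses 10 and 11
        if counter == 0:        # RFCT: counter is zero, so leave the loop
            break
        counter -= 1            # RFCT: decrement and repeat the loop
    return periods
```

With the counter loaded with $146_{10}$ the loop runs 147 passes, giving the same number of character periods as 24 retrace lines of 98 character periods each.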

TABLE 6. DESCRIPTION OF THE MICROPROGRAM FOR THE CRT CONTROLLER.

| Microprogram Address | Low Order Am2911 | High Order Am2911s | Am2910 | Comments |
| :---: | :--- | :--- | :--- | :--- |
| 0 | Since $\overline{\text{ZERO}}$ is LOW, its output will be LOW. The $\mathrm{C}_{n}$ input (from the Pipeline Register) is LOW so that the microprogram incrementer will not increment. | Both $\mathrm{S}_{1}$ and $\mathrm{S}_{0}$ are HIGH so that the D inputs will be routed to the Y outputs. These inputs will come from the First Address Register (the Am2910 VECT is LOW). $\mathrm{C}_{n}$ is LOW (see left column); therefore the microprogram counter will not increment. $\overline{\mathrm{FE}}$ is LOW (and PUP is always HIGH), causing the present output to be pushed onto the stack. The character address is already the "First Character Address". | The CJV instruction is selected. Therefore, the VECT output will be LOW, enabling the "First Address Register" onto the internal 8-bit bus. $\overline{\text{CCEN}}$ is LOW; the MUX is selecting a constant HIGH, and the sequencer will address the next consecutive microprogram address (word 1). | This instruction pushes the more significant bits of the "First Character Address" onto the Am2911's file, and continues to the next microinstruction. |
| 1 | $\overline{\text{ZERO}}$ and $\mathrm{C}_{n}$ are still LOW, so there is no change in this device. | $\mathrm{S}_{1}$ and $\mathrm{S}_{0}$ are LOW; thus, the Y outputs will be the current PC (the same as the Y outputs were in the previous step). $\mathrm{C}_{n}$ is still LOW; therefore no change will occur in the PC. | LDCT is selected and the number of character-rows per frame minus 1 ($23_{10}$) is loaded into the Am2910 register/counter. The sequencer addresses the next microinstruction. |  |
| 2 "MAIN" | Maintaining $\overline{\text{ZERO}}$ LOW assures the proper starting address. $\mathrm{C}_{n}$ is HIGH; therefore, the internal PC will be incremented. | With $\mathrm{S}_{1}=$ HIGH, $\mathrm{S}_{0}=$ LOW and $\overline{\mathrm{FE}}=$ HIGH, the Am2911 will refer to its internal file (the starting address of this particular character-row) without popping. | The Am2910 will generate the next microprogram address. | This is the starting location for the main loop. |
| 3 | This Am2911 now counts up using its PC incrementer. At the final count (moving from $\mathrm{F}_{16}$ to 0) its $\mathrm{C}_{n+4}$ output will be HIGH. | Initially these two Am2911s will not change their Y outputs since their $\mathrm{C}_{n}$ input is LOW. However, when the $\mathrm{C}_{n}$ input goes HIGH, the internal PC will increment. | With the MUX selecting the $\mathrm{C}_{n+4}$ output from the least significant Am2911 slice, the $\overline{\mathrm{CC}}$ input to the Am2910 sequencer will be LOW until the Am2911 counts 16. $\overline{\mathrm{CC}}=$ LOW will cause the next microprogram address to be the pipeline register contents; this is also the current microprogram address (word 3). When $\mathrm{C}_{n+4}$ goes HIGH, $\overline{\mathrm{CC}}$ will go HIGH and, together with $\overline{\text{CCEN}}=$ LOW, will force the Am2910 to address the next consecutive microprogram address (4). | This microstep will be executed 16 times. (Note that $80=5 \times 16$.) |
| 4 through 7 | Same as 3. | Same as 3. | Same as 3, except that at each address, the current microprogram address is written. | The microprogram itself is used as a counter in this application. Since the count is only 5, the microprogram is relatively short versus the memory's depth and this is a convenient means to economize on chip count. |
| 8 | Continues to count (note that it enters this line with an output of zero). | Since $\mathrm{C}_{n}$ is LOW (see left column) no change occurs in these devices. Note that the Y outputs contain the more significant bits of the address of the first character of the next character row. | The MUX selects the Am25LS168 ten-line counter's $\overline{\mathrm{RCO}}$ as the condition code input to the Am2910 ($\overline{\mathrm{CC}}$). If the line count is less than 10, $\overline{\mathrm{CC}}$ will be HIGH and the next microinstruction will be addressed. If the tenth line of a character row is executed, $\overline{\mathrm{CC}}$ will be LOW and a JUMP-TO-SUBROUTINE to an address supplied by the pipeline register ("TENTH") will be executed. | We are now at the end of a TV line. Therefore, the Horizontal Blanking Signal (HB) is HIGH. The least significant Am2911 slice now counts the invisible characters during the horizontal retrace. |
| 9 | Continues to count through the internal PC incrementer. | No change. | The MUX now selects the Last Address Comparator output for $\overline{\mathrm{CC}}$. If the current more significant bits of the character address coincide with the last address $+1$ ($1920_{10}/16$), a subroutine call will be performed to "LASTA". Otherwise, the microprogram will continue consecutively. | Note that 80 characters/row and 24 rows/frame requires a $1920_{10}$ word memory. When the last memory location ($1920_{10}$) is read out, the scan will begin at 0. |
| A | Continues to count. At count 15, $\mathrm{C}_{n+4}$ goes HIGH. | No change until $\mathrm{C}_{n}$ goes HIGH, then count. | Same as at address 3. | Waiting for the least significant Am2911 to count to 15. This microstep will be executed as many times as necessary to accomplish this. |
| B | It doesn't matter what this device does at this microstep because at the next microstep it will receive LOW on its $\overline{\text{ZERO}}$ input. | No change. | Unconditionally ($\overline{\text{CCEN}}=$ HIGH) steers the microprogram to the address supplied by the pipeline register ("MAIN" = 2). | Performing a JUMP to the beginning of the main loop (address 2). |
| C "TENTH" | Continues to count. | No change. | If the internal counter is equal to zero, it means that 24 character rows were already displayed and we are at the bottom of the CRT display. A vertical retrace period is needed and the microprogram will continue sequentially. If the counter is not yet zero, we do not need to execute the vertical retrace routine and the next address will be supplied by the pipeline register ("GOBACK" = $14_{16}$) while the internal counter is decremented. | The decision whether the bottom of the CRT (End of Frame) is reached or not is made internally in the Am2910, using its counter. |
| D | $\overline{\mathrm{ZERO}}=$ LOW; therefore, output $Y=0$. This is necessary to ensure that $\mathrm{C}_{\mathrm{n}+4}$ is LOW. | Same as at address 0. | Same as at address 0. | As we are at the End of Frame, the contents of the "First Address Register" (enabled by the Am2910's VECT output) are pushed onto the Am2911's file. Note that the Vertical Blanking signal (VB) goes HIGH. |
| E | Same as at address B. | No change. | The internal counter is loaded with $146_{10}$, supplied by the pipeline register. The next consecutive microstep is addressed. | $\left(146_{10}+1\right) \times 16_{10}=2352_{10}$ equals the number of character periods during vertical retrace. Loading $2352_{10}$ directly into the Am2910's counter would require 12 bits. Using this scheme, we reduce the microprogram width. |
| F | Counts. | No change. | With $\overline{\text { CCEN }}=$ LOW and $\overline{C C}=$ HIGH (supplied from a constant HIGH by the MUX), the next address $\left(10_{16}\right)$ will be pushed onto the Am2910 file, the counter will not be affected, and the next consecutive microstep will be addressed. | This is a preparatory step for the two-step "Vertical Retrace" double-nested loop. |
| $10_{H}$ | Counts. When the final count is reached, $\mathrm{C}_{\mathrm{n}+4}=\mathrm{HIGH}$. | No change with $\mathrm{C}_{\mathrm{n}}=$ LOW; increments with $\mathrm{C}_{\mathrm{n}}=\mathrm{HIGH}$. This has no practical effect, as the HB signal is HIGH and, at the beginning of the next visible line, the correct address will be fetched from the file (address 2). | The MUX supplies the $C_{n+4}$ output of the less significant Am2911 slice to the Am2910 $\overline{\mathrm{CC}}$ input. While this signal is LOW, the Am2910 will select the pipeline register as the source of the next microinstruction address. Because the current address $\left(10_{H}\right)$ is written there, this instruction will be executed until $\overline{\mathrm{CC}}$ goes HIGH. Then the next consecutive instruction will be selected through the Am2910 internal PC. | Again, this is a possible way to dwell on a certain microstep while waiting for a condition to change its status (like addresses 3 through 7). This is the internal loop of a double-nested loop system. |
| $11_{H}$ | Counts. | No change. | If the final count has been reached, the next microinstruction will be addressed and the internal stack will be popped (adjusted). Otherwise, the next microinstruction address will be the one residing on the top of the stack (which is $10_{16}$). | This is the external loop of the double-nested loop system, which counts the vertical retrace interval. By adding a single microinstruction, the chip count was reduced. |
| $12_{H}$ | Counts | No change | Same as at address 1 | Reinitializes the Am2910 internal counter with the number of character rows per frame. |
| $13_{\mathrm{H}}$ | Counts | No change | Unconditional return from subroutine $(\overline{\mathrm{CCEN}}=\mathrm{HIGH})$ | End of "TENTH" subroutine at End of Frame (with vertical retrace). |
| $\begin{gathered} 14 \mathrm{H} \\ \text { "GOBACK" } \end{gathered}$ | Counts | No change | Unconditional return from subroutine | End of "TENTH" subroutine without vertical retrace |
| $\begin{gathered} 15 \mathrm{H} \\ \text { "LASTA" } \end{gathered}$ | Counts | Pushes zero into file | Unconditional return from subroutine | A one-line subroutine to reinitialize the character address to zero |
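
The count arithmetic behind the double-nested loop at addresses E through $11_{H}$ can be checked with a short sketch. The Python fragment below is only an illustration and is not part of the original design files; the function names are hypothetical. It assumes, as the table states, that each pass of the inner loop lasts one full cycle of the 4-bit less significant Am2911 slice (16 character periods) and that loading the Am2910 counter with N gives N + 1 passes of the outer loop.

```python
def retrace_character_periods(counter_load: int, inner_periods: int = 16) -> int:
    """Character periods spent in the double-nested loop.

    The outer loop runs counter_load + 1 times, and each pass waits
    inner_periods character clocks in the inner loop.
    """
    return (counter_load + 1) * inner_periods


def bits_needed(value: int) -> int:
    """Minimum number of bits needed to hold value as an unsigned integer."""
    return max(1, value.bit_length())


total = retrace_character_periods(146)
print(total)               # 2352
print(bits_needed(total))  # 12
print(bits_needed(146))    # 8
```

The outer-loop constant 146 fits in an 8-bit pipeline field, whereas the full count would not, which is the saving in microprogram width that the table refers to.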



Figure 25.
c)

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| 29775 | CP to D | 15 | 15 | - |
| 2910 | I to Y | 40 | - | - |
| 2910 | CCEN to Y | - | 23 | - |
| 2910 | CP to $Y$ | - | - | 54 |
| 29775 | A (ts) | 40 | 40 | 40 |
| TOTAL-ns |  | 95 | 78 | 94 |



d)

| DEVICE NO. | DEVICE PATH | PATH 1 |
| :--- | :--- | :---: |
| 29775 | CP to D | 15 |
| 25LS153 | A, B to $Y$ | 19 |
| 2910 | CC to $Y$ | 21 |
| 29775 | A (ts) | 40 |
| TOTAL-ns |  | 95 |



Figure 25. (Cont.)
e)

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 |
| :--- | :--- | :---: | :---: |
| 29775 | CP to D | 15 | 15 |
| 2910 | I to PL, VECT | 27 | 27 |
| 29775 | $\mathrm{E}_{1}$ to D | - | 15 |
| 25LS374 | OE to Y | 14 | - |
| 2910 | PC (ts) | - | 34 |
| 2911 | D (ts) | 17 | - |
| TOTAL-ns |  | 73 | 91 |



f)

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| 29775 | CP to $D$ | 15 | 15 | 15 |
| 2911 | ZERO to $\mathrm{C}_{n+4}$ | - | - | 30 |
| 2911 | $\mathrm{C}_{n}$ to $\mathrm{C}_{\mathrm{n}+4}$ | - | 9 | - |
| 25LS168 | CP to $\overline{\mathrm{RCO}}$ | 19 | - | - |
| 25LS153 | D to Y | 20 | 20 | 20 |
| 2910 | CC to $Y$ | 21 | 21 | 21 |
| 29775 | A (ts) | 40 | 40 | 40 |
| TOTAL-ns |  | 115 | 105 | 126 |




Figure 25. (Cont.)
g)

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| 29775 | CP to $D$ | 15 | 15 | - |
| 2911 | $\mathrm{S}_{0}, \mathrm{S}_{1}$ to Y | - | 19 | - |
| 2911 | CP to Y ($\mathrm{S}_{1} \mathrm{S}_{0}=\mathrm{HL}$) | - | - | 54 |
| 25LS168 | CP to $\overline{\mathrm{RCO}}$ | 19 | - | - |
| 25LS2521 | A to $\mathrm{E}_{0}$ | - | 9 | 9 |
| 25LS2521 | $\mathrm{E}_{1}$ to $\mathrm{E}_{0}$ | 6 | - | - |
| 25LS153 | D to Y | 20 | 20 | 20 |
| 2910 | CC to Y | 21 | 21 | 21 |
| 29775 | A (ts) | 40 | 40 | 40 |
| TOTAL-ns |  | 121 | 124 | 144 |


h)

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 |
| :--- | :--- | :---: | :---: |
| 2911 | CP to $\mathrm{Y}\left(\mathrm{S}_{1} \mathrm{~S}_{0}=\mathrm{HL}\right)$ | 39 | - |
| 2911 | CP to $\mathrm{C}_{\mathrm{n}+4}\left(\mathrm{~S}_{1} \mathrm{~S}_{0}=\mathrm{HL}\right)$ | - | 54 |
| 2911 | $\mathrm{C}_{\mathrm{n}}$ (ts) | - | 15 |
| 9114 | A to D | 150 | - |
| 6061 | A to OUT | 70 | - |
| 25S23 | D (ts) | 23 | - |
| TOTAL-ns |  | 282 | 69 |





Figure 25. (Cont.)
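
The totals in the Figure 25 tables are simple sums of the device delays along each path, and the longest path in a table bounds the cycle time for that part of the circuit. As a minimal sketch, the Python below repeats the bookkeeping for the Figure 25 g) table. The delay values are copied from that table; the dictionary and function names are hypothetical.

```python
# Delays in ns along each path, copied from the Figure 25 g) table.
FIG_25G_PATHS = {
    "PATH 1": [15, 19, 6, 20, 21, 40],
    "PATH 2": [15, 19, 9, 20, 21, 40],
    "PATH 3": [54, 9, 20, 21, 40],
}


def path_totals(paths):
    """Sum the device delays along every path."""
    return {name: sum(delays) for name, delays in paths.items()}


def critical_path(paths):
    """Return (name, total) of the slowest path."""
    totals = path_totals(paths)
    worst = max(totals, key=totals.get)
    return worst, totals[worst]


print(path_totals(FIG_25G_PATHS))    # {'PATH 1': 121, 'PATH 2': 124, 'PATH 3': 144}
print(critical_path(FIG_25G_PATHS))  # ('PATH 3', 144)
```

The same two functions can be applied to the other tables by substituting their delay columns.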

## SUMMARY

The Am2910 provides a powerful solution to the microprogram memory sequence control problem. It is a 12-bit-wide microprogram sequencer with a fixed instruction set. The Am2909, Am2911, Am29811A, and Am29803A provide another solution to the microprogram sequencing problem: these devices are bit-slice oriented and offer more potential flexibility. All of these devices are particularly well suited to high-performance computer control units and structured state machine designs that use overlapped fetch of the next microinstruction, also referred to as instruction-data-based microprogram architecture.

These Am2900 family microprogram control devices offer the highest performance LSI solution to the problem of microprogram control. They provide a number of conditional-branch source addresses as well as conditional jump-to-subroutine and conditional-return instructions. In addition, several techniques for timed and untimed looping are provided, so that loops of one to several microinstructions can be executed. All of the devices described in this chapter are competitively priced and currently available. All of them are available with specifications guaranteed over the full commercial temperature range and power supply tolerance as well as the full military temperature range and power supply tolerance, and all undergo $100 \%$ reliability assurance testing in compliance with MIL-STD-883.

## APPENDIX A

Figure A1 shows the logic diagram of an interface circuit used to connect the microprogrammed CRT controller to any Am9080A-type processor. Sixteen address lines, eight data lines, a memory-read, a memory-write, and an I/O-write signal are assumed, all with active LOW polarity.
An Am25LS2521 8-bit comparator is used to decode the addresses of the 2 K by 8 character memory. This memory can be placed anywhere in the memory space in increments of 2 K by using 5 DIP switches. The comparator is enabled by the presence of either the $\overline{\mathrm{MMR}}$ or the $\overline{\mathrm{MMW}}$ signal. The output of this comparator is the HOST ACCESS signal.
The HOST ACCESS signal enables the two Am25LS240 buffers which connect the processor address bus to the character memory address bus. It also enables one half of an Am25LS241 buffer, which transfers the active LOW $\overline{\mathrm{MMR}}$ or $\overline{\mathrm{MMW}}$ signal to the proper data buffer enable (Am25LS240s) and, in the case of a memory write operation, to the $\overline{\mathrm{WE}}$ pins of the four Am9114 memories. The $\overline{\mathrm{CS}}$ inputs of two of these memories are driven by $\mathrm{A}_{10}$ while the $\overline{\mathrm{CS}}$ inputs of the other two are driven by $\overline{\mathrm{A}_{10}}$, thus forming a 2 K by 8 memory space.
A second Am25LS2521 8-bit comparator is enabled by the $\overline{\mathrm{I} / \mathrm{OW}}$ control line. If $n$ matches the settings of the DIP switches at the $B$ inputs of the comparator, an OUT $n$ instruction will write the data into the Am25LS374 "First Address Register".
Figure A2 shows the complete wiring diagram of this interface circuit.
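
The memory decoding described above can be sketched in a few lines. This is a minimal Python model, not the circuit itself: the function names are hypothetical, it assumes that the five DIP switches are compared with address bits $\mathrm{A}_{15}$ through $\mathrm{A}_{11}$, and it arbitrarily calls the pair of Am9114s selected when $\mathrm{A}_{10}=0$ "bank 0". Active LOW control lines are passed as 0 when asserted.

```python
def host_access(address: int, base_switches: int, mmr_n: int, mmw_n: int) -> bool:
    """True when the comparator would flag a host access to character memory.

    address       16-bit processor address
    base_switches 5-bit DIP-switch setting, compared with A15..A11 (assumed)
    mmr_n, mmw_n  active LOW memory-read / memory-write strobes
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if not 0 <= base_switches <= 0b11111:
        raise ValueError("base_switches must fit in 5 bits")
    enabled = mmr_n == 0 or mmw_n == 0
    return enabled and (address >> 11) == base_switches


def memory_bank(address: int) -> int:
    """Which pair of 1K x 4 memories A10 selects (naming is an assumption)."""
    return (address >> 10) & 1


def character_offset(address: int) -> int:
    """Offset of the addressed character inside the 2K block."""
    return address & 0x7FF


# Character memory placed in the top 2K of the address space.
print(host_access(0xF800, 0b11111, mmr_n=0, mmw_n=1))  # True
print(host_access(0xF7FF, 0b11111, mmr_n=0, mmw_n=1))  # False
print(memory_bank(0xFC00), character_offset(0xFC00))   # 1 1024
```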


Figure A1. CRT Controller.


Figure A2. CRT Controller.


## APPENDIX B

## General

A software emulation of the CRT controller was written in BASIC-E and run on the System 29 support processor. Figure B1 is a printout of this program.

## Notations

For reference purposes, each clock pulse (CP) in the program is numbered. The clocks are character-rate clocks. A suffix "10" signifies that the variable belongs to the Am2910 (e.g., R10 = the contents of the Am2910 Register/Counter), and similarly a suffix "11" signifies the Am2911-dependent variables (e.g., Y11 = the Y outputs of the two more significant Am2911s).
The normal function names are generally used, though for the active LOW functions the bar is deleted for simplicity. A 0 always signifies LOW and a 1 signifies HIGH. Other abbreviations used in the program:

```
  MA   = Microprogram Address (Y output of the Am2910)
  CA   = Character Address
  PC   = Program Counter (internal)
  R    = Register (internal)
  F    = File (internal)
  SP   = Stack Pointer (internal)
  TENC = The Am25LS168 decade counter
  L4B  = The 4 least significant bits of CA (the Y outputs of
         the less significant Am2911)
  CN   = Carry-in into the less significant Am2911
  CN4  = Carry-out from the less significant Am2911
  CN4  = Carry-in to the next significant Am2911
  I10  = The Am2910 instruction
  HB   = Horizontal Blanking signal (active HIGH)
  VB   = Vertical Blanking signal (active HIGH)
  CPM  = Maximum Clock Pulse (at which the program stops)
```


## Description

The different groups and subroutines of the emulation program are as follows (see Figure B1):
$<1000$ series: The microcode. Subroutine 50 is the Am25LS168 decade counter clocking routine. TENTH is the RCO output of this device.
1000 series: This is essentially the Am2910 emulation. Note the definition of the two functions FNFAIL and FNPASS at the beginning of the program; compare them with the Am2910 instruction definitions in its data sheet.
2000 series: The Am25LS153 multiplexer emulation.
2500 series: The less significant Am2911 emulation. Note that the only input to this device is ZEROL. CN and the internal PC (called L4B) are controlled in the CLOCK subroutine (4000 series).
3000 series: Emulation of the two more significant Am2911s. $S_{0}$ and $S_{1}$ are treated as a single number (ranging from 0 through 3), denoted by S11.
4000 series: The clocking routine.
5000 series: The main emulation routine. It includes the Am25LS2521 comparator routine and checks the Clock Pulse against CPM to determine the end of the run.
5500 series: Emulation parameter setup (initialization). The starting and ending CP numbers, MA, TENC, R10, and VECTOR (the "First Address Register") can be set.
6000 series: Sets up the printout parameters.
7000 series: Printout subroutine.
9000 series: Sets the program mode: RUN, PRINT, or QUIT (return to $\mathrm{CP} / \mathrm{M}$).

The emulation was exercised to evaluate fifteen different performance aspects of the CRT Controller. The results indicated that in all cases, the design operated as desired.
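
The two condition functions and the way the 1000-series routines use them can be restated compactly. The Python below is a sketch of that logic only, written from the FNFAIL and FNPASS definitions and the CJP and RPCT routines in Figure B1; it is not a complete Am2910 model. The helper `rpct_executions` is hypothetical and exists only to show how many times a self-looping RPCT step runs. As in the listing, the bars are dropped, so `ccen` and `cc` are the levels on the active LOW pins.

```python
from typing import Tuple


def fails(ccen: int, cc: int) -> bool:
    """FNFAIL from the listing: CCEN LOW (test enabled) and CC HIGH."""
    return ccen == 0 and cc == 1


def passes(ccen: int, cc: int) -> bool:
    """FNPASS from the listing: CCEN HIGH (test disabled) or CC LOW."""
    return ccen == 1 or cc == 0


def cjp(pc: int, pl: int, ccen: int, cc: int) -> int:
    """Conditional jump pipeline: next address is PL on pass, PC on fail."""
    return pc if fails(ccen, cc) else pl


def rpct(pc: int, pl: int, counter: int) -> Tuple[int, int]:
    """Repeat pipeline while counter != 0: returns (next address, counter)."""
    if counter == 0:
        return pc, counter
    return pl, counter - 1


def rpct_executions(counter_load: int) -> int:
    """Count how often an RPCT step at address 0 that jumps to itself runs."""
    address, counter, runs = 0, counter_load, 0
    while address == 0:
        address, counter = rpct(pc=1, pl=0, counter=counter)
        runs += 1
    return runs


# A microstep at address 4 with PL = 4 dwells while CC is LOW.
print(cjp(pc=5, pl=4, ccen=0, cc=0))  # 4
print(cjp(pc=5, pl=4, ccen=0, cc=1))  # 5
print(rpct_executions(146))           # 147
```

The last line illustrates why a counter load of N produces N + 1 passes through a counted loop.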

```
REM
REV=12
PRINT REV
9000 REM
    PRINT
    PRINT
    PRINT " ***********************************************************"
    PRINT
    PRINT
    PRINT " A MICROPROGRAMMED CRT CONTROLLER EMULATION"
    PRINT
    PRINT
    PRINT
    PRINT
    PRINT
    PRINT " BY MOSHE M. SHAVIT"
    PRINT " ADVANCED MICRO DEVICES"
    PRINT " FEBRUARY 27, 1978"
    PRINT
    PRINT
REM
    DIM F10(G)
    DEF FNFAIL=CCEN=0 AND CC=1
    DEF FNPASS=CCEN=1 OR CC=0
REM
REM
REM
REM
REM
REM
9100 PRINT
    PRINT
    PRINT
    INPUT "R-UN, P-RINT OR Q-UIT ";MODE$
    IF LEN(MODE$)=0 THEN GOTO 9100
    MODE=ASC(MODE$)-79
    IF MODE<1 OR MODE>3 \
        THEN PRINT MODE$;" IS INVALID" :\
             GOTO 9100
    ON MODE GOTO 9110,9120,9130
REM
9120 RETURN
REM
9130 REM RUN
    PRINT
    INPUT "PUT RESULTS ON FILE (O IF DIRECT PRINTOUT)= ";WFILE$
    PRINT "CP= ";CP;"MA= ";MA;"VECTOR= ";VECTOR;\
          "CPM= ";CPM;"ROW= ";24-R10
    INPUT "INITIALIZE (Y OR N; CP,MA=0 IF N)";S$
    IF S$="Y" \
        THEN GOSUB 5500 \ REM INIT.
        ELSE CP=0 : MA=0
    IF WFILE$="O" \
        THEN GOTO 6010 \ REM DIRECT PRINTOUT
        ELSE FILE WFILE$ : GOTO 5000 REM MAIN
REM
9110 REM PRINT
    PRINT
    INPUT "GET RESULTS FROM FILE=";RFILE$
    FILE RFILE$
REM
6000 REM PRINT PARAMETERS
    PRINT
6010 PRINT "OUTPUT FORMATS:"
```

```
    PRINT " A=CP AND CA ONLY"
    PRINT " B=CP,CA,HB,VB,MA"
    PRINT " C=CP,CA,MA,TENC,R10"
    PRINT " D=ALL"
    PRINT
    INPUT "FORMAT=";FORMAT$
    IF LEN(FORMAT$)=0 THEN GOTO 6010
    IF ASC(FORMAT$)<65 OR ASC(FORMAT$)>68 \
        THEN PRINT FORMAT$;" IS ILLEGAL" :\
             GOTO 6010
    PRINT
REM
6020 REM
    IF WFILE$ NE "O" \
        THEN CONTROL$="A" :\
             GOTO 6030
    PRINT "CLOCK CONTROL"
    PRINT " A=CONTINOUS"
    PRINT " B=STEP"
    INPUT "CONTROL=";CONTROL$
    IF LEN(CONTROL$)=0 THEN GOTO 6020
    IF ASC(CONTROL$)<65 OR ASC(CONTROL$)>66 \
        THEN PRINT CONTROL$;" IS ILLEGAL" :\
             GOTO 6020
    PRINT
REM
6030 PRINT "OUTPUT CONTROL"
    PRINT " A=AT EACH CP"
    PRINT " B=AT EVERY N-TH CP"
    PRINT " C=MANUAL CONTROL"
    PRINT " D=STARTING AT CPS AT EVERY CP"
    PRINT " E=STARTING AT CPS AT EVERY N-TH CP"
    INPUT "OUTPUT=";OUTPUT$
    IF LEN(OUTPUT$)=0 THEN GOTO 6030
    IF ASC(OUTPUT$)<65 OR ASC(OUTPUT$)>69 \
        THEN PRINT OUTPUT$;" IS ILLEGAL" :\
             GOTO 6030
    O.CTL=ASC(OUTPUT$)-64
    ON O.CTL GOTO 6090,6032,6090,6034,6036
6032 INPUT "N=";N
    M=0
    GOTO 6090
6034 INPUT "CPS= ";CPS
    GOTO 6090
6036 INPUT "CPS= ";CPS
    INPUT "N= ";N
    M=0
    GOTO 6090
REM
6090 FORMAT = ASC(FORMAT$)-64
    ON FORMAT GOSUB 6190,6300,6200,6100
    IF WFILE$="O" THEN GOTO 5000 REM MAIN
REM
6900 PRINT
    IF END #1 THEN 6910
    FOR I=1 TO 2 STEP 0 REM DO UNTIL END OF FILE
    READ #1; CP,R10,F1,SP10,PC10,CA,MUX,CC,CCEN,MA,TENC,\
        CN4,F11,HB,VB
    F10(SP10)=F1
    GOSUB 7000 REM PRINT
    GOSUB 5200 REM ESCAPE (REV 7)
    IF S=155 THEN PRINT:PRINT "ABORTED AT ";CP : GOTO 6910
    NEXT I
```



```
    CA=Y11*16+L4B REM CHARACTER ADDRESS
    REM COMPARATOR NEXT
    IF Y11=120 AND TENTH=0 \ REM REV 8
        THEN COMP=0 \
        ELSE COMP=1
    GOSUB 2000 REM MUX
    GOSUB 1000 REM 2910
    REM REV 6
    IF WFILE$="O" THEN GOSUB 7000 \ REM DIRECT PRINTOUT
        ELSE PRINT #1;CP,R10,F10(SP10),SP10,PC10,CA,MUX,\
            CC,CCEN,MA,TENC,CN4,F11,HB,VB
    IF CONTROL$="B" THEN INPUT S$ REM SINGLE STEP
    REM CHECK END OF RUN
    GOSUB 5200 REM ESCAPE (REV 7)
    IF S=155 THEN PRINT:PRINT "ABORTED AT ";CP : GOTO 5100
    IF CP<CPM THEN GOTO 5000 REM REPEAT MAIN
REM
5100 IF WFILE$ NE "O" THEN CLOSE (1)
    OUT 100,12 REM PRINTER PAGE EJECT (REV 7)
    GOTO 9100
REM
REM 5200 SUB REV 7
5200 REM ESCAPE SUBROUTINE:
    S=INP(97)
    S=INT(S/2)
    S=S/2-INT(S/2)
    IF S NE 0 THEN S = INP(96)
    RETURN
REM
5500 REM INITIALIZATION
    PRINT
    SP10=1
    PRINT "MA= ";MA
5505 INPUT "NEW MA (Y OR N)";S$
    IF S$="N" THEN GOTO 5510
    INPUT "MA=(0<=MA<22)";MA
    MA=INT(MA)
    IF MA<0 OR MA>21 \
        THEN PRINT MA;" IS ILLEGAL" :\
             GOTO 5505
    IF MA=0 THEN TENC=0 : HB=1 : TENTH=1
REM
5510 PRINT
    PRINT "VECTOR= ";VECTOR
5515 INPUT "NEW VECTOR (Y OR N)";S$
    IF S$="N" THEN GOTO 5520
    INPUT "VECTOR=(0<=VECTOR<120)";VECTOR
    VECTOR=INT(VECTOR)
    IF VECTOR<0 OR VECTOR>119 \
        THEN PRINT VECTOR;" IS ILLEGAL" :\
             GOTO 5515
REM
5520 PRINT
    PRINT "CP= ";CP
    INPUT "NEW CP (Y OR N) ";S$
    IF S$="N" THEN GOTO 5530
5525 INPUT "CP(>=0)= ";CP
    CP=INT(CP)
    IF CP<0 THEN PRINT CP;" IS ILLEGAL" : GOTO 5525
REM
5530 PRINT
    PRINT "CPM= ";CPM
5535 INPUT "NEW CPM (Y OR N)";S$
    IF S$="N" THEN GOTO 5540
```

Figure B1. (Cont.)

```
    INPUT "CPM=(CP+1<CPM)";CPM
    CPM=INT(CPM)
    IF CPM<CP+1 THEN PRINT CPM;" IS ILLEGAL";"CP= ";CP : GOTO 5535
REM
5540
5545 INPUT "NEW TENC (Y OR N)";S$
    PRINT "TENC= ";TENC
    IF MA=0 THEN GOTO 5550
    IF S$="N" THEN GOTO 5550
    INPUT "TENC=(0<=TENC<10)";TENC
    TENC=INT(TENC)
    IF TENC<0 OR TENC>9 \
        THEN PRINT TENC;" IS ILLEGAL" :\
             GOTO 5545
    IF TENC=9 THEN TENTH=0 ELSE TENTH=1
REM
5550
5555
REM
5560
REM
FEM
REM
1   I10=6
    CCEN=0
    MUX=3
    S11=3
    FE=0
    ZEROH=1
    ZEROL=0
    CN=0
    HB=1 REM REV 2
    VB=0
    PL=0
    RETURN
REM
2   I10=12
    S11=0
    FE=1
    ZEROH=1
    ZEROL=0
    CN=0
    HB=1 REM REV 2
    VB=0
    PL=23
    RETURN
REM
3   I10=14
    S11=2
    FE=1
    ZEROH=1
    ZEROL=0
    CN=1
    HB=1 REM REV 2
    VB=0
    RETURN
REM

    FE=0 REM REV 10
    ZEROH=1
    ZEROL=0
    GOSUB 50
    VB=1
    RETURN
REM
15  I10=12
    S11=0 REM REV 10
    FE=1 REM REV 10
    ZEROH=1
REM ZEROH=1 REM REMOVED REV 10
    GOSUB 50
    VB=1
    PL=119
    RETURN
REM
16  I10=4
    CCEN=0
    MUX=3
    S11=0
    FE=1
    ZEROH=1
    ZEROL=1
    CN=1
    GOSUB 50
    VB=1
    RETURN
REM
17  I10=3
    CCEN=0
    MUX=1
    S11=0
    FE=1
    ZEROH=1
    ZEROL=1
    CN=1
    GOSUB 50
    VB=1
    PL=16
    RETURN
REM
18  I10=8
    S11=0
    FE=1
    ZEROH=1
    ZEROL=1
    CN=1
    GOSUB 50
    VB=1
    RETURN
REM
19  I10=12
    S11=0
    FE=1
    ZEROH=1
    ZEROL=1
    CN=1
    GOSUB 50
    VB=1
    PL=23
    RETURN
REM
20  I10=10
```

```
    S11=0
    FE=1
    ZEROH=1
    ZEROL=1
    CN=1
    GOSUB 50
    VB=1
    RETURN
REM
21  I10=10
    CCEN=1
    S11=0
    FE=1
    ZEROH=1
    ZEROL=1
    CN=1
    GOSUB 50
    VB=0
    RETURN
REM
22  I10=10
    CCEN=1
    FE=0 REM REV 9
    ZEROH=0
    ZEROL=1 REM REV 9
    CN=1
    GOSUB 50
    VB=0
    RETURN
REM
50  REM TEN-LINE-COUNTER CLOCKING SUBROUTINE
    IF HB=1 THEN RETURN
    HB=1
    TENC=TENC+1
    IF TENC=9 THEN TENTH=0 ELSE TENTH=1
    IF TENC=10 THEN TENC=0
    RETURN
REM PUSH AND POP SUBROUTINES REMOVED REV 3
1000 REM 2910 INSTRUCTIONS SUBROUTINE
    ON I10+1 GOTO 1100,1110,1120,1130,1140,1150,1160,1170,1180, \
        1190,1200,1210,1220,1230,1240,1250
REM
1100 REM JZ
    MA=0 REM 2910 Y
    SP10=0 REM 2910 STACK POINTER (=0 REV 3)
    RETURN
REM
1110 REM CJS
    IF FNFAIL \
        THEN MA=PC10 \
        ELSE MA=PL :\
             PUSH=1 REM REV 3
    RETURN
REM
1120 REM JMAP
    PRINT "JMAP NOT PROGRAMMED"
    RETURN
REM
1130 REM CJP
    IF FNFAIL \
        THEN MA=PC10 \
        ELSE MA=PL
    RETURN
```

```
REM
1140 REM PUSH
    IF FNPASS THEN R10=PL REM LOAD COUNTER
    MA=PC10
    PUSH=1 REM REV 3
    RETURN
REM
1150 REM JSRP
    PRINT "JSRP NOT PROGRAMMED"
    RETURN
REM
1160 REM CJV
    IF FNFAIL \
        THEN MA=PC10 \
        ELSE MA=VECTOR
    RETURN
REM
1170 REM JRP
    IF FNFAIL \
        THEN MA=R10 \
        ELSE MA=PL
    RETURN
REM
1180 REM RFCT
    IF R10=0 \
        THEN MA=PC10 :\
             POP=1 \
        ELSE MA=F10(SP10) :\
             R10=R10-1
    RETURN
REM
1190 REM RPCT
    IF R10=0 \
        THEN MA=PC10 \
        ELSE MA=PL :\
             R10=R10-1
    RETURN
REM
1200 REM CRTN
    IF FNFAIL \
        THEN MA=PC10 \
        ELSE MA=F10(SP10) :\
             POP=1 REM REV 3
    RETURN
REM
1210 REM CJPP
    PRINT "CJPP NOT PROGRAMMED"
    RETURN
REM
1220 REM LDCT
    R10=PL
    MA=PC10
    RETURN
REM
1230 REM LOOP
    IF FNFAIL \
        THEN MA=F10(SP10) \
        ELSE MA=PC10 :\
             POP=1 REM REV 3
    RETURN
REM
1240 REM CONT
    MA=PC10
    RETURN
```

Figure B1. (Cont.)

```
REM
1250 REM TWB
    PRINT "TWB NOT PROGRAMMED"
    RETURN
REM
REM
2000 REM MUX SUBROUTINE
    ON MUX+1 GOTO 2100,2200,2300,2400
REM
2100 IF TENTH=0 \
        THEN CC=0 \
        ELSE CC=1
    RETURN
REM
2200 IF CN4=0 \
        THEN CC=0 \
        ELSE CC=1
    RETURN
REM
2300 IF COMP=0 \
        THEN CC=0 \
        ELSE CC=1
    RETURN
REM
2400 CC=1
    RETURN
REM
REM
2500 REM LEAST SIGNIFICANT 2911 (2911L) SUBROUTINE
    IF ZEROL=0 THEN L4B=0
    RETURN
REM
REM
REM
REM
3000 REM MORE SIGNIFICANT 2911S (2911H) SUBROUTINE
    ON S11+1 GOSUB 3100,3200,3300,3400
    IF ZEROH=0 THEN Y11=0
    RETURN
REM
3100 Y11=PC11
    RETURN
REM
3200 Y11=R11
    RETURN
REM
3300 Y11=F11
    RETURN
REM
3400 IF I10=6 \
        THEN Y11=VECTOR \
        ELSE Y11=PL
    RETURN
REM
REM
4000 REM CLOCK SUBROUTINE
REM PC10=MA+1 REMOVED REV 4
    IF CN=1 THEN L4B=L4B+1
    IF L4B>15 THEN L4B=0 : CN4=1 ELSE CN4=0
    IF CN4=1 \
        THEN PC11=Y11+1 \
        ELSE PC11=Y11
    IF FE=0 THEN F11=PC11
    <--REV 3
```

```
    IF PUSH=1 \
        THEN SP10=SP10+1 :\
             F10(SP10)=PC10 :\
             PUSH=0
    IF SP10>4 \
        THEN PRINT "2910 STACK FULL " :\
             SP10=3
    IF POP=1 \
        THEN SP10=SP10-1 :\
             POP=0
    IF SP10<0 \
        THEN PRINT "POP EMPTY FILE? ";CP :\
             SP10=0
    REV 3 -->
    PC10=MA+1 REM REV 4
    CP=CP+1
    RETURN
REM
REM
```

Figure B1 (Cont.)
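
Two small pieces of the listing, subroutine 50 (the ten-line counter) and the first two statements of the 4000-series clock routine (the less significant Am2911 slice), are easier to follow when restated on their own. The Python below is a sketch that mirrors those BASIC statements; the class and function names are hypothetical, and the carry is modeled the way the emulation does it (CN4 is set on the clock that wraps L4B from 15 to 0), which is a simplification of the hardware.

```python
class TenLineCounter:
    """Mirror of subroutine 50: the decade counter advances only when the
    routine is entered with HB LOW, i.e. once per scan line."""

    def __init__(self, count: int = 0) -> None:
        self.count = count
        self.hb = 1
        self.tenth = 0 if count == 9 else 1  # active LOW ripple carry

    def clock(self) -> None:
        if self.hb == 1:
            return                 # still inside horizontal blanking
        self.hb = 1
        self.count += 1
        self.tenth = 0 if self.count == 9 else 1
        if self.count == 10:
            self.count = 0


def clock_low_slice(l4b: int, cn: int):
    """Mirror of the L4B update: returns (new L4B, CN4)."""
    if cn == 1:
        l4b += 1
    if l4b > 15:
        return 0, 1
    return l4b, 0


counter = TenLineCounter()
for _ in range(9):
    counter.hb = 0                 # a visible-line microstep cleared HB
    counter.clock()
print(counter.count, counter.tenth)  # 9 0
print(clock_low_slice(15, 1))        # (0, 1)
```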

## APPENDIX C

A simple circuit was designed to accommodate five different display formats and also to comply with the European 50 Hz TV standard. Figure C 1 is the circuit diagram of this additional circuit.
The following parameters change when the format is changed:

1) The number of characters/line.
2) The number of lines/frame.
3) The number of characters to display (i.e., the address of the last character).
4) The line frequency and therefore the dot frequency.

The number of characters/line is counted by the least significant Am2911 sequencer via the microcode. Therefore, the microcode can be changed to change the number of characters/line. The number of lines/frame is counted down from a constant loaded into the Am2910 internal counter by the microcode. The microcode can be changed to vary the number of lines/frame.

The scan is reinitialized to zero when the last address +1 is attained. $U_{9}$ (Am25LS2521) detects this address by comparing bits $A_{4}$ through $A_{10}$ of the character address bus to a constant supplied to its $B$ inputs. A table listing these constants is shown in Figure C1. By setting the DIP switches according to that table, the character scan will reinitialize correctly. The same constant is routed through one half of an Am25LS240 (U24) to the internal data bus. At microprogram address zero, a JUMP MAP instruction enables these outputs, thereby putting a starting address on the bus according to the table in Figure C1.
The microprogram is shown on Figure C2.
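
The relation between a display format and the comparator constant follows directly from the description above: the constant is the value of bits $A_{4}$ through $A_{10}$ of "last address + 1". The Python sketch below computes it for the five formats named in the microprogram comments. The values are computed here, not copied from the table in Figure C1, and the function names are hypothetical. The second function is more speculative: it assumes that the four most significant constant bits reach the data bus inverted (the Am25LS240 is an inverting buffer) and that the remaining address bits read as 1s, an inference from the ORG addresses visible in Figure C2 rather than something the text states, and it covers only the 60 frames/s entries.

```python
FORMATS = [(24, 80), (24, 64), (24, 32), (16, 32), (16, 16)]  # (rows, chars/row)


def comparator_constant(rows: int, chars_per_row: int) -> int:
    """Bits A4..A10 of (last character address + 1)."""
    last_plus_one = rows * chars_per_row
    if last_plus_one % 16 != 0:
        raise ValueError("last address + 1 must be a multiple of 16")
    constant = last_plus_one >> 4
    if constant >= 1 << 7:
        raise ValueError("constant does not fit in bits A4..A10")
    return constant


def map_start_address(constant: int) -> int:
    """Assumed JUMP MAP address: inverted top four constant bits, upper bits 1."""
    top_four = (constant >> 3) & 0xF
    return 0xF0 | (~top_four & 0xF)


for rows, chars in FORMATS:
    c = comparator_constant(rows, chars)
    print(f"{rows}x{chars}: constant {c}, map address {map_start_address(c):03X}")
```

For the 24 by 80 format this gives a constant of 120, which agrees with the test `Y11=120` in the emulation program of Appendix B.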


Figure C1.

```
A>TYPE CRT.DEF
;
;CRT DEFINITION FILE
; BY MOSHE M. SHAVIT
;REV 0 3/8/78
;
TITLE CRT CONTROLLER --DEFINITIONS
WORD 24
;
FE:     DEF 1VB#1,23X
ZEROH:  DEF 1X,1VB#1,22X
S11:    DEF 2X,2V%:Q#,20X
I10:    DEF 4X,4VH#,16X
CN:     DEF 9X,1VB#1,14X
ZEROL:  DEF 10X,1VB#1,13X
VB:     DEF 11X,1VB#0,12X
HB:     DEF 12X,1VB#0,11X
CCEN:   DEF 13X,1VB#,10X
MUX0:   DEF 14X,B#00,8X
MUX1:   DEF 14X,B#10,8X
MUX2:   DEF 14X,B#01,8X
MUX3:   DEF 14X,B#11,8X
PL:     DEF 16X,8V%:
;
L:      EQU B#0
H:      EQU B#1
COUNT:  DEF B#1,B#1,B#00,5X,B#1,B#1,B#0,B#0,1X,2X,8X
COUNTH: DEF B#1,B#1,B#00,5X,B#1,B#1,B#0,B#1,1X,2X,8X
COUNTV: DEF B#1,B#1,B#00,5X,B#1,B#1,B#1,B#1,1X,2X,8X
;
END
A>
```

Figure C2. AMDASM Definition and Assembly Files for the CRT Controller.
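
As a cross-check on the field layout, the definitions can be modeled in a few lines of Python. This is an illustrative sketch, not AMDASM: the `FIELDS` table is a reading of the DEF statements (bit offsets counted from the left of the 24-bit word), the `assemble` helper is hypothetical, and default values are not modeled, so every field that should appear must be given explicitly. Unspecified bits are shown as X, as in the assembler's object listing.

```python
WORD_WIDTH = 24

# name: (offset from the left-most bit, width in bits), read from the DEF file
FIELDS = {
    "FE": (0, 1), "ZEROH": (1, 1), "S11": (2, 2), "I10": (4, 4),
    "CN": (9, 1), "ZEROL": (10, 1), "VB": (11, 1), "HB": (12, 1),
    "CCEN": (13, 1), "MUX": (14, 2), "PL": (16, 8),
}


def assemble(**values: int) -> str:
    """Pack the given field values into a 24-character microword string."""
    word = ["X"] * WORD_WIDTH
    for name, value in values.items():
        offset, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word[offset:offset + width] = format(value, f"0{width}b")
    return "".join(word)


# Microstep at address 0001 (label S2480), with the VB default of 0 written out.
print(assemble(I10=0x6, CCEN=0, MUX=0b11, S11=3, FE=0, ZEROH=1,
               ZEROL=0, CN=0, HB=1, VB=0))
# 01110110X0001011XXXXXXXX

# Microstep at address 0002: FE, ZEROH and VB take their defaults (1, 1, 0).
print(assemble(I10=0xC, S11=0, FE=1, ZEROH=1, ZEROL=0, CN=0, HB=1, VB=0, PL=23))
# 11001100X0001XXX00010111
```

Both strings agree with the corresponding entries of the assembled object code, which suggests that the offsets in `FIELDS` match the definition file. Note that the MUX field is passed as raw bits; per the definitions, MUX1 is B#10 and MUX2 is B#01.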

```
AMDOS/29 AMDASM MICRO ASSEMBLER, V1.1
CRT CONTROLLER
             ;CRT CONTROLLER MICROPROGRAM
             ;
             ;BY MOSHE M. SHAVIT
             ;REV 2 5/3/78
             ;
0000         I10 H#2 ;JUMP MAP
             ; 24 ROWS 80 CHARACTERS 60 F/S
0001 S2480:  I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0002         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#23
0003 M2480:  I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
0004         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0005         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0006         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0007         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0008         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0009         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T2480
000A         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
000B         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
000C         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M2480
000D T2480:  I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
000E         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
000F         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#146
0010         I10 H#4 & CCEN L & MUX3 & COUNTV
0011         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0012         I10 H#8 & COUNTV
0013         I10 H#C & COUNTV & PL D#23
0014         I10 H#A & CCEN H & COUNTV
             ;
0015 GOBACK: I10 H#A & CCEN H & COUNTH
0016 LASTA:  I10 H#A & CCEN H & FE L & ZEROH L & ZEROL & CN H & HB H & VB
             ;
             ; 24 ROWS 64 CHARACTERS 60 F/S
0017 S2464:  I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0018         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#23
0019 M2464:  I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
001A         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
001B         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
001C         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
001D         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
001E         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T2464
001F         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
0020         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
0021         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M2464
0022 T2464:  I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
0023         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
0024         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#122
0025         I10 H#4 & CCEN L & MUX3 & COUNTV
0026         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0027         I10 H#8 & COUNTV
```

```
AMDOS/29 AMDASM MICRO ASSEMBLER, V1.1
CRT CONTROLLER
```




```
0050 S1616:  I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0051         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#15
0052 M1616:  I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
0053         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0054         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T1616
0055         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
0056         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
0057         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M1616
0058 T1616:  I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
0059         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
005A         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#203
005B         I10 H#4 & CCEN L & MUX3 & COUNTV
005C         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
005D         I10 H#8 & COUNTV
005E         I10 H#C & COUNTV & PL D#15
005F         I10 H#A & CCEN H & COUNTV
00F0         ORG H#0F0 ;24*80
             ;
00F3 (l)
             ORG H#0F9 ;24*32
             I10 H#3 & CCEN H & PL S2432
00FB         ORG H#0FB ;16*32
```



```
    ;
    ;
    0100; ORG H#100
```

```
0100 S2480E: I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0101         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#23
0102 M2480E: I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
0103         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0104         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0105         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0106         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0107         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0108         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T2480E
0109         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
010A         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
010B         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M2480E
010C T2480E: I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
```

Figure C2 (Cont.)

```
AMDOS/29 AMDASM MICRO ASSEMBLER, V1.1
CRT CONTROLLER
010D         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
010E         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#200 ;ITERATES 201 TIMES
010F         I10 H#4 & CCEN L & MUX3 & COUNTV
0110         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0111         I10 H#8 & COUNTV
0112         I10 H#C & COUNTV & PL D#239
0113         I10 H#4 & CCEN L & MUX3 & COUNTV
0114         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0115         I10 H#8 & COUNTV
             ;
0116
             ; 24 ROWS 64 CHARACTERS 50 F/S
0118 S2464E: I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0119         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#23
011A M2464E: I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
011B         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
011C         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
011D         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
011E         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
011F         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T2464E
0120         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
0121         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
0122         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M2464E
0123 T2464E: I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
0124         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
0125         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#200
0126         I10 H#4 & CCEN L & MUX3 & COUNTV
0127         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0128         I10 H#8 & COUNTV
0129         I10 H#C & COUNTV & PL D#167 ;369
012A         I10 H#4 & CCEN L & MUX3 & COUNTV
012B         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
012C         I10 H#8 & COUNTV
012D         I10 H#C & COUNTV & PL D#23
012E         I10 H#A & CCEN H & COUNTV
             ; 24 ROWS 32 CHARACTERS 50 F/S
012F S2432E: I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0130         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#23
0131 M2432E: I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
0132         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0133         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0134         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T2432E
0135         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
0136         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
```

```
AMDOS/29 AMDASM MICRO ASSEMBLER, V1.1
CRT CONTROLLER
0137         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M2432E
0138 T2432E: I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
0139         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
013A         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#224
013B         I10 H#4 & CCEN L & MUX3 & COUNTV
013C         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
013D         I10 H#8 & COUNTV
013E         I10 H#C & COUNTV & PL D#23
013F         I10 H#A & CCEN H & COUNTV
             ; 16 ROWS 32 CHARACTERS 50 F/S
0140 S1632E: I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0141         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#15
0142 M1632E: I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
0143         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0144         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0145         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T1632E
0146         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
0147         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
0148         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M1632E
0149 T1632E: I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
014A         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
014B         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#250
014C         I10 H#4 & CCEN L & MUX3 & COUNTV
014D         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
014E         I10 H#8 & COUNTV
014F         I10 H#C & COUNTV & PL D#223 ;475
0150         I10 H#4 & CCEN L & MUX3 & COUNTV
0151         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0152         I10 H#8 & COUNTV
0153         I10 H#C & COUNTV & PL D#15
0154         I10 H#A & CCEN H & COUNTV
             ; 16 ROWS 16 CHARACTERS 50 F/S
0155 S1616E: I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / CN L & HB H & VB
0156         I10 H#C & S11 0 & FE & ZEROH & ZEROL L & CN L & HB H &
             /VB & PL D#15
0157 M1616E: I10 H#E & S11 2 & FE & ZEROH & ZEROL L & CN & HB H & VB
0158         I10 H#3 & CCEN L & MUX1 & COUNT & PL $
0159         I10 H#1 & CCEN L & MUX0 & COUNTH & PL T1616E
015A         I10 H#1 & CCEN L & MUX2 & COUNTH & PL LASTA
015B         I10 H#3 & CCEN L & MUX1 & COUNTH & PL $
015C         I10 H#3 & CCEN H & S11 0 & FE & ZEROH & HB H & VB & PL M1616E
015D T1616E: I10 H#9 & S11 0 & FE L & ZEROH & ZEROL & CN H & HB H & VB & PL GOBACK
015E         I10 H#6 & CCEN L & MUX3 & S11 3 & FE L & ZEROH & ZEROL L &
             / HB H & VB H
015F         I10 H#C & S11 0 & FE & ZEROH & HB H & VB H & PL D#200
0160         I10 H#4 & CCEN L & MUX3 & COUNTV
0161         I10 H#3 & CCEN L & MUX1 & COUNTV & PL $
0162         I10 H#8 & COUNTV
```




```
    0000 XXXX0010XXXXXXXX XXXXXXXX
    0001 01110110X0001011 XXXXXXXX
    0002 11001100X0001XXX 00010111
    0003 11101110X1001XXX XXXXXXXX
    0004 11000011X1100010 00000100
    0005 11000011X1100010 00000101
    0006 11000011X1100010 00000110
    0007 11000011X1100010 00000111
    0008 11000011X1100010 00001000
    0009 11000001X1101000 00001101
    000A 11000001X1101001 00010110
    000B 11000011X1101010 00001011
    000C 11000011XXX011XX 00000011
    000D 01001001X1101XXX 00010101
    000E 01110110XX011011 XXXXXXXX
    000F 11001100XXX11XXX 10010010
    0010 11000100X1111011 XXXXXXXX
    0011 11000011X1111010 00010001
    0012 11001000X1111XXX XXXXXXXX
    0013 11001100X1111XXX 00010111
    0014 11001010X11111XX XXXXXXXX
    0015 11001010X11011XX XXXXXXXX
    0016 00XX1010X11011XX XXXXXXXX
    0017 01110110X0001011 XXXXXXXX
    0018 11001100X0001XXX 00010111
    0019 11101110X1001XXX XXXXXXXX
    001A 11000011X1100010 00011010
    001B 11000011X1100010 00011011
    001C 11000011X1100010 00011100
    001D 11000011X1100010 00011101
    001E 11000001X1101000 00100010
    001F 11000001X1101001 00010110
    0020 11000011X1101010 00100000
    0021 11000011XXX011XX 00011001
    0022 01001001X1101XXX 00010101
    0023 01110110XX011011 XXXXXXXX
    0024 11001100XXX11XXX 01111010
    0025 11000100X1111011 XXXXXXXX
    0026 11000011X1111010 00100110
    0027 11001000X1111XXX XXXXXXXX
    0028 11001100X1111XXX 00010111
    0029 11001010X11111XX XXXXXXXX
    002A 01110110X0001011 XXXXXXXX
    002B 11001100X0001XXX 00010111
    002C 11101110X1001XXX XXXXXXXX
    002D 11000011X1100010 00101101
    002E 11000011X1100010 00101110
    002F 11000001X1101000 00110011
    0030 11000001X1101001 00010110
    0031 11000011X1101010 00110001
    0032 11000011XXX011XX 00101100
    0033 01001001X1101XXX 00010101
    0034 01110110XX011011 XXXXXXXX
    0035 11001100XXX11XXX 01001010
    0036 11000100X1111011 XXXXXXXX
    0037 11000011X1111010 00110111
    0038 11001000X1111XXX XXXXXXXX
    0039 11001100X1111XXX 00010111
    003A 11001010X11111XX XXXXXXXX
    003B 01110110X0001011 XXXXXXXX
    003C 11001100X0001XXX 00001111
    003D 11101110X1001XXX XXXXXXXX
    003E 11000011X1100010 00111110
    003F 11000011X1100010 00111111
    0040 11000001X1101000 01000100
    0041 11000001X1101001 00010110
    0042 11000011X1101010 01000010
    0043 11000011XXX011XX 00111101
    015D 01001001X1101XXX 00010101
    015E 01110110XX011011 XXXXXXXX
    015F 11001100XXX11XXX 11001000
    0160 11000100X1111011 XXXXXXXX
    0161 11000011X1111010 01100001
    0162 11001000X1111XXX XXXXXXXX
    0163 11001100X1111XXX 01111001
    0164 11000100X1111011 XXXXXXXX
    0165 11000011X1111010 01100101
    0166 11001000X1111XXX XXXXXXXX
    0167 11001100X1111XXX 00001111
    0168 11001010X11111XX XXXXXXXX
    01F0 XXXX0011XXXXX1XX 00000000
    01F3 XXXX0011XXXXX1XX 00011000
    01F6 XXXX0011XXXXX1XX 00101111
    01FB XXXX0011XXXXX1XX 01000000
    01FD XXXX0011XXXXX1XX 01010101
```

ENTRY POINTS

SYMBOLS

| Symbol | Value |
| :--- | :--- |
| GOBACK | 0015 |
| H | 0001 |
| L | 0000 |
| LASTA | 0016 |
| M1616 | 0052 |
| M1616E | 0157 |
| M1632 | 003D |
| M1632E | 0142 |
| M2432 | 002C |
| M2432E | 0131 |
| M2464 | 0019 |
| M2464E | 011A |
| M2480 | 0003 |
| M2480E | 0102 |
| S1616 | 0050 |
| S1616E | 0155 |
| S1632 | 003B |
| S1632E | 0140 |
| S2432 | 002A |
| S2432E | 012F |
| S2464 | 0017 |
| S2464E | 0118 |
| S2480 | 0001 |
| S2480E | 0100 |
| T1616 | 0058 |
| T1616E | 015D |
| T1632 | 0044 |
| T1632E | 0149 |
| T2432 | 0033 |
| T2432E | 0138 |
| T2464 | 0022 |
| T2464E | 0123 |
| T2480 | 000D |
| T2480E | 010C |

TOTAL PHASE 2 ERRORS = 0

## APPENDIX D

The Microprogrammed CRT Controller was built on a System 29 universal card and exercised by the System 29 support processor. An Am9080A program was written to fill the character memory. Figure D1 is the listing of this program. In order to observe the correct output of the controller, an oscilloscope or CRT monitor can be connected through the adaptation circuit shown in Figure D2.


Figure D1


Figure D1 (Cont.)


Figure D2.


## Chapter III The Data Path

## INTRODUCTION

The heart of most digital arithmetic processors is the arithmetic logic unit (ALU). The ALU can be thought of as a digital subsystem that performs various arithmetic and logic operations on two digital input variables. The Am2901A and Am2903 are Low Power Schottky TTL arithmetic logic unit/function generators that perform arithmetic/logic operations on two four-bit input variables. Speed is generally a key requirement for an ALU. Therefore, as much parallelism as possible is desired in the operation of the arithmetic logic unit.
The Am2901A and Am2903 ALUs are designed to operate with an Am2902A carry lookahead generator to perform multi-level full carry lookahead over any number of bits. Therefore, the devices have both the carry generate and carry propagate outputs required by the Am2902A carry lookahead generator. The devices also have the carry output ($C_{n+4}$) and a two's complement overflow detection signal (OVR) available at the output. The net result is that a very high-speed 16-bit arithmetic logic unit/function generator can be designed and assembled using four of these bit slice devices and one Am2902A (the Am2902A is a high-speed version of the '182 carry lookahead generator). In addition, the Am2901A and Am2903 provide a minimum of 16 working registers for providing source operands to the ALU.

## UNDERSTANDING THE BASIC FULL ADDER

The result of an arithmetic operation at any position in a word depends not only on the two input operand bits at that position, but also on all the lesser significant operand bits of the two input variables. The final result for any bit, therefore, is not available until the carries of all the previous bits have rippled through the logic array, starting from the least significant bit and propagating through to the most significant bit. A full adder is a device that accepts two individual operand bits at the same binary weight, and also accepts a carry input bit from the next lesser significant weight full adder. The full adder then produces the sum bit for this bit position and also produces a carry bit to be used as the carry input of the next more significant weight full adder. The truth table for a full adder is shown in Figure 1. From this truth table, the equations for the full adder are:

$$
\begin{aligned}
& S=A \oplus B \oplus C \\
& C_{O}=A B+B C+A C
\end{aligned}
$$

where $A$ and $B$ are the input operands to the full adder and $C$ is the carry input into the adder.

| Inputs |  |  | Outputs |  |
| :---: | :---: | :---: | :---: | :---: |
| $\mathbf{A}$ | B | C | S | $\mathrm{C}_{\mathbf{0}}$ |
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 | 0 |
| 0 | 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 | 0 |
| 1 | 0 | 1 | 0 | 1 |
| 1 | 1 | 0 | 0 | 1 |
| 1 | 1 | 1 | 1 | 1 |

Figure 1. Full Adder Truth Table.
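
The two full adder equations can be checked against Figure 1 with a few lines of Python. This is only an illustrative sketch; the names `full_adder` and `FIGURE_1` are ours and do not come from any AMD tool.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit operands a, b and carry-in c."""
    s = a ^ b ^ c                         # S  = A xor B xor C
    c_out = (a & b) | (b & c) | (a & c)   # Co = AB + BC + AC
    return s, c_out


# Rows of Figure 1 in the order (A, B, C, S, Co).
FIGURE_1 = [
    (0, 0, 0, 0, 0),
    (0, 0, 1, 1, 0),
    (0, 1, 0, 1, 0),
    (0, 1, 1, 0, 1),
    (1, 0, 0, 1, 0),
    (1, 0, 1, 0, 1),
    (1, 1, 0, 0, 1),
    (1, 1, 1, 1, 1),
]


def matches_figure_1() -> bool:
    """True when the equations reproduce every row of the truth table."""
    return all(full_adder(a, b, c) == (s, co) for a, b, c, s, co in FIGURE_1)
```

Calling `matches_figure_1()` confirms that the Boolean equations and the truth table describe the same cell.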

The sum output, $S$, represents the sum of the $A$ and $B$ operand inputs and the carry input. The carry output, $\mathrm{C}_{\mathrm{O}}$, represents the carry out of this cell and can be used in the next more significant cell of the adder. Full adder cells can be cascaded as depicted in Figure 2 to form a four-bit ripple carry parallel adder.
Note that once we have cascaded devices as shown in Figure 2, we may wish to discuss the equations for the $i$-th bit of the adder. In so doing, we might describe the equations of the full adder as follows:

$$
\begin{aligned}
& S_{i}=A_{i} \oplus B_{i} \oplus C_{i} \\
& C_{i+1}=A_{i} B_{i}+B_{i} C_{i}+A_{i} C_{i}
\end{aligned}
$$

where the $A_{i}$ and $B_{i}$ are the input operands at the $i$-th bit, and the $C_{i}$ is the carry input to the $i$-th bit. (Note that the equations for this adder are iterative in nature and each depends on the result of the previous lesser significant bits of the adder array.)
The connection scheme shown in Figure 2 requires a ripple propagation time through each full adder cell. If a 16-bit adder is to be assembled, the carry will have to propagate through all 16 full adder cells. What is desired is some technique for anticipating the carry such that we will not have to wait for a ripple carry to propagate through the entire network. By using some additional logic, such an adder array can be constructed. This type of adder is usually called a carry lookahead adder.


Figure 2. Cascaded Full Adder Cells Connected as a Four-Bit Ripple-Carry Full Adder.
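
The iterative nature of the ripple-carry equations can be seen in a short behavioural model. The sketch below is an assumption-level illustration in Python (the function names and the `width` parameter are ours); each pass of the loop corresponds to one full adder cell of Figure 2, and the carry produced by one cell is consumed by the next.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """One full adder cell: returns (S_i, C_{i+1})."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def ripple_add(a: int, b: int, carry_in: int = 0, width: int = 4) -> tuple[int, int]:
    """Add two width-bit words cell by cell, least significant bit first.

    Returns (sum, carry_out). The carry variable is the signal that
    ripples from cell to cell.
    """
    carry = carry_in
    total = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry
```

With `width=16` the loop runs sixteen times, which mirrors the sixteen cell delays the carry must traverse in a 16-bit ripple adder.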

## A FOUR-BIT CARRY LOOKAHEAD ADDER

Looking back to the equations developed for the $i$-th bit of an adder, let us now rewrite the carry equation in a slightly different form. When we factor the $C_{i}$ in this equation, the new equation becomes:

$$
C_{i+1}=A_{i} B_{i}+C_{i}\left(A_{i}+B_{i}\right)
$$

From the above equation, let us now define two additional equations. These are:

$$
\begin{aligned}
& G_{i}=A_{i} B_{i} \\
& P_{i}=A_{i}+B_{i}
\end{aligned}
$$

With these two new auxiliary equations, we can now rewrite the carry equation for the $i$-th bit as follows:

$$
C_{i+1}=G_{i}+P_{i} C_{i}
$$

Note that we have now developed two terms: the $P_{i}$ term is known as carry propagate and the $G_{i}$ term is known as carry generate. An anticipated carry can be generated at any stage of the adder by implementing the above equations and using the auxiliary functions $P_{i}$ and $G_{i}$ as required.
It is interesting to note that the sum equation can also be written in terms of these two auxiliary equations, $P_{i}$ and $G_{i}$. For this case, the equation is:

$$
S_{i}=\left(A_{i}+B_{i}\right)\left(\overline{A_{i} B_{i}}\right) \oplus C_{i}
$$

The auxiliary function $G_{i}$ is called carry generate because, if it is true, a carry is immediately produced for the next adder stage. The function $P_{i}$ is called carry propagate because it implies there will be a carry into the next stage of the adder if there is a carry into this stage of the adder. That is, $G_{i}$ causes a carry signal at the $i$-th stage of the adder to be generated and presented to the next stage of the adder, while $P_{i}$ causes an existing carry at the input to the $i$-th stage of the adder to propagate to the next stage of the adder.
Let us now write all of the sum and carry equations required for a full four-bit lookahead carry adder.

$$
\begin{aligned}
& S_{0}=A_{0} \oplus B_{0} \oplus C_{0} \\
& S_{1}=A_{1} \oplus B_{1} \oplus\left(G_{0}+P_{0} C_{0}\right) \\
& S_{2}=A_{2} \oplus B_{2} \oplus\left(G_{1}+P_{1} G_{0}+P_{1} P_{0} C_{0}\right) \\
& S_{3}=A_{3} \oplus B_{3} \oplus\left(G_{2}+P_{2} G_{1}+P_{2} P_{1} G_{0}+P_{2} P_{1} P_{0} C_{0}\right) \\
& C_{i+4}=G_{3}+P_{3} G_{2}+P_{3} P_{2} G_{1}+P_{3} P_{2} P_{1} G_{0}+P_{3} P_{2} P_{1} P_{0} C_{0}
\end{aligned}
$$

An important point to note is that ALL of the sum equations and the final carry output equation, $\mathrm{C}_{\mathrm{i}+4}$, can be written in terms of the $A_{i}, B_{i}$, and $C_{0}$ inputs to the four-bit adder. The configuration as described above is shown in Figure 3. This figure is divided into two parts - the upper blocks show the auxiliary function generator circuitry required to implement the $P_{i}$ and $G_{i}$ equations while the lower block implements the logic required to generate the sum output at each bit position.
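
The four-bit lookahead equations above can be exercised directly. The following Python sketch (the function name `cla4` is ours) computes every carry from the $G_{i}$, $P_{i}$ and $C_{0}$ terms only, without rippling, and forms each sum bit as $(A_{i}+B_{i})(\overline{A_{i} B_{i}}) \oplus C_{i}$. It is a behavioural check of the equations, not a gate-level model of Figure 3.

```python
def cla4(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """Four-bit carry lookahead add. Returns (sum, carry_out)."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [x & y for x, y in zip(A, B)]   # carry generate,  G_i = A_i B_i
    P = [x | y for x, y in zip(A, B)]   # carry propagate, P_i = A_i + B_i

    # Every carry is written out in terms of G, P and C0 only.
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    c4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))

    carries = [c0, c1, c2, c3]
    # S_i = P_i AND (NOT G_i), exclusive-ORed with the carry into bit i.
    S = [(P[i] & (G[i] ^ 1)) ^ carries[i] for i in range(4)]
    return sum(s << i for i, s in enumerate(S)), c4
```

An exhaustive comparison against ordinary integer addition over all 512 input combinations is a quick way to convince oneself that the expanded equations are equivalent to the ripple form.
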
A serious drawback to the lookahead carry adder is that as the word length is increased, the carry functions become more and more complex, eventually becoming impractical due to the large number of interconnections and heavy loading of the $G_{i}$ and $P_{i}$ functions. The auxiliary function concept can be extended, however, by dividing the word length into fairly small increments and defining blocks of auxiliary functions $G$ and $P$.
For a given block, it is possible to define a function $G$ as the carry out generated within the block, and a function $P$ as the carry propagate over the block. If the block size is set at four bits, then the functions $G$ and $P$ for this block can be defined as follows:

$$
\begin{aligned}
& G=G_{3}+P_{3} G_{2}+P_{3} P_{2} G_{1}+P_{3} P_{2} P_{1} G_{0} \\
& P=P_{3} P_{2} P_{1} P_{0}
\end{aligned}
$$



Figure 3. Full Four-Bit Carry-Lookahead Adder.

It is important to note that neither of these terms involves a carry-in ( $\mathrm{C}_{0}$ ) to the block, so no matter how many blocks are tied in an adder, all the blocks have stable $G$ and $P$ functions available in a minimum number of gate delays.
The $G$ and $P$ functions can be gated to produce a carry-in to each four-bit block, as a function of the lesser significant blocks. The carry-in to a block is therefore:

$$
\begin{aligned}
C_{n}= & G_{n-1}+P_{n-1} G_{n-2}+P_{n-1} P_{n-2} G_{n-3}+\ldots \\
& +P_{n-1} P_{n-2} P_{n-3} \ldots P_{2} P_{1} P_{0} C_{0}
\end{aligned}
$$

Finally, the carry-in to each of the bits in a four-bit block must include a term for the actual least significant carry-in; note, therefore, that the equations for the four-bit full adder presented above include a term for carry-in at each bit position.
Figure 4 shows the technique for cascading typical bit slice ALUs such as the Am2901A or Am2903 and one Am2902A in a full 16-bit high-speed carry lookahead connection. Figure 5 shows a connection scheme using only four bit slices in a 16-bit arithmetic logic unit connection where the carries are rippled between the devices. Each bit slice does use internal carry lookahead over the four-bit block.


Figure 4. Full Lookahead Carry 16-Bit Adder.


Figure 5. Connection of 16-Bit ALU Using Ripple Carry.
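
The difference between the two connections can be sketched in Python. In this hedged model (all function names are ours, and each slice's internal adder is treated behaviourally rather than at gate level), `add16_lookahead` plays the role of Figure 4: a lookahead unit forms the carry into every slice from the block $G$ and $P$ outputs and $C_{0}$. `add16_ripple` plays the role of Figure 5: each slice waits for the carry out of the slice below it.

```python
def nibbles(x: int) -> list[int]:
    """Split a 16-bit word into four 4-bit slices, least significant first."""
    return [(x >> (4 * k)) & 0xF for k in range(4)]


def slice_gp(a: int, b: int) -> tuple[int, int]:
    """Block generate G and block propagate P of one slice (no carry-in used)."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return G, P


def slice_add(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Behavioural four-bit slice: returns (sum, carry_out)."""
    t = a + b + c_in
    return t & 0xF, t >> 4


def add16_lookahead(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """Carry into each slice comes from block G, P and C0 only."""
    A, B = nibbles(a), nibbles(b)
    GP = [slice_gp(x, y) for x, y in zip(A, B)]
    carries = [c0]
    for n in range(1, 5):
        c, prod = 0, 1
        for k in range(n - 1, -1, -1):
            c |= prod & GP[k][0]      # P_{n-1} ... P_{k+1} G_k
            prod &= GP[k][1]
        carries.append(c | (prod & c0))   # ... + P_{n-1} ... P_0 C_0
    total = 0
    for k in range(4):
        s, _ = slice_add(A[k], B[k], carries[k])
        total |= s << (4 * k)
    return total, carries[4]


def add16_ripple(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """Carry ripples from slice to slice."""
    carry, total = c0, 0
    for k, (x, y) in enumerate(zip(nibbles(a), nibbles(b))):
        s, carry = slice_add(x, y, carry)
        total |= s << (4 * k)
    return total, carry
```

Both functions return the same sums; the difference in hardware is only in how long the carry into the most significant slice takes to settle.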

In summary, the ripple carry method can be used in conjunction with the lookahead technique in several ways.

1. Lookahead carry over sections of the adder and ripple carry between these sections of the adder can be used. This method is often the most efficient in terms of hardware for a given speed requirement. It does not require the use of a lookahead carry generator such as the Am2902A.
2. Lookahead carry across 16-bit blocks with a ripple carry between 16-bit blocks can be used. This technique is usually called two-level carry lookahead addition. This technique results in very high-speed arithmetic function generation and makes a reasonable tradeoff between the speed and hardware for word lengths greater than 16 bits.
3. Full lookahead carry across all levels and all block sizes can be used. This is the highest speed arithmetic logic unit connection scheme. For word sizes up to 64 bits, it is referred to as three-level lookahead carry addition. Such a 64-bit ALU requires the use of five Am2902A carry lookahead generator units in addition to the 16 bit slice ALU devices as shown in Figure 6.

## OVERFLOW

When two's complement numbers are added or subtracted, the result must lie within the range of the numbers that can be handled by the operand word length. Numbers are normally represented either as fractions with a binary point between the sign bit and the rest of the word, or as integers where the binary point is after the least significant bit. The actual choice for the location of the binary point is really up to the design engineer, as the hardware configuration required for either technique is identical. It is also possible to use number notations that include both integer and fractional representations in the same numbering scheme. Overflow is defined as the situation in which the result of an arithmetic operation lies outside of the number range that can be represented by the number of bits in the word. For example, if two eight-bit numbers are added and the result does not lie within the number range that can be represented by an eight-bit word, we say that an overflow has occurred. This can happen at either the positive end of the number range or at the negative end of the number range. The logic function that indicates that the result of an operation is outside of the representable number range is:

$$
\mathrm{OVR}=\mathrm{C}_{\mathrm{s}} \oplus \mathrm{C}_{\mathrm{s}+1}
$$

where $\mathrm{C}_{\mathrm{s}}$ is the carry-in to the sign bit and $\mathrm{C}_{\mathrm{s}+1}$ is the carry-out of the sign bit.
Thus, for a four-bit ALU with the sign bit in the most significant bit position, the two's complement overflow can be defined as the $\mathrm{C}_{\mathrm{n}+4}$ term exclusive OR'ed with the $\mathrm{C}_{\mathrm{n}+3}$ term.
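
The overflow equation can be verified by brute force. The Python sketch below (function names and the `width` parameter are ours) computes the carry into and out of the sign bit separately and compares their exclusive OR with a direct range check on the signed result.

```python
def add_with_overflow(a: int, b: int, carry_in: int = 0, width: int = 4) -> tuple[int, int]:
    """Return (result, OVR) where OVR = C_s xor C_{s+1}."""
    mask = (1 << width) - 1
    low = mask >> 1                      # all bits below the sign bit
    a &= mask
    b &= mask
    c_s = ((a & low) + (b & low) + carry_in) >> (width - 1)   # carry into sign bit
    total = a + b + carry_in
    c_s1 = total >> width                                     # carry out of sign bit
    return total & mask, c_s ^ c_s1


def to_signed(x: int, width: int = 4) -> int:
    """Interpret a width-bit pattern as a two's complement integer."""
    return x - (1 << width) if x & (1 << (width - 1)) else x
```

For example, adding the eight-bit values 100 and 100 produces a carry into the sign bit but none out of it, so `add_with_overflow(100, 100, width=8)` reports an overflow, as expected for a true sum of 200 that exceeds +127.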

## Putting the ALU in the Data Path of a Simple Computer

Once the Design Engineer understands the basic configuration and operation of a simple high speed carry lookahead adder, he can begin to understand the configuration required to implement the data handling section of a typical computing machine. The simplest architecture for the data handling path of a minicomputer is shown in Figure 7. Here, an accumulator is used in conjunction with an ALU to perform a basic arithmetic/storage capability for data handling. The computer control unit of Figure 7 can be a simple or sophisticated state machine as described in Chapter 2.


Figure 6. 64-Bit ALU with Full Carry Lookahead Using 5 Am2902s and 16 4-Bit Slices.


Figure 7. Basic Computer Data Path.

While the introductory material of this chapter concentrated on full adders, it should be understood that more ALU functions than addition are required if we are to implement the data path of a typical minicomputer. Typically, some or all of the functions shown in Figure 8 are needed if we are to implement a powerful data handling capability.
The operation of the ALU/accumulator configuration shown in Figure 7 can be described as follows. The accumulator can be loaded by bringing data in from the data-in port to the A input of the ALU, passing it through the ALU, and loading it into the accumulator. A second word of data can then be presented at the data-in port to the A input of the ALU, and the ALU can be used to perform an operation such as $A + B$, $A$ OR $B$, $A$ AND $B$, $A - B$ and so forth. The result of this ALU operation can then be placed into the accumulator. The accumulator output is available at the data-out port for use elsewhere. Additional ALU functions such as those shown in Figure 8 are easily implemented by adding some circuitry to the four-bit carry lookahead adder shown in Figure 3. If this circuitry is added, we arrive at a logic diagram as shown in Figure 9. This diagram is certainly familiar to most CPU designers; it is the well known Am74S181 four-bit arithmetic logic unit/function generator.

Figure 8. Basic ALU Instructions.

Once the operation of the simple computer data path shown in Figure 7 is understood, the Design Engineer will soon recognize the need for additional registers if the machine is to be general purpose and execute instructions. Very rapidly the need arises for a register to hold a program counter (PC) and for a memory address register (MAR). The purpose of the program counter is to point to the address of the next instruction in main memory. Typically it is loaded into the memory address register, which actually drives the address onto the address bus of the machine. Then, the program counter is incremented through the ALU and stored until it is needed again. The block diagram of Figure 10 shows these additional registers connected in parallel at the output of the ALU. This ALU output is called the F bus. Each of these registers (the accumulator, the PC, and the MAR) has an enable input from the CCU so that it can selectively be loaded with data from the ALU. In addition, each of these registers has an output enable so that it can be selectively enabled onto the D bus. The D bus represents the data output path from the basic computer data path and is also used as one of the inputs to the ALU/function generator. The other input in this example is called the $R$ bus and comes directly from the main memory data output as well as from the I/O data input. As shown in Figure 10, the memory address register (MAR) has a second output that is used to drive the address bus. In this example, this register always contains the address to be applied to the external memory, whether it is the address of data or the address of an instruction.

Figure 9. Logic Diagram for Am25LS181.

Figure 10. Three Register Computer Data Path.

The best way to understand the operation of this single ALU/three register machine is to take an example. Let us assume we have just completed the execution of one machine instruction and are ready to fetch the next instruction. The first operation would be to transfer the current value of the program counter onto the $D$ bus, through the ALU onto the F bus, and into the memory address register. This might be accomplished during one microcycle. The second operation might be to again put the PC on the D bus, pass it through the ALU B port, increment the value at the $B$ port, and reload it into the PC register. Thus, the PC has again been updated to point to the address of the next instruction. During this time, the address from the MAR is on the address bus and we are fetching data from the external memory and placing it on the $R$ bus. The third microcycle would be to bring the data out of the external memory and pass it to the instruction register in the CCU. The next microcycle might be to decode this instruction and determine that the next word after the current instruction in memory (an immediate operand) is to be added to the value currently in the accumulator. Thus, we would again need to place the PC into the MAR on one cycle and then increment the PC on the next cycle. Following this, the data from the external memory could be brought on the R bus through the A port of the ALU and added to the accumulator value, which is placed on the $D$ bus and brought through the B port of the ALU. The result would be placed in the accumulator. This operation would complete the example and we would be ready to fetch the next instruction. As can be seen, a number of microcycles are required to fetch the instruction, decode it, fetch the data and execute the instruction. The flow needed to implement a typical instruction set is shown in Figure 11. Here, we see the basic instruction fetch and decode operation followed by the path used to execute each of the various instructions. Then, we see a return to the fetch operation to fetch the next instruction.
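
The seven microcycles of this example can be traced with a small register-transfer sketch in Python. Everything named here is an assumption made for illustration: the opcode string `"ADD_IMMEDIATE"`, the 16-bit word width, and the function `fetch_and_add_immediate` are ours, and the model ignores bus timing entirely.

```python
WORD_MASK = 0xFFFF  # assumed 16-bit data path


def fetch_and_add_immediate(memory, pc, acc):
    """Trace the microcycles for one add-immediate instruction.

    memory is a list whose entry at pc is the (hypothetical) opcode
    "ADD_IMMEDIATE" and whose next entry is the immediate operand.
    Returns (registers, trace), one trace entry per microcycle.
    """
    regs = {"PC": pc, "MAR": 0, "ACC": acc}
    trace = []

    def step(description):
        trace.append((description, dict(regs)))

    regs["MAR"] = regs["PC"]
    step("PC -> D bus -> ALU (pass) -> F bus -> MAR")

    regs["PC"] = (regs["PC"] + 1) & WORD_MASK
    step("PC -> D bus -> ALU B port, increment -> F bus -> PC")

    instruction = memory[regs["MAR"]]
    step("memory data -> R bus -> instruction register in the CCU")

    if instruction != "ADD_IMMEDIATE":
        raise ValueError("this sketch only models the add-immediate example")
    step("CCU decodes the instruction")

    regs["MAR"] = regs["PC"]
    step("PC -> D bus -> ALU (pass) -> F bus -> MAR")

    regs["PC"] = (regs["PC"] + 1) & WORD_MASK
    step("PC -> D bus -> ALU B port, increment -> F bus -> PC")

    operand = memory[regs["MAR"]]
    regs["ACC"] = (operand + regs["ACC"]) & WORD_MASK
    step("R bus (A port) + ACC on D bus (B port) -> F bus -> ACC")

    return regs, trace
```

Running it with `memory = ["ADD_IMMEDIATE", 5]`, a program counter of 0 and an accumulator of 10 leaves 15 in the accumulator after seven microcycles, with the PC pointing past the operand.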

Figure 11. Steps for ADD Instruction.

Certainly from this discussion we can see how three registers have enhanced the performance of the simple ALU/accumulator data path shown in Figure 7. Typically, even more registers than shown in Figure 10 are needed if we are to increase the power of our machine. If we examine the block diagram of Figure 12, we see an architecture similar to that shown in Figure 10. Here, the number of working registers has been expanded to sixteen at the output of the ALU. These can be used to provide a program counter function and a number of accumulator functions simultaneously. In addition, note that the registers have two output ports such that the simultaneous selection of any two of the sixteen registers is possible. Both of these registers can be presented to the ALU so that operations on two registers simultaneously can be executed. In addition, a data input multiplexer is available at the A port of the ALU such that external data can be brought into the configuration. Likewise, there is an output multiplexer such that either the A output of the registers or the ALU output can be selected. This output multiplexer is used to provide a data out port, and the output can also be loaded into the memory address register to provide an address as required. Thus, the architecture of Figure 12 is quite similar to that of Figure 10 except that the number of registers has been increased to provide additional flexibility.
If we assume that one of the sixteen registers inside this register file is to be used as the program counter, we see that the program counter can be brought out of the A output port and loaded into the memory address register, and at the same time it can also be brought out of the B output port, incremented in the ALU, and reloaded into the register file. In this architecture it appears the A output of the register stack can also be brought to the input multiplexer and the A port of the ALU, incremented via that path, and reloaded into the registers. While this is possible in the architecture of Figure 12, we are leading up to the implementation of an Am2901A and this path is not needed in the Am2901A. Thus, we can implement functions and operations in the diagram of Figure 12 just as we could in the diagram of Figure 10. However, what was previously performed in two microcycles can now be performed in one microcycle. That is, the MAR can be loaded with the current value of the PC and at the same time the PC can be incremented and the new value restored in the PC register.


Figure 12. Multi-Register ALU.

Another feature of the block diagram of Figure 12 is the depiction of the carry in bit to the ALU and the four output flags associated with the ALU. Here, carry in is the normal carry in as needed in any adder such that the device is cascadable. In addition, certain kinds of arithmetic functions such as two's complement arithmetic also need the ability to provide a carry in for certain operations. The most common is two's complement subtract, which is usually performed by complementing the operand to be subtracted, adding, and adding one at the carry in. Also, the ALU shows the four output flags usually associated with a typical minicomputer. These are the carry output, the sign bit, the overflow detect, and the zero detect. These four status flags are used to determine various things about the operation being performed. The carry out flag and overflow flag are as described in the previous sections of this chapter. They provide the carry and overflow information about the addition.
The sign bit is simply the most significant bit of the ALU and represents the sign of a two's complement number. That is, when the sign bit is LOW, we assume the two's complement number is positive and when the sign bit is HIGH, we assume the two's complement number is negative. Thus, the sign bit is active HIGH and carries negative weight as we assume in any standard two's complement number representation. If the reader is unfamiliar with two's complement number notations, a discussion of this topic can be found in an application note entitled "The Am25S05, Am2505 and Am25L05 Schottky, Standard and Low Power TTL Two's Complement Digital Multipliers" as found in Advanced Micro Devices' Schottky and Low Power Schottky Data Book dated 10/77. This application note begins on page 5-49 and fully details two's complement number notation and gives examples.

The fourth status flag is called the zero flag and again is just what the name implies. This flag represents the fact that all of the ALU outputs are at logic zero. In this design, a logic zero means that all of the ALU output bits are LOW.
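
The use of carry in for subtraction and the derivation of the four status flags can be sketched together in Python. This is a behavioural illustration under our own naming (`alu_add`, `alu_sub`, and the flag dictionary keys are assumptions), with a default width of four bits to match one slice.

```python
def alu_add(r: int, s: int, carry_in: int = 0, width: int = 4):
    """Add with carry in. Returns (F, flags) for a width-bit ALU."""
    mask = (1 << width) - 1
    low = mask >> 1                       # bits below the sign bit
    r &= mask
    s &= mask
    total = r + s + carry_in
    f = total & mask
    carry_out = total >> width
    carry_into_sign = ((r & low) + (s & low) + carry_in) >> (width - 1)
    flags = {
        "carry": carry_out,
        "sign": f >> (width - 1),         # most significant bit of F
        "overflow": carry_into_sign ^ carry_out,
        "zero": int(f == 0),              # all F outputs LOW
    }
    return f, flags


def alu_sub(r: int, s: int, width: int = 4):
    """R minus S as R plus (NOT S) plus one, the one entering at carry in."""
    mask = (1 << width) - 1
    return alu_add(r, ~s & mask, carry_in=1, width=width)
```

For instance, `alu_sub(3, 5)` yields the pattern 1110 (minus two) with the sign flag set, while `alu_add(7, 1)` yields 1000 with the overflow flag set, because +8 is outside the four-bit two's complement range.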

If the architecture of Figure 12 is extended a little more, we arrive at the Am2901A as depicted in Figure 13. Here, we have redrawn the structure so that the registers are placed above the ALU; however, the function is identical. Two new functions have been added to this block diagram that have not previously been discussed. The first is the RAM shift matrix located directly above the sixteen registers, which are now described as a $16 \times 4$ dual port RAM. The purpose of the RAM shift network is to allow the data word being written into the register to be shifted either up one bit position or down one bit position. The second function added to the block diagram is the Q register and its shift network. The Q register is used as an auxiliary register so that double length operations can be performed, and it is also used in the multiply and divide algorithms. Its shift network allows the $Q$ register contents to be shifted up or down one bit position. It should also be pointed out that the memory address register is not part of the Am2901A. There were not enough pins on the package to implement the function, and the additional power required by the output buffers would have reduced the performance of the ALU and register stack. Instead, this function is being designed into other 2900 family products.

## Am2901A ARCHITECTURE

A detailed block diagram of the Am2901A bipolar microprogrammable microprocessor structure is shown in Figure 14. The circuit is a four-bit slice cascadable to any number of bits. Therefore, all data paths within the circuit are four bits wide. The two key elements in the Figure 14 block diagram are the 16-word by 4-bit 2-port RAM and the high-speed ALU.


Figure 13. Am2901A Block Diagram.

Data in any of the 16 words of the Random Access Memory (RAM) can be read from the A-port of the RAM as controlled by the 4-bit A address field input. Likewise, data in any of the 16 words of the RAM as defined by the $B$ address field input can be simultaneously read from the B-port of the RAM. The same code can be applied to the $A$ select field and $B$ select field, in which case the identical file data will appear at both the RAM A-port and B-port outputs simultaneously.

When enabled by the RAM write enable (RAM EN), new data is always written into the file (word) defined by the $B$ address field of the RAM. The RAM data input field is driven by a 3-input multiplexer. This configuration is used to shift the ALU output data ($F$) if desired. This three-input multiplexer scheme allows the data to be shifted up one bit position, shifted down one bit position, or not shifted in either direction.

The RAM A-port data outputs and RAM B-port data outputs drive separate 4-bit latches. These latches hold the RAM data while the clock input is LOW. This eliminates any possible race conditions that could occur while new data is being written into the RAM.

The high-speed Arithmetic Logic Unit (ALU) can perform three binary arithmetic and five logic operations on the two 4-bit input words R and S. The R input field is driven from a 2-input multiplexer, while the S input field is driven from a 3-input multiplexer. Both multiplexers also have an inhibit capability; that is, no data is passed. This is equivalent to a "zero" source operand.
Referring to Figure 14, the ALU R-input multiplexer has the RAM A-port and the direct data inputs (D) connected as inputs. Likewise, the ALU S-input multiplexer has the RAM A-port, the RAM B-port and the $Q$ register connected as inputs.



This multiplexer scheme gives the capability of selecting various pairs of the A, B, D, Q and "0" inputs as source operands to the ALU. These five inputs, when taken two at a time, result in ten possible combinations of source operand pairs: AB, AD, AQ, A0, BD, BQ, B0, DQ, D0 and Q0. It is apparent that AD, AQ and A0 are somewhat redundant with BD, BQ and B0 in that if the A address and B address are the same, the identical function results. Thus, there are only seven completely non-redundant source operand pairs for the ALU. The Am2901A microprocessor implements eight of the ten pairs. The microinstruction inputs used to select the ALU source operands are the $I_{0}, I_{1}$ and $I_{2}$ inputs.
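The counting argument above can be checked with a short sketch. The variable names are illustrative only; the rule used for redundancy is the one stated in the text (a pair containing A but not B duplicates the same pair with B substituted when both addresses are equal).

```python
from itertools import combinations

# The five selectable ALU inputs named in the text.
inputs = ["A", "B", "D", "Q", "0"]

# All ways of taking two inputs at a time.
pairs = ["".join(p) for p in combinations(inputs, 2)]

# A pair that uses A without B gives the same result as the matching
# B pair whenever the A address equals the B address.
redundant = [p for p in pairs if "A" in p and "B" not in p]
non_redundant = [p for p in pairs if p not in redundant]

print(len(pairs), pairs)
print(len(redundant), redundant)
print(len(non_redundant), non_redundant)
```

Running it lists ten pairs, three redundant pairs, and seven non-redundant pairs, matching the counts in the paragraph.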
The two source operands not fully described as yet are the D input and Q input. The D input is the four-bit wide direct data field input. This port is used to insert all data into the working registers inside the device. Likewise, this input can be used in the ALU to modify any of the internal data files. The Q register is a separate 4-bit file intended primarily for multiplication and division routines but it can also be used as an accumulator or holding register for some applications.
The ALU itself is a high-speed arithmetic/logic operator capable of performing three binary arithmetic and five logic functions. The $I_{3}, I_{4}$ and $I_{5}$ microinstruction inputs are used to select the ALU function. The definition of these functions is shown in Figure 15. The normal technique for cascading the ALU of several devices is in a look-ahead carry mode. Carry generate, $\bar{G}$, and carry propagate, $\overline{\mathrm{P}}$, are outputs of the device for use with a carry-look-ahead-generator such as the Am2902A ('182). A carry-out, $\mathrm{C}_{\mathrm{n}+4}$, is also generated and is available as an output for use as the carry flag in a status register. Both carry-in $\left(C_{n}\right)$ and carry-out $\left(C_{n+4}\right)$ are active HIGH.
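As a rough illustration of the look-ahead carry idea, the sketch below models the add function of one four-bit slice and then cascades four slices into a 16-bit adder. It is a behavioral model under stated assumptions, not a description of the device internals: the generate and propagate signals are treated as active HIGH for readability (the device outputs $\bar{G}$ and $\bar{P}$ are active LOW), and the function names `slice_add` and `add16` are invented for this example.

```python
def slice_add(r, s, c_in):
    """Add two 4-bit operands; return (F, G, P, C_out), all active HIGH."""
    g = [(r >> i) & (s >> i) & 1 for i in range(4)]    # bit i generates a carry
    p = [((r >> i) | (s >> i)) & 1 for i in range(4)]  # bit i propagates a carry
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    f = (r + s + c_in) & 0xF
    c_out = G | (P & c_in)                             # C_{n+4}
    return f, G, P, c_out


def add16(r, s, c_in=0):
    """Cascade four slices; slice carries are derived from G, P and c_in only."""
    nibble = lambda x, k: (x >> (4 * k)) & 0xF
    # G and P of a slice do not depend on its carry-in, so they are
    # available before any carry has been resolved.
    gp = [slice_add(nibble(r, k), nibble(s, k), 0)[1:3] for k in range(4)]
    carries = [c_in]
    for G, P in gp:
        # A look-ahead generator evaluates these terms in parallel;
        # the loop here only computes the same Boolean values.
        carries.append(G | (P & carries[-1]))
    result = 0
    for k in range(4):
        f, _, _, _ = slice_add(nibble(r, k), nibble(s, k), carries[k])
        result |= f << (4 * k)
    return result, carries[4]


print(add16(0xFFFF, 0x0001))
```

The point of the model is that each slice's carry-in can be formed from the group G and P signals without waiting for the lower slices to finish adding.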

| SOURCE OPERANDS |
| :---: |
| A, B |
| A, D |
| A, Q |
| A, 0 |
| B, 0 |
| D, 0 |
| Q, 0 |
| D, Q |

| ALU FUNCTIONS |
| :---: |
| $R+S$ |
| $R-S$ |
| $S-R$ |
| R OR S |
| R AND S |
| $\overline{\mathrm{R}}$ AND S |
| R EXOR S |
| R EXNOR S |

| DESTINATION: SHIFT | LOAD | Y-OUT |
| :---: | :---: | :---: |
| UP | RAM | F |
| UP | RAM \& Q | F |
| DOWN | RAM | F |
| DOWN | RAM \& Q | F |
| NONE | NONE | F |
| NONE | Q | F |
| NONE | RAM | F |
| NONE | RAM | A |

Figure 15. Am2901A Microinstruction Control.

The ALU has three other status-oriented outputs. These are $F_{3}$, $F=0$, and overflow (OVR). The $F_{3}$ output is the most significant (sign) bit of the ALU and can be used to determine positive or negative results without enabling the three-state data outputs. $F_{3}$ is non-inverted with respect to the sign bit output $Y_{3}$. The $F=0$ output is used for zero detect. It is an open-collector output and can be wire OR'ed between microprocessor slices. $F=0$ is HIGH when all F outputs are LOW. The overflow output (OVR) is used to flag arithmetic operations that exceed the available two's complement number range. The overflow output (OVR) is HIGH when overflow exists; that is, when $\mathrm{C}_{\mathrm{n}+3}$ and $\mathrm{C}_{\mathrm{n}+4}$ are not the same polarity.
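The status outputs described above can be illustrated for the add function of a single four-bit slice. This is a minimal behavioral sketch; the function name and the dictionary keys are chosen for the example and are not device pin names, apart from the signals the text itself defines.

```python
def add_with_status(r, s, c_in=0):
    """4-bit add returning the result and the status signals from the text."""
    low = (r & 0x7) + (s & 0x7) + c_in   # sum of the three low-order bits
    c3 = low >> 3                        # carry into the sign bit, C_{n+3}
    total = r + s + c_in
    c4 = total >> 4                      # carry out of the slice, C_{n+4}
    f = total & 0xF
    return {
        "F": f,
        "F3": f >> 3,                    # sign bit of the result
        "F=0": int(f == 0),              # HIGH when all F outputs are LOW
        "C4": c4,
        "OVR": c3 ^ c4,                  # HIGH when C_{n+3} and C_{n+4} differ
    }


print(add_with_status(0x7, 0x1))   # +7 + 1 leaves the range -8..+7
print(add_with_status(0xF, 0x1))   # -1 + 1 = 0, no overflow
```

In a multi-slice machine only the most significant slice's overflow is meaningful, since that is where the sign bit of the full word lives.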

The ALU data output is routed to several destinations. It can be a data output of the device and it can also be stored in the RAM or the $Q$ register. Eight possible combinations of ALU destination functions are available as defined by the $I_{6}, I_{7}$ and $I_{8}$ microinstruction inputs. These combinations are shown in Figure 15.
The four-bit data output field (Y) features three-state outputs and can be directly bus organized. An output control $(\overline{\mathrm{OE}})$ is used to enable the three-state outputs. When $\overline{\mathrm{OE}}$ is HIGH, the Y outputs are in the high-impedance state.
A two-input multiplexer is also used at the data output such that either the A-port of the RAM or the ALU outputs ( $F$ ) are selected at the device $Y$ outputs. This selection is controlled by the $I_{6}, I_{7}$ and $I_{8}$ microinstruction inputs.
As was discussed previously, the RAM inputs are driven from a three-input multiplexer. This allows the ALU outputs to be entered non-shifted, shifted up one position ($\times 2$) or shifted down one position ($\div 2$). The shifter has two ports; one is labeled $\mathrm{RAM}_{0}$ and the other is labeled $\mathrm{RAM}_{3}$. Both of these ports consist of a buffer-driver with a three-state output and an input to the multiplexer. Thus, in the shift up mode, the $\mathrm{RAM}_{3}$ buffer is enabled and the $\mathrm{RAM}_{0}$ multiplexer input is enabled. Likewise, in the shift down mode, the $\mathrm{RAM}_{0}$ buffer and $\mathrm{RAM}_{3}$ input are enabled. In the no-shift mode, both buffers are in the high-impedance state and the multiplexer inputs are not selected. This shifter is controlled from the $I_{6}, I_{7}$ and $I_{8}$ microinstruction inputs.
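The behavior of the RAM shifter and its two bidirectional ports can be sketched as follows. The function name, the mode strings, and the use of `None` to stand for a high-impedance port are assumptions made for this illustration.

```python
def ram_shift(f, mode, ram0_in=0, ram3_in=0):
    """Return (word written to RAM, RAM0 port output, RAM3 port output).

    A port value of None means its buffer is in the high-impedance state.
    """
    if mode == "up":
        # RAM3 buffer drives out the bit shifted off the top;
        # RAM0 acts as the input for the vacated low-order bit.
        return ((f << 1) & 0xF) | ram0_in, None, (f >> 3) & 1
    if mode == "down":
        # RAM0 buffer drives out the bit shifted off the bottom;
        # RAM3 acts as the input for the vacated high-order bit.
        return (f >> 1) | (ram3_in << 3), f & 1, None
    # No shift: both buffers are off and F is written unchanged.
    return f, None, None


print(ram_shift(0b1011, "up", ram0_in=1))
print(ram_shift(0b1011, "down", ram3_in=1))
```

When slices are cascaded, the output port of one slice is wired to the input port of its neighbor so that the shift carries across the full word.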
Similarly, the Q register is driven from a 3-input multiplexer. In the no-shift mode, the multiplexer enters the ALU data into the Q register. In either the shift-up or shift-down mode, the multiplexer selects the Q register data appropriately shifted up or down. The Q shifter also has two ports; one is labeled $Q_{0}$ and the other is $Q_{3}$. The operation of these two ports is similar to the RAM shifter and is also controlled from $I_{6}, I_{7}$ and $I_{8}$.
The clock input to the Am2901A controls the RAM, the Q register, and the $A$ and $B$ data latches. When enabled, data is clocked into the $Q$ register on the LOW-to-HIGH transition of the clock. When the clock input is HIGH, the A and B latches are open and will pass whatever data is present at the RAM outputs. When the clock input is LOW, the latches are closed and will retain the last data entered. If the RAM-EN is enabled, new data will be written into the RAM file (word) defined by the $B$ address field when the clock input is LOW.

## Am2903 GENERAL DESCRIPTION

The Am2903 is a four-bit expandable bipolar microprocessor slice that performs all functions performed by the industry standard Am2901A. In addition, it provides a number of significant enhancements that are especially useful in arithmetic oriented processors. The Am2903 contains sixteen internal working registers arranged in a two address architecture and it also provides all of the necessary signals to expand the register file externally using the Am29705 register stack. Any number of registers can be cascaded to the Am2903 using this technique. In addition to its complete arithmetic and logic instruction set, the Am2903 provides a special set of instructions which facilitate the implementation of multiplication, division, normalization and other previously time consuming operations such as parity generation and sign extension. A block diagram of the Am2903 is shown in Figure 16.

## ARCHITECTURE OF THE Am2903

The Am2903 is a high-performance, cascadable, four-bit bipolar microprocessor slice designed for use in CPU's, peripheral controllers, microprogrammable machines, and numerous other applications. The microinstruction flexibility of the Am2903 allows the efficient emulation of almost any digital computing machine.


Figure 16. Basic Am2903 Block Diagram.

The nine-bit microinstruction selects the ALU sources, function, and destination. The Am2903 is cascadable with full lookahead or ripple carry, has three-state outputs, and provides various ALU status flag outputs. Advanced Low-Power Schottky processing is used to fabricate this 48-pin LSI circuit.
All data paths within the device are four bits wide. As shown in the block diagram of Figure 16, the device consists of a 16-word by 4-bit, two-port RAM with latches on both output ports, a high-performance ALU and shifter, a multi-purpose Q Register with shifter input, and a nine-bit instruction decoder.

## Two-Port RAM

Any two RAM words addressed at the $A$ and $B$ address ports can be read simultaneously at the respective RAM $A$ and $B$ output ports. Identical data appear at the two output ports when the same address is applied to both address ports. The latches at the RAM output ports are transparent when the clock input, CP, is HIGH and they hold the RAM output data when CP is LOW. Under control of the $\overline{\mathrm{OE}}_{\mathrm{B}}$ three-state output enable, RAM data can be read directly at the Am2903 DB I/O port.

External data at the Am2903 Y I/O port can be written directly into the RAM, or ALU shifter output data can be enabled onto the Y I/O port and entered into the RAM. Data is written into the RAM at the B address when the write enable input, $\overline{\mathrm{WE}}$, is LOW and the clock input, CP, is LOW.

## Arithmetic Logic Unit

The Am2903 high-performance ALU can perform seven arithmetic and nine logic operations on two 4-bit operands. Multiplexers at the ALU inputs provide the capability to select various pairs of ALU source operands. The $\overline{E_{A}}$ input selects either the DA external data input or RAM output port A for use as one ALU operand and the $\overline{O E_{B}}$ and $I_{0}$ inputs select RAM output port $B$, DB external data input, or the Q Register content for use as the second ALU operand. Also, during some ALU operations, zeros are forced at the ALU operand inputs. Thus, the Am2903 ALU can operate on data from two external sources, from an internal and external source, or from two internal sources.
When instruction bits $I_{4}, I_{3}, I_{2}, I_{1}$ and $I_{0}$ are LOW, the Am2903 executes special functions. Figure 17 defines these special functions and the operation which the ALU performs for each. When the Am2903 executes instructions other than the nine special functions, the ALU operation is determined by instruction bits $\mathrm{I}_{4}$, $I_{3}, I_{2}$ and $I_{1}$. Figure 18 defines the ALU operation as a function of these four instruction bits.
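The split of the nine-bit instruction into fields can be sketched as a small decoder. The packing of $I_{0}$ into bit 0 of an integer and the function name `decode` are assumptions for this example; the special function names are taken from Figure 17.

```python
SPECIAL_FUNCTIONS = {
    0x0: "Unsigned Multiply",
    0x2: "Two's Complement Multiply",
    0x4: "Increment by One or Two",
    0x5: "Sign/Magnitude-Two's Complement",
    0x6: "Two's Complement Multiply, Last Cycle",
    0x8: "Single Length Normalize",
    0xA: "Double Length Normalize and First Divide Op",
    0xC: "Two's Complement Divide",
    0xE: "Two's Complement Divide, Correction and Remainder",
}


def decode(instr):
    """Split a 9-bit instruction word (I8..I0, with I0 assumed at bit 0)."""
    i0 = instr & 1
    i4_1 = (instr >> 1) & 0xF   # ALU function field, I4..I1
    i8_5 = (instr >> 5) & 0xF   # shifter/destination field, I8..I5
    if i0 == 0 and i4_1 == 0:
        # I4..I0 all LOW: I8..I5 selects a special function instead.
        return {"special": SPECIAL_FUNCTIONS.get(i8_5, "reserved")}
    return {"special": None, "alu_code": i4_1, "shifter_code": i8_5, "i0": i0}


print(decode(0xC << 5))
print(decode((0x4 << 5) | (0x3 << 1) | 1))
```

Codes not listed in Figure 17 are reported as reserved, following the note in that figure.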
Am2903s may be cascaded in either a ripple carry or lookahead carry fashion. When a number of Am2903s are cascaded, each slice must be programmed to be a most significant slice (MSS), intermediate slice (IS), or least significant slice (LSS) of the array. The carry generate, $\overline{\mathrm{G}}$, and carry propagate, $\overline{\mathrm{P}}$, signals required for a lookahead carry scheme are generated by the Am2903 and are available as outputs of the least significant and intermediate slices.
The Am2903 also generates a carry-out signal, $\mathrm{C}_{\mathrm{n}+4}$, which is generally available as an output of each slice. Both the carry-in, $\mathrm{C}_{\mathrm{n}}$, and carry-out, $\mathrm{C}_{\mathrm{n}+4}$, signals are active HIGH. The ALU generates two other status outputs. These are negative, N, and overflow, OVR. The N output is generally the most significant (sign) bit of the ALU output and can be used to determine positive or negative results. The OVR output indicates that the arithmetic operation being performed exceeds the available two's complement number range. The N and OVR signals are available as outputs of the most significant slice. Thus, the multi-purpose $\overline{\mathrm{G}}/\mathrm{N}$ and $\overline{\mathrm{P}}/\mathrm{OVR}$ outputs indicate $\overline{\mathrm{G}}$ and $\overline{\mathrm{P}}$ at the least significant and intermediate slices, and sign and overflow at the most significant slice. To some extent, the meaning of the $\mathrm{C}_{\mathrm{n}+4}$, $\overline{\mathrm{P}}/\mathrm{OVR}$, and $\overline{\mathrm{G}}/\mathrm{N}$ signals varies with the ALU function being performed.

## ALU Shifter

Under instruction control, the ALU shifter passes the ALU output (F) non-shifted, shifts it up one bit position (2F), or shifts it down one bit position (F/2). Both arithmetic and logical shift operations are possible. An arithmetic shift operation shifts data around the most significant (sign) bit position of the most significant slice, and a logical shift operation shifts data through this bit position (see Figure 19). $\mathrm{SIO}_{0}$ and $\mathrm{SIO}_{3}$ are bidirectional serial shift inputs/outputs. During a shift-up operation, $\mathrm{SIO}_{0}$ is generally a serial shift input and $\mathrm{SIO}_{3}$ a serial shift output. During a shift-down operation, $\mathrm{SIO}_{3}$ is generally a serial shift input and $\mathrm{SIO}_{0}$ a serial shift output.
The ALU shifter also provides the capability to sign extend at slice boundaries. Under instruction control, the $\mathrm{SIO}_{0}$ (sign) input can be extended through $\mathrm{Y}_{0}, \mathrm{Y}_{1}, \mathrm{Y}_{2}, \mathrm{Y}_{3}$ and propagated to the $\mathrm{SIO}_{3}$ output.
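The difference between shifting "around" and "through" the sign bit can be made concrete with a word-level sketch of the shift-down case. It follows the most significant slice columns of Figure 20a, where the arithmetic shift keeps $F_{3}$ in $Y_{3}$ and places the $\mathrm{SIO}_{3}$ input in $Y_{2}$. The function name and the 16-bit default width are assumptions for this example.

```python
def shift_down(f, sio3_in, arithmetic, width=16):
    """Shift a word down one position; return (Y, bit leaving at SIO0)."""
    top = width - 1
    sign = (f >> top) & 1
    if arithmetic:
        # Sign bit stays in place; the serial input enters just below it.
        below_sign = f & ((1 << top) - 1)
        y = (sign << top) | (sio3_in << (top - 1)) | (below_sign >> 1)
    else:
        # Logical shift: the serial input enters at the sign position.
        y = (sio3_in << top) | (f >> 1)
    return y, f & 1


print(hex(shift_down(0x8006, sio3_in=1, arithmetic=True)[0]))
print(hex(shift_down(0x8006, sio3_in=0, arithmetic=False)[0]))
```

Note that in the arithmetic case the bit below the sign comes from the serial input rather than from a copy of the sign, so the external shift linkage decides what is entered there.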

| $I_{8}$ | $I_{7}$ | $I_{6}$ | $I_{5}$ | Hex Code | Special Function | ALU Function | ALU Shifter Function | $\mathrm{SIO}_{3}$ (Most Sig. Slice) | $\mathrm{SIO}_{3}$ (Other Slices) | $\mathrm{SIO}_{0}$ | Q Reg \& Shifter Function | $\mathrm{QIO}_{3}$ | $\mathrm{QIO}_{0}$ | $\overline{\text{WRITE}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| L | L | L | L | 0 | Unsigned Multiply | $F=S+C_{n}$ if $Z=L$ <br> $F=R+S+C_{n}$ if $Z=H$ | Log $F/2 \rightarrow Y$ (Note 1) | Hi-Z | Input | $F_{0}$ | Log $Q/2 \rightarrow Q$ | Input | $Q_{0}$ | L |
| L | L | H | L | 2 | Two's Complement Multiply | $F=S+C_{n}$ if $Z=L$ <br> $F=R+S+C_{n}$ if $Z=H$ | Log $F/2 \rightarrow Y$ (Note 2) | Hi-Z | Input | $F_{0}$ | Log $Q/2 \rightarrow Q$ | Input | $Q_{0}$ | L |
| L | H | L | L | 4 | Increment by One or Two | $F=S+1+C_{n}$ | $F \rightarrow Y$ | Input | Input | Parity | Hold | Hi-Z | Hi-Z | L |
| L | H | L | H | 5 | Sign/Magnitude-Two's Complement | $F=S+C_{n}$ if $Z=L$ <br> $F=\bar{S}+C_{n}$ if $Z=H$ | $F \rightarrow Y$ (Note 3) | Input | Input | Parity | Hold | Hi-Z | Hi-Z | L |
| L | H | H | L | 6 | Two's Complement Multiply, Last Cycle | $F=S+C_{n}$ if $Z=L$ <br> $F=S-R-1+C_{n}$ if $Z=H$ | Log $F/2 \rightarrow Y$ (Note 2) | Hi-Z | Input | $F_{0}$ | Log $Q/2 \rightarrow Q$ | Input | $Q_{0}$ | L |
| H | L | L | L | 8 | Single Length Normalize | $F=S+C_{n}$ | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | Hi-Z | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input | L |
| H | L | H | L | A | Double Length Normalize and First Divide Op | $F=S+C_{n}$ | Log $2F \rightarrow Y$ | $R_{3} \forall F_{3}$ | $F_{3}$ | Input | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input | L |
| H | H | L | L | C | Two's Complement Divide | $F=S+R+C_{n}$ if $Z=L$ <br> $F=S-R-1+C_{n}$ if $Z=H$ | Log $2F \rightarrow Y$ | $\overline{R_{3} \forall F_{3}}$ | $F_{3}$ | Input | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input | L |
| H | H | H | L | E | Two's Complement Divide, Correction and Remainder | $F=S+R+C_{n}$ if $Z=L$ <br> $F=S-R-1+C_{n}$ if $Z=H$ | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | Hi-Z | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input | L |

NOTES:

1. At the most significant slice only, the $C_{n+4}$ signal is internally gated to the $Y_{3}$ output.
2. At the most significant slice only, $F_{3} \forall$ OVR is internally gated to the $Y_{3}$ output.
3. At the most significant slice only, $S_{3} \forall F_{3}$ is generated at the $Y_{3}$ output.
4. Op codes 1, 3, 7, 9, B, D, and F are reserved for future use.

L = LOW; H = HIGH; X = Don't Care; Hi-Z = High Impedance; $\forall$ = Exclusive OR; Parity = $\mathrm{SIO}_{3} \forall F_{3} \forall F_{2} \forall F_{1} \forall F_{0}$

Figure 17. Special Functions: $I_{0}=I_{1}=I_{2}=I_{3}=I_{4}=$ LOW, $\overline{I E N}=$ LOW.
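To show how the Unsigned Multiply entry of Figure 17 might be used, the sketch below repeats that one step once per bit on a full-width word. It is a behavioral model under several assumptions that Figure 17 does not spell out: the multiplier is held in Q, the multiplicand is the R operand, the partial product is the S operand and starts at zero, $C_{n}$ is 0, Z reflects the least significant bit of Q, and $\mathrm{SIO}_{0}$ of the least significant slice feeds $\mathrm{QIO}_{3}$ of the most significant slice. The function name is invented for the example.

```python
def unsigned_multiply(multiplicand, multiplier, width=16):
    """Shift-and-add multiply built from the Figure 17 'Unsigned Multiply' step."""
    mask = (1 << width) - 1
    top = width - 1
    r = multiplicand & mask      # R operand (multiplicand)
    s = 0                        # S operand (partial product, high half)
    q = multiplier & mask        # Q register (multiplier, becomes low half)
    for _ in range(width):
        z = q & 1                             # assumed: Z carries Q0
        total = s + (r if z else 0)           # F = R + S + Cn or F = S + Cn, Cn = 0
        f, carry = total & mask, total >> width
        # Log F/2 -> Y, with C_{n+4} gated into the top bit (Note 1).
        s = (f >> 1) | (carry << top)
        # Log Q/2 -> Q, with F0 (from SIO0) entering at the top of Q.
        q = (q >> 1) | ((f & 1) << top)
    return (s << width) | q      # double-length product: S is high, Q is low


print(unsigned_multiply(1234, 567))
```

After `width` steps the double-length product sits in the partial product word (high half) and Q (low half), which is why the Q register is described as supporting double-length operations.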

| $I_{4}$ | $I_{3}$ | $I_{2}$ | $I_{1}$ | Hex Code | ALU Functions |
| :---: | :---: | :---: | :---: | :---: | :---: |
| L | L | L | L | 0 | $I_{0}=L$: Special Functions <br> $I_{0}=H$: $F_{i}=$ HIGH |
| L | L | L | H | 1 | $F=S$ Minus $R$ Minus 1 Plus $C_{n}$ |
| L | L | H | L | 2 | $F=R$ Minus $S$ Minus 1 Plus $C_{n}$ |
| L | L | H | H | 3 | $F=R$ Plus $S$ Plus $C_{n}$ |
| L | H | L | L | 4 | $F=S$ Plus $C_{n}$ |
| L | H | L | H | 5 | $F=\bar{S}$ Plus $C_{n}$ |
| L | H | H | L | 6 | $F=R$ Plus $C_{n}$ |
| L | H | H | H | 7 | $F=\bar{R}$ Plus $C_{n}$ |
| H | L | L | L | 8 | $F_{i}=$ LOW |
| H | L | L | H | 9 | $F_{i}=\bar{R}_{i}$ AND $S_{i}$ |
| H | L | H | L | A | $F_{i}=R_{i}$ EXCLUSIVE NOR $S_{i}$ |
| H | L | H | H | B | $F_{i}=R_{i}$ EXCLUSIVE OR $S_{i}$ |
| H | H | L | L | C | $F_{i}=R_{i}$ AND $S_{i}$ |
| H | H | L | H | D | $F_{i}=R_{i}$ NOR $S_{i}$ |
| H | H | H | L | E | $F_{i}=R_{i}$ NAND $S_{i}$ |
| H | H | H | H | F | $F_{i}=R_{i}$ OR $S_{i}$ |

L = LOW; H = HIGH; $i$ = 0 to 3

Figure 18. ALU Functions.

A cascadable, five-bit parity generator/checker is designed into the Am2903 ALU shifter and provides ALU error detection capability. Parity for the $\mathrm{F}_{0}, \mathrm{~F}_{1}, \mathrm{~F}_{2}, \mathrm{~F}_{3}$ ALU outputs and $\mathrm{SIO}_{3}$ input is generated and, under instruction control, is made available at the $\mathrm{SIO}_{0}$ output.
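The cascading of the five-bit parity function can be sketched as below. Each slice forms the exclusive OR of its four F bits and its $\mathrm{SIO}_{3}$ input; the sketch assumes that the $\mathrm{SIO}_{0}$ output of each slice is wired to the $\mathrm{SIO}_{3}$ input of the next less significant slice, which is one natural way to cascade but is not stated explicitly in the text. Function names are illustrative.

```python
def slice_parity(f, sio3_in):
    """Parity of one slice: F3 xor F2 xor F1 xor F0 xor SIO3."""
    p = sio3_in
    for i in range(4):
        p ^= (f >> i) & 1
    return p


def word_parity(word, slices=4, sio3_in=0):
    """Chain the slice parity from the most significant slice downward."""
    p = sio3_in
    for k in reversed(range(slices)):
        p = slice_parity((word >> (4 * k)) & 0xF, p)
    return p          # appears at SIO0 of the least significant slice


print(word_parity(0x0001), word_parity(0x0003), word_parity(0xFFFE))
```

With the most significant $\mathrm{SIO}_{3}$ input held LOW, the final output is 1 when the word contains an odd number of ones.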


Figure 19.

The instruction inputs determine the ALU shifter operation. Figure 17 defines the special functions and the operation the ALU shifter performs for each. When the Am2903 executes instructions other than the nine special functions, the ALU shifter operation is determined by instruction bits $\mathrm{I}_{8} \mathrm{I}_{7} \mathrm{I}_{6} \mathrm{I}_{5}$. Figure 20 defines the ALU shifter operation as a function of these four bits.

## Q Register

The Q Register is an auxiliary four-bit register which is clocked on the LOW-to-HIGH transition of the CP input. It is intended primarily for use in multiplication and division operations; however, it can also be used as an accumulator or holding register for some applications. The ALU output, F, can be loaded into the Q Register, and/or the Q Register can be selected as the source for the ALU S operand. The shifter at the input to the Q Register provides the capability to shift the Q Register contents up one bit position (2Q) or down one bit position (Q/2). Only logical shifts are performed. $\mathrm{QIO}_{0}$ and $\mathrm{QIO}_{3}$ are bidirectional serial shift inputs/outputs. During a Q Register shift-up operation, $\mathrm{QIO}_{0}$ is a serial shift input and $\mathrm{QIO}_{3}$ is a serial shift output. During a shift-down operation, $\mathrm{QIO}_{3}$ is a serial shift input and $\mathrm{QIO}_{0}$ is a serial shift output.

| $I_{8}$ | $I_{7}$ | $I_{6}$ | $I_{5}$ | Hex Code | ALU Shifter Function | $\mathrm{SIO}_{3}$ (Most Sig. Slice) | $\mathrm{SIO}_{3}$ (Other Slices) | $Y_{3}$ (Most Sig. Slice) | $Y_{3}$ (Other Slices) | $Y_{2}$ (Most Sig. Slice) | $Y_{2}$ (Other Slices) | $Y_{1}$ | $Y_{0}$ | $\mathrm{SIO}_{0}$ | $\overline{\text{WRITE}}$ | Q Reg \& Shifter Function | $\mathrm{QIO}_{3}$ | $\mathrm{QIO}_{0}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| L | L | L | L | 0 | Arith $F/2 \rightarrow Y$ | Input | Input | $F_{3}$ | $\mathrm{SIO}_{3}$ | $\mathrm{SIO}_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Hold | Hi-Z | Hi-Z |
| L | L | L | H | 1 | Log $F/2 \rightarrow Y$ | Input | Input | $\mathrm{SIO}_{3}$ | $\mathrm{SIO}_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Hold | Hi-Z | Hi-Z |
| L | L | H | L | 2 | Arith $F/2 \rightarrow Y$ | Input | Input | $F_{3}$ | $\mathrm{SIO}_{3}$ | $\mathrm{SIO}_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Log $Q/2 \rightarrow Q$ | Input | $Q_{0}$ |
| L | L | H | H | 3 | Log $F/2 \rightarrow Y$ | Input | Input | $\mathrm{SIO}_{3}$ | $\mathrm{SIO}_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Log $Q/2 \rightarrow Q$ | Input | $Q_{0}$ |
| L | H | L | L | 4 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | L | Hold | Hi-Z | Hi-Z |
| L | H | L | H | 5 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | H | Log $Q/2 \rightarrow Q$ | Input | $Q_{0}$ |
| L | H | H | L | 6 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | H | $F \rightarrow Q$ | Hi-Z | Hi-Z |
| L | H | H | H | 7 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | L | $F \rightarrow Q$ | Hi-Z | Hi-Z |
| H | L | L | L | 8 | Arith $2F \rightarrow Y$ | $F_{2}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $\mathrm{SIO}_{0}$ | Input | L | Hold | Hi-Z | Hi-Z |
| H | L | L | H | 9 | Log $2F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $\mathrm{SIO}_{0}$ | Input | L | Hold | Hi-Z | Hi-Z |
| H | L | H | L | A | Arith $2F \rightarrow Y$ | $F_{2}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $\mathrm{SIO}_{0}$ | Input | L | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input |
| H | L | H | H | B | Log $2F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $\mathrm{SIO}_{0}$ | Input | L | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input |
| H | H | L | L | C | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Hi-Z | H | Hold | Hi-Z | Hi-Z |
| H | H | L | H | D | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Hi-Z | H | Log $2Q \rightarrow Q$ | $Q_{3}$ | Input |
| H | H | H | L | E | $\mathrm{SIO}_{0} \rightarrow Y_{0}, Y_{1}, Y_{2}, Y_{3}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{0}$ | Input | L | Hold | Hi-Z | Hi-Z |
| H | H | H | H | F | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Hi-Z | L | Hold | Hi-Z | Hi-Z |

Parity = $F_{3} \forall F_{2} \forall F_{1} \forall F_{0} \forall \mathrm{SIO}_{3}$; $\forall$ = Exclusive OR; L = LOW; H = HIGH; Hi-Z = High Impedance

Figure 20a. ALU Destination Control for $I_{0}$ or $I_{1}$ or $I_{2}$ or $I_{3}$ or $I_{4}=$ HIGH, $\overline{\mathrm{IEN}}=$ LOW.

| OPERATION |  | ALU SHIFTER | RAM WRITE | Q |
| :---: | :---: | :---: | :---: | :---: |
| SINGLE LENGTH SHIFT |  | UP <br> DOWN <br> ARITH UP <br> ARITH DOWN | YES | NC |
| DOUBLE LENGTH SHIFT |  | UP <br> DOWN <br> ARITH UP <br> ARITH DOWN | YES | UP <br> DOWN <br> UP <br> DOWN |
| Q-SHIFT |  | PASS | NO | UP <br> DOWN |
| LOAD | RAM <br> RAM \& Q <br> Q <br> NONE | PASS | YES <br> YES <br> NO <br> NO | NC <br> LOAD <br> LOAD <br> NC |
| SIGN EXTEND |  | $\mathrm{SIO}_{0}$ | YES | NC |

NC = No Change

Figure 20b. Am2903 ALU Destination Control Summary.

Double-length arithmetic and logical shifting capability is provided by the Am2903. The double-length shift is performed by connecting $\mathrm{QIO}_{3}$ of the most significant slice to $\mathrm{SIO}_{0}$ of the least significant slice, and executing an instruction which shifts both the ALU output and the Q Register.
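A word-level sketch of the double-length logical shift may help here. It treats the ALU output F as the upper half and Q as the lower half of a double-length quantity, with the bit leaving one half entering the other through the $\mathrm{QIO}_{3}$ to $\mathrm{SIO}_{0}$ connection described above. The function names and the 16-bit default width are assumptions for the example.

```python
def double_shift_down(f, q, sio3_in=0, width=16):
    """Shift the pair (F, Q) down one bit; return (new F, new Q, bit out of Q0)."""
    top = width - 1
    new_f = (f >> 1) | (sio3_in << top)      # SIO3 of the MSS is the serial input
    new_q = (q >> 1) | ((f & 1) << top)      # F0 leaves at SIO0 and enters at QIO3
    return new_f, new_q, q & 1


def double_shift_up(f, q, qio0_in=0, width=16):
    """Shift the pair (F, Q) up one bit; return (new F, new Q, bit out of F top)."""
    mask = (1 << width) - 1
    top = width - 1
    q_top = (q >> top) & 1                   # leaves at QIO3 of the MSS
    new_f = ((f << 1) & mask) | q_top        # enters at SIO0 of the LSS
    new_q = ((q << 1) & mask) | qio0_in
    return new_f, new_q, (f >> top) & 1


print([hex(v) for v in double_shift_down(0x0003, 0x0000, sio3_in=1)])
print([hex(v) for v in double_shift_up(0x4000, 0x8001)])
```

The same single wire serves both directions because the two pins involved are bidirectional: the instruction determines which one drives and which one receives.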

The Q Register and shifter operation is controlled by instruction bits $\mathrm{I}_{8} \mathrm{I}_{7} \mathrm{I}_{6} \mathrm{I}_{5}$. Figures 17 and 20 define the Q Register and shifter operation as a function of these four bits.

## Output Buffers

The DB and Y ports are bidirectional I/O ports driven by three-state output buffers with external output enable controls. The Y output buffers are enabled when the $\overline{\mathrm{OE}_{\mathrm{Y}}}$ input is LOW and are in the high-impedance state when $\overline{\mathrm{OE}_{\mathrm{Y}}}$ is HIGH. Likewise, the DB output buffers are enabled when the $\overline{\mathrm{OE}_{\mathrm{B}}}$ input is LOW and in the high-impedance state when $\overline{\mathrm{OE}_{\mathrm{B}}}$ is HIGH.
The zero, Z, pin is an open-collector input/output that can be wire-OR'ed between slices. As an output it can be used as a zero detect status flag and generally indicates that the $Y_{0-3}$ pins are all LOW, whether they are driven from the Y output buffers or from an external source connected to the $Y_{0-3}$ pins. To some extent the meaning of this signal varies with the instruction being performed.

## Instruction Decoder

The Instruction Decoder generates required internal control signals as a function of the nine Instruction inputs, $\mathrm{I}_{0-8}$; the Instruction Enable input, $\overline{\mathrm{IEN}}$; the $\overline{\mathrm{LSS}}$ input; and the $\overline{\text{WRITE}}/\overline{\mathrm{MSS}}$ input/output. The $\overline{\text{WRITE}}$ output is LOW when an instruction which writes data into the RAM is being executed.

When $\overline{\mathrm{IEN}}$ is LOW, the $\overline{\text{WRITE}}$ output is enabled and the Q Register and Sign Compare Flip-Flop can be written according to the Am2903 instruction. The Sign Compare Flip-Flop is an on-chip flip-flop which is used during an Am2903 divide operation.

## Programming the Am2903 Slice Position

Tying the $\overline{\mathrm{LSS}}$ input LOW programs the slice to operate as a least significant slice (LSS) and enables the $\overline{\text{WRITE}}$ output signal onto the $\overline{\text{WRITE}}/\overline{\mathrm{MSS}}$ bidirectional I/O pin. When $\overline{\mathrm{LSS}}$ is tied HIGH, the $\overline{\text{WRITE}}/\overline{\mathrm{MSS}}$ pin becomes an input pin; tying the $\overline{\text{WRITE}}/\overline{\mathrm{MSS}}$ pin HIGH programs the slice to operate as an intermediate slice (IS) and tying it LOW programs the slice to operate as a most significant slice (MSS). This is shown in Figure 21.
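The pin strapping rules above amount to a small truth table, sketched below. The function name and argument names are invented for the example; pin levels are given as 0 for LOW and 1 for HIGH.

```python
def slice_position(lss_n, write_mss_n=None):
    """Return the programmed slice position from the two pin levels.

    lss_n       -- level on the LSS-bar input (0 = LOW, 1 = HIGH)
    write_mss_n -- level tied to the WRITE-bar/MSS-bar pin; only an
                   input when lss_n is HIGH, so it is ignored otherwise
    """
    if lss_n == 0:
        return "LSS"                 # the pin is then the WRITE-bar output
    if write_mss_n is None:
        raise ValueError("WRITE/MSS level is required when LSS is HIGH")
    return "IS" if write_mss_n == 1 else "MSS"


# Strapping for a four-slice, 16-bit array, least significant slice first.
straps = [(0, None), (1, 1), (1, 1), (1, 0)]
print([slice_position(lss, wm) for lss, wm in straps])
```

For the 16-bit arrangement of Figure 21 this gives one LSS, two IS and one MSS device.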


Figure 21. Am2903 - 16-Bit CPU with Carry Look Ahead.

## EXPANDING THE NUMBER OF Am2903 REGISTERS

The Am2903 contains 16 internal working registers configured in a standard two-port architecture. The number of working registers in the ALU configuration can be increased by utilizing the Am29705 16-word by 4-bit two-port RAM. Any number of Am29705's can be connected to the Am2903 to increase the number of working registers. Figure 22 shows a block diagram of the basic Am29705. As is seen, the device consists of a 16-word by 4-bit two-port RAM with latches at the A and B outputs similar to the RAM contained within the Am2903. Each of the latch outputs has three-state drivers capable of driving the DA and DB inputs of the Am2903. The Am29705 is a non-inverting device. That is, data presented at the inputs is stored in the RAM and when brought to the RAM outputs, it is non-inverted from when it was originally brought into the device.

The technique for using the Am29705 to expand the number of registers in the Am2903 can best be visualized by referring to Figures 23 and 24 simultaneously. In Figure 23, the data bus connections are shown such that the Am2903 Y output is used to drive the Am29705 inputs. Here, we also assume this bus may be tied to a data bus through a bi-directional buffer. In Figure 23, the A outputs of the Am29705 are connected together and also connected to the DA input of the Am2903. Likewise, the B outputs from the Am29705 are also shown connected to the DB inputs of the Am2903. In all cases, we are assuming 16-bit data busses. Thus, four Am2903's are assumed and eight Am29705's are assumed. As shown in Figure 23, one of the write enable inputs to the Am29705 is tied to the latch enable input of the Am29705 and these pins are also tied to the clock input of the Am2903. This allows the latches in the Am29705 to perform identically to those in the Am2903.


Figure 22. Am29705 Block Diagram.

If we refer to Figure 24, we see the connections required to set up the addressing for additional registers associated with the Am2903. Here, three two-line to four-line decoders are used to properly control the A address, B address and write enable signals to the devices. As shown in Figure 24, the four A address lines are all tied in parallel between the Am2903 and the Am29705's. The two-line to four-line decoder is used to enable the appropriate output enable from the Am29705's or switch the EA MUX inside the Am2903 such that the proper register is selected. The B address operates in a similar fashion in that the four B address lines are also all tied together. Likewise, a two-line to four-line decoder is used to properly select the output enable of either the Am29705's or the Am2903 such that the correct source operand register is selected. In addition, a two-line to four-line decoder is used to control the write enable signal such that only one register is written into as a destination. This is controlled by properly selecting the write enable of either the Am2903 or the Am29705 as determined by the two most significant bits of the B address.

Figure 23. Am2903 - Data Bus Cascading.

If this technique is used properly, any number of Am29705's can be used in conjunction with the Am2903. It may be necessary to use either a three-line to eight-line decoder or perhaps even a larger circuit to decode the more significant bits of the $A$ and $B$ addresses. Likewise, the write enable signal must be controlled so that the correct destination register will be written.
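
The address split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the register address is taken to be 6 bits wide (two decoder inputs plus the four lines tied in parallel), and which decoder output is wired to the Am2903 rather than to an Am29705 bank is a wiring choice the text does not fix.

```python
def decode_register_address(address):
    """Split a 6-bit register address the way a two-line to four-line decoder would.

    Returns (enables, local): a one-hot list of the four decoder outputs and
    the 4-bit address that is tied in parallel to every device.
    Assumption: output 0 would enable the Am2903 and the others Am29705 banks.
    """
    if not 0 <= address < 64:
        raise ValueError("address must fit in 6 bits")
    select = (address >> 4) & 0b11                      # two most significant bits
    local = address & 0xF                               # four least significant bits
    enables = [int(i == select) for i in range(4)]      # exactly one output active
    return enables, local


if __name__ == "__main__":
    print(decode_register_address(0x25))
```

The same split would be applied three times in hardware, once each for the A address, the B address and the write enable, so that only one device sources each operand and only one register is written.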

## UNDERSTANDING BIT SLICE TIMING

Perhaps one of the most important aspects of designing with either the Am2901A or the Am2903 is understanding the calculations required to compute the worst case AC performance. In order to perform these calculations, we have selected a number of standard Schottky devices and assigned minimum, typical and maximum speeds at $25^{\circ} \mathrm{C}$ and 5 V for use in these calculations as shown in Figure 25. Certainly the design engineer should use the exact specifications of the devices he has selected for his design in order to perform the worst case calculations. What is intended here is an understanding of the technique to perform these calculations and some method to allow a comparison of the Am2901A and Am2903 in terms of their AC performance. Since at the time of this writing the Am2903 is still being characterized, only the typical AC data is currently available. Thus, all calculations will be made using the typical AC times such that we can compare the Am2901A with the Am2903. When final characterization data on the Am2903 is available, the designer can then compute his performance by selecting the appropriate temperature range and power supply variations as required by his design.
Figure 26 shows the typical AC calculations for the functions usually considered in an Am2901A design. These functions are usually the speed for a logic operation, arithmetic operation, logic operation with shift and arithmetic operation with shift. In each case, we are computing speeds from the LOW-to-HIGH transition of a clock through an entire microcycle to the next LOW-to-HIGH transition of a clock.


Figure 24. Am2903 - RAM Address Cascading.

| DEVICE & PATH | MIN. (ns) | TYP. (ns) | MAX. (ns) |
| :--- | :---: | :---: | :---: |
| S Register: Clock to Output |  | 9 | 15 |
| S Register: $\overline{\mathrm{OE}}$ to Output |  | 13 | 20 |
| S Register: Set-Up | 5 | 2 |  |
| S MUX: Data to Output |  | 5 | 8 |
| S MUX: Select to Output |  | 12 | 18 |
| S MUX: $\overline{\mathrm{OE}}$ to Output |  | 13 | 20 |
| Microprogram PROM: Address to Output |  | 30 | 50 |
| Microprogram PROM: $\overline{\mathrm{OE}}$ to Output |  | 18 | 25 |
| Mapping PROM: Address to Output |  | 25 | 45 |
| Mapping PROM: $\overline{\mathrm{OE}}$ to Output |  | 18 | 25 |
| Decoder: Select to Output |  | 8 | 12 |
| Counter: Clock to Q |  | 9 | 13 |
| Counter: Clock to TC |  | 12 | 18 |
| Counter: CET to TC |  | 8 | 12 |
| Counter: Data Set-Up | 8 | 4 |  |
| Counter: Load Set-Up | 16 | 10 |  |
| Counter: CEP or CET Set-Up | 12 | 7 |  |
| S-EXOR: IN to OUT |  | 7 | 11 |
| Am2922: Clock to Output |  | 21 | 32 |
| Am2922: Data to Output |  | 13 | 19 |
| Am2922: $\overline{\mathrm{OE}}$ to Output |  | 10 | 17 |
| Am2922: Data Set-Up | 10 | 5 |  |
| Am29811A: Input to Output |  | 25 | 35 |
| Am29803A: Input to Output |  | 25 | 35 |
| Am2902A: $C_{n}$ to $C_{n+x,y,z}$ |  | 7 | 11 |
| Am2902A: G, P to G, P |  | 7 | 10 |
| Am2902A: G, P to $C_{n+x,y,z}$ |  | 5 | 7 |

Figure 25. Standard Device Schottky Speeds.

Similarly, Figure 27 shows the same type of computations for an Am2903 system. There is one very important distinction to be made when computing the timing of an Am2903 16-bit ALU as compared with an Am2901A ALU: in the Am2903, the shifter is at the output of the ALU and is followed by the zero detector. Thus, in an Am2903 design, the flags are no longer independent of the shift operation. This is easily seen in Figure 27.

By way of comparison, Figure 28 shows speeds for the four types of operations for the Am2901A 16-bit system as compared with the Am2903 16-bit system.

a) LOGIC OPERATION SPEED COMPUTATIONS

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| S-REG | CP to Q | 9 | 9 | 9 |
| 2901A | READ-MODIFY-WRITE | 55 | - | - |
| 2901A | AB to Y | - | 45 | - |
| 2901A | AB to Zero | - | - | 65 |
| S-REG | SET-UP D | - | 2 | 2 |
| TOTAL-ns |  | 64 | 56 | 76 |

b) ARITHMETIC OPERATION SPEED COMPUTATIONS

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| S-REG | CP to Q | 9 | 9 | 9 |
| 2901A | AB to G, P | 40 | 40 | 40 |
| 2902A | G, P to $C_{n+x,y,z}$ | 5 | 5 | 5 |
| 2901A | SET-UP $C_{n}$ | 40 | - | - |
| 2901A | $C_{n}$ to Y | - | 20 | - |
| 2901A | $C_{n}$ to Zero | - | - | 35 |
| S-REG | SET-UP D | - | 2 | 2 |
| TOTAL-ns |  | 94 | 76 | 91 |


Figure 26. Typical AC Calculations for the Am2901A.
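
The bookkeeping behind parts a) and b) of Figure 26 can be expressed as a short Python sketch: each path is a chain of delays, and the microcycle must be at least as long as the slowest path. The delay values are the typical figures quoted in the tables; the dictionary layout and function names are illustrative choices, not part of the original design notes.

```python
# Typical delays in ns, copied from Figure 26 a) and b).
LOGIC_PATHS = {
    "PATH 1": [("S-REG CP to Q", 9), ("2901A read-modify-write", 55)],
    "PATH 2": [("S-REG CP to Q", 9), ("2901A AB to Y", 45), ("S-REG set-up D", 2)],
    "PATH 3": [("S-REG CP to Q", 9), ("2901A AB to Zero", 65), ("S-REG set-up D", 2)],
}

ARITHMETIC_PATHS = {
    "PATH 1": [("S-REG CP to Q", 9), ("2901A AB to G, P", 40),
               ("2902A G, P to Cn+x,y,z", 5), ("2901A set-up Cn", 40)],
    "PATH 2": [("S-REG CP to Q", 9), ("2901A AB to G, P", 40),
               ("2902A G, P to Cn+x,y,z", 5), ("2901A Cn to Y", 20),
               ("S-REG set-up D", 2)],
    "PATH 3": [("S-REG CP to Q", 9), ("2901A AB to G, P", 40),
               ("2902A G, P to Cn+x,y,z", 5), ("2901A Cn to Zero", 35),
               ("S-REG set-up D", 2)],
}


def path_totals(paths):
    """Sum the stage delays of every path (clock edge to next clock edge)."""
    return {name: sum(delay for _, delay in stages) for name, stages in paths.items()}


def microcycle(paths):
    """The minimum microcycle is set by the slowest path."""
    return max(path_totals(paths).values())


if __name__ == "__main__":
    print(path_totals(LOGIC_PATHS), microcycle(LOGIC_PATHS))
    print(path_totals(ARITHMETIC_PATHS), microcycle(ARITHMETIC_PATHS))
```

For a worst-case design the same procedure applies, with the typical values replaced by the guaranteed maximum delays and minimum set-up times of the devices actually selected.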
c) LOGIC OPERATION WITH SHIFT SPEED COMPUTATIONS

| DEVICE NO. | DEVICE PATH | PATH 1 | PATH 2 | PATH 3 |
| :--- | :--- | :---: | :---: | :---: |
| S-REG | CP to Q | 9 | 9 | 9 |
| 2901A | AB to RAM$_{0,3}$ | 60 | - | - |
| S-MUX | D to Y | 5 | - | - |
| 2901A | SET-UP RAM$_{0,3}$ | 15 | - | - |
| 2901A | AB to Y | - | 45 | - |
| 2901A | AB to Zero | - | - | 65 |
| S-REG | SET-UP D | - | 2 | 2 |
| TOTAL-ns |  | 89 | 56 | 76 |

Figure 26. (Cont.)

Figure 27. Typical AC Calculations for the Am2903.

Figure 27. (Cont.)

| Functional <br> Operation | Am2901A | Am2903 |
| :--- | :---: | :---: |
| Logic | 76 | 83 |
| Arithmetic | 94 | 113 |
| Logic with Shift | 89 | 109 |
| Two's Complement <br> Arithmetic with <br> Shift Down | 101 | 151 |
| Magnitude Only <br> Arithmetic with <br> Shift Down | 127 |  |

Figure 28. Summary of Am2901A and Am2903 AC Performance in a 16-Bit Configuration.

## USING THE Am2903 IN A 16-BIT DESIGN

Perhaps the best technique for understanding the design of the 16-bit ALU is to simply take an example. Figure 29 shows a block diagram overview of four Am2903's with the appropriate shift matrix control, status register, MAR and the usual interface to a CCU and main memory. This block diagram represents the normal data handling path associated with a simple 16-bit minicomputer. If we expand this block diagram to show what would normally be considered to be the complete 16-bit central processing unit, the block diagram of Figure 30 results. Here, we see the Am2903's surrounded by a typical set of MSI support chips. In addition, the block diagram shows a typical computer control unit as described in Chapter 2 of this series. Thus, all of the blocks are now in place to show a simple 16-bit microcomputer built using the Am2900 family devices. The full design for such a machine is shown in Figure 31.

Figures 31A, 31B and 31C detail the connection of each IC used in this design. Quite simply, the design can be described as follows. Figure 31A represents the microprogram sequencer portion of the design. U1, U2 and U3 form the instruction register that receives a 16-bit instruction from main memory. U4, U5 and U6 are the mapping PROMs used to decode the OP code portion of the instruction to arrive at a starting address for the microprogram sequencer. The microprogram sequencer is the Am2910 and is shown as U7. The branch address pipeline register is U8, U9 and U10 and can be enabled to the D inputs of the Am2910 sequencer to provide the jump address from microcode. The pipeline register for the instruction inputs to the Am2910 is U14. This machine also has the ability to select the $A$ and $B$ addresses for the Am2903 devices from the microprogram as well as from the instruction register; U11 and U12 provide this capability as a part of the pipeline register. U13 is a two-line to four-line decoder used as part of the control for the $A$ and $B$ address select for the Am2903's. U15 is part of the pipeline register and provides both true and complement outputs for bit 11. U16 and U17 represent a one-of-sixteen decoder whose output can be applied to the DA bus to allow the implementation of all the bit operations. These include bit set, bit clear, bit toggle and bit test. U18 and U19 are PROM's that provide the ability to enter one of thirty-two preprogrammed constants onto the DA bus.
Figure 31B is predominantly the data handling portion of the design. Here, U20 and U21 represent a data register that receives data from the data bus. U26, U27, U28 and U29 are the four Am2903's that form a 16-bit register/ALU combination. U30 is the carry look ahead generator for the ALU section. U22, U23 and U24 represent the status register with the ability to save and restore the flags in main memory. U25 is the condition code multiplexer for the microprogram sequencer. U33, U34, U35 and U36 represent the shift linkage multiplexers that tie together the internal shifters within the Am2903's. U37 is part of the pipeline register and provides both true and complement outputs of a number of the microprogram bits. U38 is part of the carry in logic control such that double length arithmetic operations can be performed. U31 and U32 are the data out register that can be used to accept data from the Am2903's and enable this data onto the data bus. U39 and U40 represent the memory address register and are used to hold the address provided from the CPU to main memory.

Figure 29. Am2903 with Shift Mux and Status Register.

Figure 30.

The microprogram store is shown in Figure 31C. Here, we have used both the $512 \times 8$ registered PROM's and $512 \times 4$ nonregistered PROM's in this design. A total of 68 microprogram bits have been depicted in this design. These are shown so that maximum flexibility is achieved. In most typical designs some 10 to 20 of these bits would not be used. Figure 31C shows four 512-word by 8-bit registered PROM's (U41, U42, U43 and U44). It also shows nine 512-word by 4-bit PROM's represented as U45 through U53.

Perhaps the best way to review the design is to simply understand the function of each of the microprogram control bits. If the purpose of each of these bits is well understood, the design engineer will be well along in understanding the design of the simple minicomputer CPU presented here.

## The Microprogram Structure

The microprogram for the design shown in Figure 31 is 68 bits wide. The functions of the microprogram control bits are as follows:

Bits PL0 through PL8 The 9 instruction bits of the Am2903 superslices.

Bits PL9, PL10, PL11 The $\overline{\mathrm{IEN}}$, $\overline{\mathrm{EA}}$, $\overline{\mathrm{OEB}}$ control inputs of the Am2903 superslices, respectively. PL11 is also connected to the data-in registers (U20 and U21) output-enable. This connection assures that there will be no conflict on the DB pins.

Bits PL12 through PL14 ($\mu 12$ through $\mu 14$) Select the source for SIO of the Am2903, both for shift-up and for shift-down operations. The following table summarizes the functions of these bits:

| Bit 14 | Bit 13 | Bit 12 | $\mathrm{SIO}_{n}$ (Shift-down) | $\mathrm{SIO}_{0}$ (Shift-up) |
| :---: | :---: | :---: | :---: | :---: |
| L | L | L | 0 | 0 |
| L | L | H | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{n}$ |
| L | H | L | $\mathrm{QIO}_{0}$ | $\mathrm{QIO}_{n}$ |
| L | H | H | Carry | Carry |
| H | L | L | Zero | Zero |
| H | L | H | Sign | Sign |
| H | H | L | Not allocated | Not allocated |
| H | H | H | 1 | 1 |

Bits PL15 through PL17 ( $\mu 15$ through $\mu 17)$

Select the source for QIO of the Am2903, both for shift-up and shift-down operations. The following table summarizes the functions of these bits

| Bit 17 | Bit 16 | Bit 15 | $\mathrm{QIO}_{n}$ (Shift-down) | $\mathrm{QIO}_{0}$ (Shift-up) |
| :---: | :---: | :---: | :---: | :---: |
| L | L | L | 0 | 0 |
| L | L | H | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{n}$ |
| L | H | L | $\mathrm{QIO}_{0}$ | $\mathrm{QIO}_{n}$ |
| L | H | H | Carry | Carry |
| H | L | L | Zero | Zero |
| H | L | H | Sign | Sign |
| H | H | L | Not allocated | Not allocated |
| H | H | H | 1 | 1 |
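
Both select tables have the same shape, so they can be captured by one lookup. The Python sketch below is an illustration only: the function and constant names are hypothetical, the 3-bit code is assumed to be read with the highest-numbered microprogram bit as the most significant bit (bits 14-12 for SIO, bits 17-15 for QIO), and the returned strings simply name the signal that would be routed to the serial input.

```python
# Index = 3-bit select code; None marks the code the tables list as not allocated.
SHIFT_SOURCES = ("0", "SIO", "QIO", "Carry", "Zero", "Sign", None, "1")


def shift_source(code, shift_down):
    """Name the signal fed to the serial shift input for a given select code.

    For a shift-down the input is the most significant end (SIOn or QIOn),
    so a rotate takes its bit from the least significant end, and vice versa.
    """
    if not 0 <= code <= 7:
        raise ValueError("select code must be a 3-bit value")
    source = SHIFT_SOURCES[code]
    if source is None:
        raise ValueError("select code 6 is not allocated")
    if source in ("SIO", "QIO"):
        return source + ("0" if shift_down else "n")
    return source


def select_codes(microword):
    """Extract the SIO (bits 12-14) and QIO (bits 15-17) select codes."""
    return (microword >> 12) & 0b111, (microword >> 15) & 0b111


if __name__ == "__main__":
    sio_code, qio_code = select_codes((0b011 << 12) | (0b001 << 15))
    print(shift_source(sio_code, True), shift_source(qio_code, True))
```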

Bit PL18 When LOW, enables the MAR clock input, i.e., the data appearing on the Y output pins of the Am2903 Superslices™ will be clocked into the MAR at the LOW-to-HIGH transition of the clock pulse.

Bit PL19 When LOW, enables the MAR output onto the Memory Address Bus.

Bit PL20 When LOW, enables the data output register clock, i.e., the data appearing on the $Y$ output pins of the Am2903 Superslices™ will be clocked into the data output registers (U31 and U32) at the LOW-to-HIGH transition of the clock pulse.

Bit PL21 When LOW, enables the data output registers onto the Data Bus.

Bit PL22 When LOW, enables the data-in register clock, i.e., the data appearing on the Data Bus will be clocked into the data-in registers at the LOW-to-HIGH transition of the clock pulse.

Bit PL23 Is the CI input of the Am2910 sequencer.

Bits PL24 through PL27 This is a 4-bit wide field, which can be selected as the A-address or the B-address of the Am2903 superslice (see bits PL32 through PL34).

Bits PL28 through PL31 This is a 4-bit wide field, which can be used either for the A-address of the Am2903 superslice or to designate one of sixteen bits to the DA inputs of the Am2903 superslice via the Am2921's (U16 and U17).

Bits PL32 and PL33 Select the source for the Am2903 A-address, according to the table below:

| Bit 33 | Bit 32 | A-Address Source |
| :---: | :---: | :--- |
| L | L | Data Bus bits 0 through 3 |
| L | H | Microprogram bits 28 through 31 |
| H | L | Data Bus bits 4 through 7 |
| H | H | Microprogram bits 24 through 27 |

Bit PL34 Selects the source of the Am2903 B-address, according to the table below:

| Bit <br> 34 | B-Address Source |
| :---: | :---: |
| L | Data Bus bits 4 through 7 |
| H | Microprogram bits 24 through 27 |
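
The two address-select tables amount to small multiplexers, which the following Python sketch models. The function names and the representation of the data bus and microword as plain integers are assumptions made for illustration; the bit positions are those given in the tables above.

```python
def field(value, low, width):
    """Return `width` bits of `value`, starting at bit position `low`."""
    return (value >> low) & ((1 << width) - 1)


def a_address(select, data_bus, microword):
    """A-address mux; `select` is the 2-bit value formed by PL33 (MSB) and PL32 (LSB)."""
    if select == 0:
        return field(data_bus, 0, 4)       # Data Bus bits 0 through 3
    if select == 1:
        return field(microword, 28, 4)     # Microprogram bits 28 through 31
    if select == 2:
        return field(data_bus, 4, 4)       # Data Bus bits 4 through 7
    if select == 3:
        return field(microword, 24, 4)     # Microprogram bits 24 through 27
    raise ValueError("select must be a 2-bit value")


def b_address(pl34, data_bus, microword):
    """B-address mux controlled by PL34."""
    return field(microword, 24, 4) if pl34 else field(data_bus, 4, 4)


if __name__ == "__main__":
    microword = (0xF << 24) | (0x3 << 28)
    data_bus = 0x12A5
    print([a_address(s, data_bus, microword) for s in range(4)])
    print(b_address(0, data_bus, microword), b_address(1, data_bus, microword))
```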



Figure 31a.



Figure 31b.



Figure 31c.


Bit PL35 Is the $C_{n}$ input of the least significant Am2903 via an Am74S157 mux (U38).

Bits PL36 and PL37 Affect the status register input signals.

Bit PL51 Is the condition code polarity control. When HIGH, the condition code selected will pass noninverted. When LOW, the selected condition code will be complemented.

Bits PL52 through PL55 Are the I inputs of the Am2910 sequencer.

Bits PL56 through PL67 This is a 12-bit wide field and it usually serves as the next microprogram address. However, the 5 least significant bits of this field (bits 56-60) also serve as an address field of the Am29771 "constant" PROM's (U18 and U19).

## Some Sample Microroutines

Figure 32 shows the microprogram code for a few sample microroutines. Different addressing schemes are demonstrated with the "ADD" operation. All the other arithmetic or logic operations can be easily programmed by substituting the $I_{1}-I_{4}$ field of the Am2903 with the appropriate function. Since the main memory address is generated by the Am2903 superslices, the internal register No. 15 serves as the program counter.
The following is a description of some sample microroutines. The reader should refer to the description of the microprogram bits given earlier in this chapter and to the data sheets of the Am2910 sequencer and of the Am2903 superslice.

## Microword INIT.

This microword should be at address 0 and when the machine is reset, the Am2910 will start executing from here. The purpose of this location is to reset the machine program counter (Register 15) to zero. Ultimately more microinstructions can be added, should the necessity of other reset functions arise.

Bits 1-4 (Am2903 $I_{1}-I_{4}$) being $8_{H}$ will cause the superslices to generate all zeroes at the internal F points. Bits 5-8 (Am2903 $I_{5}-I_{8}$) being $F_{H}$ will cause this data (all zeroes) to appear on the $Y$ outputs. Bit 9 ($\overline{\mathrm{IEN}}$) is LOW and therefore WRITE will be LOW and this data will be written into the internal register selected by the B-address inputs. Bit 34 is HIGH; therefore, microprogram bits 24-27 will be selected as the $B$ address source. Since $F_{H}$ is in these bits, all zeroes will be written into the program counter (Register 15). Bit 18 is LOW; therefore, the data at the Y outputs (all zeroes) will be latched into the MAR at the next clock pulse. Bits 36 and 37 are set such that the flags will be updated, namely $CY=N=OVF=0, Z=1$.
Bits 42 and 43 are both LOW, so no memory reference signal is sent to the main memory (the MAR is still in an undetermined state). Bits 52-55 (Am2910 I) are set to $\mathrm{E}_{\mathrm{H}}$, which will force the sequencer to continue to the next sequential address (1), as CI (bit 23) is HIGH.
Bits 21 and 39 are both HIGH to ensure that there is no conflict on the data bus, though in this case one of them could be a DON'T-CARE. Bit 38 could also be a DON'T-CARE, as the carry is zeroed by the ALU. Making bit 46 HIGH enables executing this microstep without disturbing the Am2910 sequencer's internal register, which at power-up has no significance but may be important should a software restart be issued.
All the other bits are DON'T-CAREs.
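
As a cross-check of the description above, the INIT microword can be assembled field by field. The sketch below is illustrative only: the field names are hypothetical labels, only the bits discussed in the text for INIT are listed, and DON'T-CARE bits are simply left at 0, which is an arbitrary choice.

```python
# (lowest bit, width) of each field named in the INIT description; names are illustrative.
FIELDS = {
    "I1_4": (1, 4),       # Am2903 I1-I4
    "I5_8": (5, 4),       # Am2903 I5-I8
    "IEN_N": (9, 1),      # Am2903 IEN (active LOW)
    "MAR_CLK_N": (18, 1), # MAR clock enable (active LOW)
    "PL21": (21, 1),      # data output register enable (active LOW)
    "CI": (23, 1),        # Am2910 CI
    "REG": (24, 4),       # register number field, bits 24-27
    "B_SEL": (34, 1),     # B-address source select
    "FLAGS": (36, 2),     # status register control
    "PL39": (39, 1),
    "PL42": (42, 1),      # memory read request
    "PL43": (43, 1),
    "PL46": (46, 1),
    "SEQ_I": (52, 4),     # Am2910 I inputs
}

INIT = {
    "I1_4": 0x8, "I5_8": 0xF, "IEN_N": 0, "MAR_CLK_N": 0, "PL21": 1, "CI": 1,
    "REG": 0xF, "B_SEL": 1, "FLAGS": 2, "PL39": 1, "PL42": 0, "PL43": 0,
    "PL46": 1, "SEQ_I": 0xE,
}


def assemble(values):
    """Pack named field values into one integer microword."""
    word = 0
    for name, value in values.items():
        low, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << low
    return word


def extract(word, name):
    """Read one named field back out of a microword."""
    low, width = FIELDS[name]
    return (word >> low) & ((1 << width) - 1)


if __name__ == "__main__":
    init_word = assemble(INIT)
    print(f"{init_word:017X}", extract(init_word, "SEQ_I"))
```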

## Microword FETCH

|  | PL | 2910 I | CCP | $\overline{\mathrm{CC}}$ | CCEN | $\overline{\mathrm{RLD}}$ | DA CONS | BIT | MMW | MMR | $\overline{\mathrm{IRE}}$ | POL | FDOE | CY = 0 | Flags |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Number of Bits | 12 | 4 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 |
| Bit No. | 56-67 | 52-55 | 51 | 48-50 | 47 | 46 | 45 | 44 | 43 | 42 | 41 | 40 | 39 | 38 | 36-37 |
| INIT | X | E | X | X | X | 1 | X | X | 0 | 0 | X | X | 1 | 0 | 2 |
| FETCH | X | E | X | X | X | 1 | X | X | 0 | 1 | 0 | X | 1 | 0 | 0 |
| FETCH + 1 | X | 2 | X | X | X | 1 | X | X | 0 | 0 | 1 | X | 1 | 0 | 0 |
| ADD | FETCH + 1 | 7 | X | X | 1 | 1 | X | X | 0 | 1 | 0 | X | 1 | 0 | 2 |
| ADDIMM | X | E | X | X | X | 1 | X | X | 0 | 1 | 1 | X | 1 | 0 | 0 |
| ADDIMM + 1 | FETCH + 1 | 7 | X | X | 1 | 1 | X | X | 0 | 1 | 0 | X | 1 | 0 | 2 |
| ADD DIR | X | E | X | X | X | 1 | X | X | 0 | 1 | 1 | X | 1 | 0 | 0 |
| ADD DIR + 1 | X | E | X | X | X | 1 | X | X | 0 | 0 | 1 | X | 1 | 0 | 0 |
| ADD DIR + 2 | ADDIMM + 1 | 7 | X | X | 1 | 1 | X | X | 0 | 1 | 1 | X | 1 | 0 | 0 |
| ADD RR1 | X | E | X | X | X | 1 | X | X | 0 | 0 | 1 | X | 1 | 0 | 0 |
| ADD RR1 + 1 | X | E | X | X | X | 1 | X | X | 0 | 1 | 1 | X | 1 | 0 | 0 |
| ADD RR1 + 2 | FETCH + 1 | 7 | X | X | 1 | 1 | X | X | 0 | 1 | 0 | X | 1 | 0 | 2 |

|  | $C_{n}$ | B | A | $R_{2}$ | $R_{1}$ | 2910 CI | $\overline{\mathrm{DDBE}}$ | $\overline{\mathrm{Y-D}}$ | $\overline{\mathrm{E}}$ |  |  | Q | S | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{EA}}$ | $\overline{\mathrm{IEN}}$ | $I_{5-8}$ | $I_{1-4}$ | $I_{0}$ |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Number of Bits | 1 | 1 | 2 | 4 | 4 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 3 | 1 | 1 | 1 | 4 | 4 | 1 |
| Bit No. | 35 | 34 | 32-33 | 28-31 | 24-27 | 23 | 22 | 21 | 20 | 19 | 18 | 15-17 | 12-14 | 11 | 10 | 9 | 5-8 | 1-4 | 0 |
| INIT | X | 1 | X | X | F | 1 | X | 1 | X | X | 0 | X | X | X | X | 0 | F | 8 | X |
| FETCH | X | X | X | X | X | 1 | 1 | 1 | 1 | 0 | 1 | X | X | 0 | X | 1 | X | X | X |
| FETCH + 1 | 1 | 1 | X | X | F | 1 | 1 | 1 | 1 | 0 | 0 | X | X | 0 | X | 0 | F | 4 | 0 |
| ADD | 0 | 0 | 0 | X | X | 1 | 1 | 1 | 1 | 0 | 1 | X | X | 0 | 0 | 0 | F | 3 | 0 |
| ADDIMM | 1 | 1 | X | X | F | 1 | 0 | 1 | 1 | 0 | 0 | X | X | 0 | X | 0 | F | 4 | 0 |
| ADDIMM + 1 | 0 | 0 | 0 | X | X | 1 | 1 | 1 | 1 | 0 | 1 | X | X | 1 | 0 | 0 | F | 3 | 0 |
| ADD DIR | 1 | 1 | X | X | F | 1 | 0 | 1 | 1 | 0 | X | X | X | 0 | X | 0 | F | 4 | 0 |
| ADD DIR + 1 | 0 | X | X | X | X | 1 | 1 | 1 | 1 | X | 0 | X | X | 1 | X | 1 | X | 4 | 0 |
| ADD DIR + 2 | 0 | X | 3 | X | F | 1 | 0 | 1 | 1 | 0 | 0 | X | X | X | 0 | 1 | F | 6 | X |
| ADD RR1 | 0 | X | 0 | X | X | 1 | X | 1 | 1 | X | 0 | X | X | X | 0 | 1 | F | 6 | X |
| ADD RR1 + 1 | 0 | X | 3 | X | F | 1 | 0 | 1 | 1 | 0 | 0 | X | X | X | 0 | 1 | F | 6 | X |
| ADD RR1 + 2 | 0 | 0 | 2 | X | X | 1 | 1 | 1 | 1 | 0 | 1 | X | X | 1 | 0 | 0 | F | 3 | 0 |

1. 4-bit fields in hex, others in octal.
2. X = Don't Care.

Figure 32. Example Microcode for Figure 31 Design.

This is the first step in the machine instruction fetch routine. In this step, the main memory is addressed by the MAR, a read signal is issued (bit 42 = HIGH), and the machine instruction (macroinstruction) is placed on the data bus by the memory. It is latched into the instruction register (U1, U2, and U3) at the next clock LOW-to-HIGH transition (bit 41 = LOW). It is assumed that if a relatively slow main memory is used, the clock is halted until the data is stable on the data bus and the register set-up times are met. We will see in a later chapter how easy it is to implement this requirement using the Am2925 clock generator. The same assumption will also be made in a memory write cycle.
Bit 9 (Am2903 $\overline{\text { IEN }}$ ) is HIGH; thus, we don't care what the ALU does during this microstep. We prevent the flags from changing by setting bits $36-38$ LOW. Also, the registers at the $Y$ output have the $\bar{E}$ input HIGH (bits 18, 20). Bits 21 and 39 are both HIGH; thus, the data bus is free to accept data from the main memory (bit 42 is HIGH, signaling memory read request). The MAR is enabled to the address bus (bit $19=$ LOW) and at the next clock, the macroinstruction will be latched into the instruction registers (bit 41 = LOW). The Am2910 sequencer will continue to the next instruction (bits 52-55 = $\mathrm{E}_{\mathrm{H}}$ ).

## Microword FETCH + 1

This is the second step in the macroinstruction fetch routine. The instruction already resides in the instruction registers (U1, U2 and U3).
The Am2910 sequencer receives a JUMP MAP instruction (bits 52 through 55 = 2). The next microinstruction will begin to execute the present macroinstruction - according to the mapping PROM.
We use this microstep to update (increment) the program counter (Register 15). Bit 34 being HIGH, microprogram bits 24-27 ($=F_{H}$) will be the B address. The Am2903 $\overline{\mathrm{OEB}}$ and $\mathrm{I}_{0}$ are LOW; therefore, the contents of Register 15 will serve as the S operand for the ALU. $C_{n}$ being HIGH, a 4 in the $I_{1}-I_{4}$ field will increment this value. $\overline{\mathrm{IEN}}=\mathrm{LOW}$ with $\mathrm{I}_{5}-\mathrm{I}_{8}=\mathrm{F}$ will write this (incremented) value into the same register (R15). At the same time, the MAR is also updated (bit 18 LOW).
We could update the program counter and the MAR in the previous microstep (location FETCH), but then we would have to leave the ALU idle during this microcycle. By adopting the present scheme, we can overlap the first step of the macroinstruction fetch routine (the memory-read cycle) with the execution of the last step of the previous macroinstruction - provided the memory and the data bus are free to perform it. The JUMP MAP cycle is always necessary - and that is why we prefer to update the PC at this step.

## Microword ADD

This is a sample register-to-register operation. The two operands reside in the internal registers pointed to by the two 4-bit fields of the macroinstruction:

| Bits 15-8 | Bits 7-4 | Bits 3-0 |
| :---: | :---: | :---: |
| OPCODE | 1st Operand and Destination Register Number | 2nd Operand Register Number |

Bits 32-33 are set LOW, so instruction register bits 0-3 are selected as the $A$ address. Bit 34 = LOW selects instruction register bits 4-7 as the $B$ address (see the instruction format above). Bit 0 ($\mathrm{I}_{0}$), bit 10 ($\overline{\mathrm{EA}}$) and bit 11 ($\overline{\mathrm{OEB}}$) are also LOW; therefore, the contents of the selected registers will be presented to the ALU's R and S inputs. Since bits 1-4 ($I_{1}-I_{4}$) = 3, the ALU will perform:

$$
F=R \text { plus } S \text { plus } C_{n} \text {. }
$$

Note that bits 35 and 38 are LOW. With $\mathrm{I}_{5}-\mathrm{I}_{8}$ (bits 5-8) $=\mathrm{F}_{\mathrm{H}}$ and $\overline{\mathrm{IEN}}$ (bit 9) = LOW, the result will be written into the internal register pointed at by the $B$ address lines.

Bits 18 and 20 are HIGH and inhibit the MAR and the data out registers from being affected, while bits 36, 37 (=2) allow the flags to assume values according to the result of the operation.
During the execution of the function required (ADD in this example) we fetch the next OP CODE from the main memory. The MAR is enabled to the address bus (bit $19=$ LOW) and a memory read is requested (bit $42=$ HIGH). At the end of this microstep the next macroinstruction will be latched into the instruction registers (bit 41 = LOW).
The Am2910 sequencer is instructed to select the pipeline register bits 56-67 as the next microprogram address (bits 52-55 = 7, bit 47 = HIGH), where the location of FETCH + 1 (2 in this example) is written. The next step will be JUMP MAP and update PC.
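
The handful of ALU function codes quoted in these microroutines can be modelled for a 16-bit word as follows. This is a simplified sketch, not a model of the Am2903 itself: only the four $I_{1}-I_{4}$ codes mentioned in this chapter are covered, the function name is hypothetical, and the flags follow the conventional two's complement definitions, which should be checked against the Am2903 data sheet before being relied upon.

```python
MASK = 0xFFFF  # 16-bit data path (four Am2903 slices)


def alu(code, r, s, cn):
    """Return (F, flags) for the I1-I4 codes used in the sample microroutines.

    8: F = 0          4: F = S plus Cn
    6: F = R plus Cn  3: F = R plus S plus Cn
    """
    r &= MASK
    s &= MASK
    if code == 0x8:
        r = s = 0
        wide = 0
    elif code == 0x4:
        r = 0
        wide = s + cn
    elif code == 0x6:
        s = 0
        wide = r + cn
    elif code == 0x3:
        wide = r + s + cn
    else:
        raise ValueError("function code not covered by this sketch")
    f = wide & MASK
    flags = {
        "CY": (wide >> 16) & 1,                                  # carry out of bit 15
        "N": (f >> 15) & 1,                                      # sign of the result
        "OVF": 1 if ((r ^ f) & (s ^ f) & 0x8000) else 0,         # two's complement overflow
        "Z": 1 if f == 0 else 0,                                 # all-zero result
    }
    return f, flags


if __name__ == "__main__":
    print(alu(0x8, 0x1234, 0x5678, 0))   # INIT: force zero
    print(alu(0x4, 0, 0x000F, 1))        # FETCH + 1: increment the program counter
    print(alu(0x3, 0x0005, 0x0007, 0))   # ADD: R plus S plus Cn
```

Substituting a different function code in the $I_{1}-I_{4}$ field, as the text suggests, corresponds to adding another branch to this function.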

## Microword ADD IMMEDIATE

This 2 step microroutine adds the contents of an internal register, pointed at by bits $0-3$ of the macroinstruction with its second word, placing the result into the internal register pointed at by bits 4-7 of the OPCODE.


First word of the macroinstruction


Second (next consecutive) word of the macroinstruction

The first step is to read the first operand from the memory (bit 19 = LOW, bit 42 = HIGH) and to latch it into the data-in register (U20 and U21) (bit 22 = LOW). At the same time the ALU updates (increments) the program counter (register 15) and the MAR (bit 18 = LOW); compare with location FETCH + 1. The Am2910 sequencer will continue to the next microprogram address (compare with location FETCH).
Location ADDIMM + 1 is the second step of this macroinstruction. It is very similar to location ADD; the only difference is that bit 11 ($\overline{\mathrm{OEB}}$) is HIGH, selecting the Data-in register as the source for the ALU's S operand. The same macroinstruction fetch overlap technique is used again.

## Microword ADD DIRect

This is the starting location to execute a macroinstruction where the second word is the address of the operand:

| Bits 15-8 | Bits 7-4 | Bits 3-0 |
| :---: | :---: | :---: |
| OPCODE | Result Register Number | 2nd Operand Register Number |

First word of the macroinstruction

| Bits 15-0 |
| :---: |
| Address of the 1st operand |

Second (next consecutive) word of the macroinstruction

The first step is to read the second word of the macroinstruction into the Data-in register. This microword is identical to the one written at location ADDIMM.

## Microword ADD DIR + 1

The Data-in register now contains the address of the operand. We have to transfer it into the MAR.
With $\mathrm{I}_{0}$ (bit 0) LOW and $\overline{\mathrm{OEB}}$ (bit 11) HIGH, the ALU's S operand will be the DB bus, i.e., the Data-in register. $I_{1}-I_{4}$ (bits 1-4) = 4 will pass this input to the ALU output, as $\mathrm{C}_{\mathrm{n}}$ (bit 35) is LOW. With $\overline{\mathrm{IEN}}$ (bit 9) = HIGH, the WRITE line will be HIGH too, assuring that the internal registers maintain their contents. Since $I_{5}-I_{8}$ (bits 5-8) $=F_{H}$, the ALU output will appear on the Am2903 Y pins. This data, which is actually the operand address, will be transferred into the MAR at the next clock cycle. The Am2910 sequencer continues to the next consecutive microstep.

## Microword ADD DIR + 2

Now we read in the operand from the main memory. The MAR is enabled to the address bus (bit 19 = LOW), a memory read signal is issued (bit 42 = HIGH) and the data-in register's clock is enabled (bit 22 = LOW). At the next LOW-to-HIGH transition of the clock, the operand will be placed in the data-in register.
Meanwhile, we need to restore the address of the next macroinstruction in the MAR. Bits 32-33 = 3 select microprogram bits 24-27 as the $A$ address (an $\mathrm{F}_{\mathrm{H}}$ is written there); therefore, the internal program counter will be addressed, as $\overline{\mathrm{EA}}$ (bit 10) = LOW. The ALU performs $F=R+C_{n}$ with $C_{n}$ (bit 35) LOW, thus passing the program counter contents to the output. $\overline{\mathrm{IEN}}$ (bit 9) = HIGH prevents disturbance of the internal Am2903 registers, and bit 18 will enable the MAR to receive the next macroinstruction address.
Note that the situation now is exactly the same as after the first step of ADD IMMediate. The operand is in the data register and the MAR points to the next macroinstruction. Therefore, the Am2910 sequencer will address, as the next microstep, location ADDIMM + 1. The step after this will, of course, be FETCH + 1. A total of 5 microsteps were needed to execute this macroinstruction but it occupies only 3 microprogram locations.
It is worthwhile to note here that by adding two more Am2920 registers between the Data-bus and the Address-bus and a couple of control-bits in the microprogram, we could shorten the microprogram by one step. In this design we chose not to do so in order to demonstrate the Data-bus to Address-bus path through the ALU.
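
The step count quoted above can be reproduced with a very small model of the microprogram sequencing. The sketch is an illustration under simplifying assumptions: the Am2910 is reduced to three behaviours (continue, jump to the pipeline branch address, jump via the mapping PROM), the location names are used in place of numeric addresses, and all identifiers are hypothetical.

```python
CONTINUE, JUMP_PIPELINE, JUMP_MAP = "continue", "jump pipeline", "jump map"

# Sequencer behaviour of each microprogram location, as described in the text.
MICROPROGRAM = {
    "FETCH + 1": (JUMP_MAP, None),
    "ADDIMM": (CONTINUE, None),
    "ADDIMM + 1": (JUMP_PIPELINE, "FETCH + 1"),
    "ADD DIR": (CONTINUE, None),
    "ADD DIR + 1": (CONTINUE, None),
    "ADD DIR + 2": (JUMP_PIPELINE, "ADDIMM + 1"),
}

# Physical neighbour of each location that simply continues.
SEQUENTIAL_NEXT = {
    "ADDIMM": "ADDIMM + 1",
    "ADD DIR": "ADD DIR + 1",
    "ADD DIR + 1": "ADD DIR + 2",
}


def trace(start):
    """List the microsteps executed from `start` up to and including the JUMP MAP step."""
    steps = [start]
    location = start
    while True:
        action, target = MICROPROGRAM[location]
        if action == JUMP_MAP:
            return steps
        location = target if action == JUMP_PIPELINE else SEQUENTIAL_NEXT[location]
        steps.append(location)


if __name__ == "__main__":
    print(trace("ADD DIR"))
    print(trace("ADDIMM"))
```

Tracing from ADD DIR visits five locations while only three of them belong to ADD DIR itself, which is the sharing of microcode the paragraph above describes.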

## Microword ADD RR1

The macroinstruction to be executed here points to the register in which the first operand is written, and into which the result should also be written. The second 4-bit field of the OP-CODE (bits 0-3) points to the register in which the address of the second operand is stored.

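| Bits 15-8 | Bits 7-4 | Bits 3-0 |
| :---: | :---: | :---: |
| OPCODE | 1st Operand and Result Register Number | Register Holding the 2nd Operand Address |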

Bits 32 and 33 are LOW. Therefore, instruction register bits 0-3 will form the A-address. Now we take the contents of this register and place it in the MAR exactly the same way as we did in location ADD DIR +2 with the program counter. The Am2910 continues.

## Microword ADD RR1 + 1

Here we fetch the operand and place it in the Data-in register. At the same time, we restore the program counter into the MAR.

## Microword ADD RR1 + 2

Bits 32, 33 = 2 and instruction register bits 4-7 serve as the A-address. Bit 34 = LOW; the same instruction register bits serve as the B-address, too. Note that $\overline{\mathrm{OEB}}$ (bit 11) is HIGH; therefore, the ALU R-source will be the Data-in register and the $S$-source will be the register addressed by the A-address. The result (sum), however, will be written to the correct register, as IEN (bit 9) is LOW.
At the same time, the next macroinstruction is fetched in the usual overlapping way and the next microinstruction to be executed will be at location FETCH + 1.
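The net effect of the three ADD RR1 microsteps can be summarized behaviorally. The following Python sketch is only an illustration: the function name `add_rr1`, the list used as a register file, the dictionary standing in for main memory, and the 16-bit word width are all assumptions made for the example. It ignores the microprogram bit assignments and the overlapped instruction fetch.

```python
MASK16 = 0xFFFF  # assumed 16-bit data path


def add_rr1(regs, memory, opcode_low8):
    """Behavioral model of ADD RR1 (second operand is register-indirect).

    regs        -- list of 16 register values (hypothetical register file)
    memory      -- dict mapping address -> word (stand-in for main memory)
    opcode_low8 -- the two 4-bit register fields of the macroinstruction
    """
    r_dest = (opcode_low8 >> 4) & 0xF  # bits 4-7: first operand and result
    r_addr = opcode_low8 & 0xF         # bits 0-3: holds address of operand 2

    mar = regs[r_addr]                 # ADD RR1:     register contents -> MAR
    data_in = memory[mar]              # ADD RR1 + 1: memory -> Data-in register
    regs[r_dest] = (regs[r_dest] + data_in) & MASK16  # ADD RR1 + 2: sum -> register
    return regs[r_dest]


regs = [0] * 16
regs[2] = 5        # first operand
regs[3] = 0x100    # address of the second operand
print(add_rr1(regs, {0x100: 7}, 0x23))
```

The sketch shows why three microprogram locations suffice: one to route the address, one to read the operand, and one to perform the addition.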

## Summary

In the design shown in Figure 31, we have demonstrated some of the addressing schemes mentioned in Chapter 1. We used the ADD instruction throughout these examples, but any other arithmetic or logic instruction can be executed in exactly the same manner by changing microcode bits 1-4 to the appropriate ALU code.
The reader is encouraged to write several microcode-lines to execute the other addressing modes mentioned in Chapter 1. He will discover that when the result of the macroinstruction is to be written into main memory, the overlapping instruction-fetch is not feasible. In some cases, when the MAR no longer contains the Program Counter value, an additional microstep is needed in order to restore the Program Counter into the MAR. The reader is again encouraged to modify location FETCH in order to save this additional microstep.

## Appendix

Throughout Chapter 3, a number of AC calculations have been made to show typical speeds for an Am2901A and Am2903 16-bit ALU configuration. This Appendix shows the latest SWITCHING CHARACTERISTICS for the Am2901A and Am2903.
The typical data on the Am2901A shown in this Appendix supersedes that shown on page 2-12 of the Am2900 Family Data Book dated 4-78 (AM-PUB003). The only difference between the data shown in the typical column of the switching characteristic and this Appendix appears in Table 3. The typical carry-in set-up time should be 40 ns.

The typical switching characteristic data for the Am2903 as shown in this Appendix supersedes the data presented in the Am2903 Bipolar Microprocessor Slice/Am2910 Microprogram Controller Data Booklet dated 3-78. Here, a number of changes have been made to the table for both the combinatorial propagation delays and the set-up and hold times.
Should any questions arise concerning the switching characteristics for either the Am2901A or Am2903, please do not hesitate to contact the AMD factory and ask for Bipolar Microprocessor Marketing or Bipolar Microprocessor Applications.

## ROOM TEMPERATURE

## SWITCHING CHARACTERISTICS

(See next page for AC Characteristics over operating range.)
Tables I, II, and III below define the timing characteristics of the Am2901A at $25^{\circ} \mathrm{C}$. The tables are divided into three types of parameters: clock characteristics, combinational delays from inputs to outputs, and set-up and hold time requirements. The last table defines the time prior to the end of the cycle (i.e., clock LOW-to-HIGH transition) that each input must be stable to guarantee that the correct data is written into one of the internal registers.

All values are at $25^{\circ} \mathrm{C}$ and 5.0 V . Measurements are made at 1.5 V with $\mathrm{V}_{\mathrm{IL}}=0 \mathrm{~V}$ and $\mathrm{V}_{\mathrm{IH}}=3.0 \mathrm{~V}$. For three-state disable tests, $\mathrm{C}_{\mathrm{L}}=5.0 \mathrm{pF}$ and measurement is to 0.5 V change on output voltage level. All outputs fully loaded.

TABLE I

## CYCLE TIME AND CLOCK CHARACTERISTICS

| TIME | TYPICAL | GUARANTEED |
| :--- | :---: | :---: |
| Read-Modify-Write Cycle <br> (time from selection of <br> A, B registers to end of <br> cycle) | 55 ns | 93 ns |
| Maximum Clock Frequency to <br> Shift Q Register (50\% duty <br> cycle) | 40 MHz | 20 MHz |
| Minimum Clock LOW Time | 30 ns | 30 ns |
| Minimum Clock HIGH Time | 30 ns | 30 ns |
| Minimum Clock Period | 75 ns | 93 ns |

TABLE II
COMBINATIONAL PROPAGATION DELAYS (all in ns, $C_{L}=50$ pF except output disable tests)

|  | TYPICAL $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  |  |  |  |  |  |  | GUARANTEED $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| From Input | Y | $F_{3}$ | $C_{n+4}$ | $\overline{G}, \overline{P}$ | $F=0$ <br> ($R_{L}=270$) | OVR | Shift Outputs <br> $\mathrm{RAM}_{0}$, $\mathrm{RAM}_{3}$ | Shift Outputs <br> $Q_{0}$, $Q_{3}$ | Y | $F_{3}$ | $C_{n+4}$ | $\overline{G}, \overline{P}$ | $F=0$ <br> ($R_{L}=270$) | OVR | Shift Outputs <br> $\mathrm{RAM}_{0}$, $\mathrm{RAM}_{3}$ | Shift Outputs <br> $Q_{0}$, $Q_{3}$ |
| A, B | 45 | 45 | 45 | 40 | 65 | 50 | 60 | - | 75 | 75 | 70 | 59 | 85 | 76 | 90 | - |
| D (arithmetic mode) | 30 | 30 | 30 | 25 | 45 | 30 | 40 | - | 39 | 37 | 41 | 31 | 55 | 45 | 59 | - |
| D (I = X37) (Note 5) | 30 | 30 | - | - | 45 | - | 40 | - | 36 | 34 | - | - | 51 | - | 53 | - |
| $C_{n}$ | 20 | 20 | 10 | - | 35 | 20 | 30 | - | 27 | 24 | 20 | - | 46 | 26 | 45 | - |
| $I_{012}$ | 35 | 35 | 35 | 25 | 50 | 40 | 45 | - | 50 | 50 | 46 | 41 | 65 | 57 | 70 | - |
| $I_{345}$ | 35 | 35 | 35 | 25 | 45 | 35 | 45 | - | 50 | 50 | 50 | 42 | 65 | 59 | 70 | - |
| $I_{678}$ | 15 | - | - | - | - | - | 20 | 20 | 26 | - | - | - | - | - | 26 | 26 |
| $\overline{\mathrm{OE}}$ Enable/Disable | 20/20 | - | - | - | - | - | - | - | 30/33 | - | - | - | - | - | - | - |
| A bypassing ALU (I = 2XX) | 30 | - | - | - | - | - | - | - | 35 | - | - | - | - | - | - | - |
| Clock (Note 6) | 40 | 40 | 40 | 30 | 55 | 40 | 55 | 20 | 52 | 52 | 52 | 41 | 70 | 57 | 71 | 30 |

TABLE III
SET-UP AND HOLD TIMES (all in ns) (Note 1)

| From Input | Notes | TYPICAL $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  | GUARANTEED $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | Set-Up Time | Hold Time | Set-Up Time | Hold Time |
| A, B Source | 2, 4 <br> 3, 5 | 40 <br> $t_{pw}L+15$ | 0 | 93 <br> $t_{pw}L+25$ | 0 |
| B Dest. | 2, 4 | $t_{pw}L+15$ | 0 | $t_{pw}L+15$ | 0 |
| D (arithmetic mode) |  | 25 | 0 | 70 | 0 |
| D (I = X37) (Note 5) |  | 25 | 0 | 60 | 0 |
| $C_{n}$ |  | 40 | 0 | 55 | 0 |
| $I_{012}$ |  | 30 | 0 | 64 | 0 |
| $I_{345}$ |  | 30 | 0 | 70 | 0 |
| $I_{678}$ | 4 | $t_{pw}L+15$ | 0 | $t_{pw}L+25$ | 0 |
| $\mathrm{RAM}_{0,3}, Q_{0,3}$ |  | 15 | 0 | 20 | 0 |

Notes: 1. See next page.
2. If the $B$ address is used as a source operand, allow for the "$A$, $B$ source" set-up time; if it is used only for the destination address, use the "$B$ dest." set-up time.
3. Where two numbers are shown, both must be met.
4. "$t_{pw}L$" is the clock LOW time.
5. $D \vee 0$ is the fastest way to load the RAM from the $D$ inputs. This function is obtained with I = 337.
6. Using the $Q$ register as source operand in arithmetic mode. Clock is not normally in the critical speed path when $Q$ is not a source.
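Note 3 means that the A, B source set-up requirement is the larger of the fixed figure and the figure derived from the clock LOW time. As a minimal sketch of that arithmetic (the function name `ab_source_setup_ns` is hypothetical; the constants are the A, B Source entries of Table III):

```python
def ab_source_setup_ns(t_pw_low_ns, guaranteed=False):
    """Set-up time required for A, B used as sources, per Table III.

    Both limits must be met (Note 3), so the requirement is their maximum.
    """
    fixed, margin = (93, 25) if guaranteed else (40, 15)
    return max(fixed, t_pw_low_ns + margin)


# With the 30 ns minimum clock LOW time of Table I:
print(ab_source_setup_ns(30))                   # typical figures
print(ab_source_setup_ns(30, guaranteed=True))  # guaranteed figures
```

With a 30 ns clock LOW time the typical requirement is set by the clock-dependent term (45 ns), while the guaranteed requirement is set by the fixed 93 ns figure.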

## A. Am2903 SWITCHING CHARACTERISTICS (TYPICAL ROOM TEMPERATURE PERFORMANCE) - (MAY 18, 1978)

Tables IA, IIA, and IIIA define the nominal timing characteristics of the Am2903 at $25^{\circ} \mathrm{C}$ and 5.0 V . The Tables divide the parameters into three types: pulse characteristics for the clock and write enable, combinational delays from input to output, and set-up and hold times relative to the clock and write pulse.
Measurements are made at 1.5 V with $\mathrm{V}_{\mathrm{IL}}=0 \mathrm{~V}$ and $\mathrm{V}_{\mathrm{IH}}=3.0 \mathrm{~V}$. For three-state disable tests, $\mathrm{C}_{\mathrm{L}}=5.0 \mathrm{pF}$ and measurement is to 0.5 V change on output voltage level.

TABLE IA - Write Pulse and Clock Characteristics

| Time |  |
| :--- | :---: |
| Minimum Time CP and WE both LOW <br> to write | 15 ns |
| Minimum Clock LOW Time | 15 ns |
| Minimum Clock HIGH Time | 35 ns |

TABLE IIA - Combinational Propagation Delays (All in ns). Outputs Fully Loaded. $C_{L}=50 \mathrm{pF}$ (except output disable tests)

| From Input | Y | $C_{n+4}$ | $\overline{G}, \overline{P}$ | (S) Z | N | OVR | DB | $\overline{\text{WRITE}}$ | $\mathrm{QIO}_{0}, \mathrm{QIO}_{3}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{3}$ | $\mathrm{SIO}_{0}$ (Parity) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| A, B Addresses (Arith. Mode) | 65 | 60 | 56 | - | 64 | 70 | 33 | - | - | 65 | 69 | 87 |
| A, B Addresses (Logic Mode) | 56 | - | 46 | - | 56 | - | 33 | - | - | 55 | 64 | 81 |
| DA, DB Inputs | 39 | 38 | 30 | - | 40 | 56 | - | - | - | 39 | 47 | 60 |
| $\overline{\text { EA }}$ | 38 | 33 | 26 | - | 36 | 41 | - | - | - | 36 | 41 | 58 |
| $\mathrm{C}_{\mathrm{n}}$ | 25 | 21 | - | - | 20 | 38 | - | - | - | 21 | 25 | 48 |
| $\mathrm{I}_{0}$ | 40 | 31 | 24 | - | 37 | 42 | - | 15(1) | - | 41 | 39 | 63 |
| $\mathrm{I}_{4321}$ | 45 | 45 | 32 | - | 44 | 52 | - | 17(1) | - | 45 | 51 | 68 |
| $\mathrm{I}_{8765}$ | 25 | - | - | - | - | - | - | 21 | 22/29(2) | 24/17(2) | 27/17(2) | 24/17(2) |
| IEN | - | - | - | - | - | - | - | 10 | - | - | - | - |
| $\overline{\mathrm{OEB}}$ Enable/Disable | - | - | - | - | - | - | 12/15(2) | - | - | - | - | - |
| $\overline{\mathrm{OEY}}$ Enable/Disable | 14/14(2) | - | - | - | - | - | - | - | - | - | - | - |
| $\mathrm{SIO}_{0}, \mathrm{SIO}_{3}$ | 13 | - | - | - | - | - | - | - | - | - | 19 | 20 |
| Clock | 58 | 57 | 40 | - | 56 | 72 | 24 | - | 28 | 56 | 63 | 76 |
| $Y$ | - | - | - | 16 | - | - | - | - | - | - | - | - |
| $\overline{\text { MSS }}$ | 25 | - | 25 | - | 25 | 25 | - | - | - | 24 | 27 | 24 |

Notes 1. Applies only when leaving special functions.
2. Enable/Disable. Enable is defined as output active and correct. Disable is a three-state output turning off.
3. For delay from any input to $Z$, use input to $Y$ plus $Y$ to $Z$.
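Note 3 can be applied mechanically. The sketch below is an illustration only: the dictionary and function names are hypothetical, and only a few input-to-Y entries are copied from Table IIA together with the 16 ns Y-to-Z delay.

```python
Y_TO_Z_NS = 16  # "Y" row, Z column of Table IIA

# Input-to-Y delays copied from the Y column of Table IIA (subset only).
INPUT_TO_Y_NS = {
    "A,B (arith)": 65,
    "A,B (logic)": 56,
    "DA,DB": 39,
    "Cn": 25,
    "Clock": 58,
}


def delay_to_z_ns(source):
    """Typical delay from an input to Z: input-to-Y plus Y-to-Z (Note 3)."""
    return INPUT_TO_Y_NS[source] + Y_TO_Z_NS


for name in INPUT_TO_Y_NS:
    print(name, delay_to_z_ns(name))
```

For example, the typical delay from the A, B addresses to Z in arithmetic mode is 65 + 16 = 81 ns.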

TABLE IIIA - Set-Up and Hold Times (All in ns)
CAUTION: READ NOTES TO TABLE III. NA = Not Applicable; no timing constraint.

| Input | With Respect to this Signal | HIGH-to-LOW |  | LOW-to-HIGH |  | Comment |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | Set-up | Hold | Set-up | Hold |  |
| Y | Clock | NA | NA | 9 | -3 | To store Y in RAM or Q |
| $\overline{\text { WE HIGH }}$ | Clock | 5 | Note 2 | Note 2 | 0 | To Prevent Writing |
| WE LOW | Clock | NA | NA | 15 | 0 | To Write into RAM |
| A, B as Sources | Clock | 19 | -3 | NA | NA | See Note 3 |
| B as a Destination | Clock and $\overline{\text { WE }}$ both LOW | -4 | Note 4 | Note 4 | -3 | To Write Data only into the Correct B Address |
| $\mathrm{QIO}_{0}, \mathrm{QIO}_{3}$ | Clock | NA | NA | 10 | -4 | To Shift Q |
| $\mathrm{I}_{8765}$ | Clock | 2 | Note 5 | Note 5 | -18 |  |
| IEN HIGH | Clock | 10 | Note 2 | Note 2 | 0 | To Prevent Writing into Q |
| IEN LOW | Clock | NA | NA | 10 | -5 | To Write into Q |



## CHAPTER IV <br> THE DATA PATH - PART II

The previous CPU example (See Chapter III) utilized SSI and MSI components to accomplish the shift-linkage, carry control, and status register functions associated with the ALU. These functions can all be implemented with the Am2904 status and shift control unit.

The Am2904 is an LSI device that contains all the logic necessary to perform the shift and status control operations associated with the ALU portion of a microcomputer. These operations include storage for ALU status flags; carry-in generation and selection; data-path and carry-bit linkage for shift/rotate instructions; and status condition code generation and selection. The ALU status flags (carry, zero, negative, and overflow) may be stored in either of two registers: a machine status register or a micro status register. The carry-in multiplexer can select the true or complement of the micro status carry flag or machine status carry flag, as well as an external carry, a logical one, or a logical zero. The shift linkage multiplexers provide paths to rotate/shift single and double length words up, down, around the carry flag, and through the carry flag. The status condition code multiplexer provides tests on the true or complement of any status flag, as well as more complicated logical combinations of these flags to facilitate magnitude comparisons on unsigned and two's complement numbers, and normalization operations.

## STATUS REGISTERS

The status registers contained in the Am2904 are shown in the upper portion of Figure 1. Each register is independently controlled by a combination of instruction signals and enable signals.

## MICRO STATUS REGISTER ( $\mu$ SR)

The $\mu$SR is enabled when the $\overline{\mathrm{CE}}_{\mu}$ signal is low. When $\overline{\mathrm{CE}}_{\mu}$ is low the instruction present on $I_{5}$ through $I_{0}$ will be executed on the LOW to HIGH transition of the Clock input. These instructions fall into three main categories: Bit Operations, Register Operations and Load Operations.
The bit operations allow individual bits of the $\mu$SR to be set or reset (See Table 1-1).
The register operations allow the $\mu$SR to be loaded from the machine status register, to be set to all ones, reset to all zeros, or swapped with the machine status register (See Table 1-2).

The load operations allow the $\mu$SR to be loaded from the I inputs directly, from the I inputs with $I_{C}$ complemented, or from the I inputs with overflow retained, $I_{\mathrm{OVR}}+\mu_{\mathrm{OVR}} \rightarrow \mu_{\mathrm{OVR}}$ (See Table 1-3). The load operation with $I_{C}$ complemented can be used to emulate machines which use direct subtraction and thus need to complement the carry to obtain a borrow. The load with overflow retained allows a series of arithmetic instructions to be executed without the need for a check for overflow after each instruction. If an overflow occurred at any time during the series it will be "trapped." Thus a single test for overflow, at the end of the series, is all that is required.
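The "trapping" behavior of the load with overflow retained is easy to see in a small model. This Python sketch is an illustration under stated assumptions: the flags are held in plain dictionaries and the function name `load_with_overflow_retain` is hypothetical; only the OR of the incoming and stored overflow flags comes from Table 1-3.

```python
def load_with_overflow_retain(usr, i_flags):
    """Return the new micro status flags after a load with overflow retained.

    Z, C and N are taken from the I inputs; OVR is the logical OR of the
    incoming overflow and the overflow already stored in the micro status
    register.
    """
    new = {name: i_flags[name] for name in ("Z", "C", "N")}
    new["OVR"] = i_flags["OVR"] | usr["OVR"]  # an earlier overflow is kept
    return new


usr = {"Z": 0, "C": 0, "N": 0, "OVR": 0}
series = [
    {"Z": 0, "C": 0, "N": 0, "OVR": 0},
    {"Z": 0, "C": 1, "N": 1, "OVR": 1},  # overflow occurs in this step
    {"Z": 1, "C": 0, "N": 0, "OVR": 0},
]
for flags in series:
    usr = load_with_overflow_retain(usr, flags)
print(usr["OVR"])  # still set after the series
```

A single test of the stored overflow flag after the loop is sufficient, which is the point made in the text.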

## MACHINE STATUS REGISTER (MSR)

The MSR is enabled when $\overline{\mathrm{CE}}_{M}$ is low. If $\overline{\mathrm{CE}}_{M}$ is low the instruction present on $I_{5}$ through $I_{0}$ will be executed on the LOW to HIGH transition of the Clock input. Additionally the individual bits of the MSR may be selectively enabled through the use of the Enable inputs $\bar{E}_{Z}, \bar{E}_{C}, \bar{E}_{N}$ and $\bar{E}_{\text {OVR }}$ (See Figure 1). This allows all possible combinations of the four status flags to be selectively operated on for maximum flexibility. Thus the instruction specified by $\mathrm{I}_{5}-\mathrm{I}_{0}$ affects only the enabled status flags.


Figure 1. Am2904 Block Diagram.

The MSR instructions fall into two main categories: register operations and load operations (bit operations can be implemented through the use of the selective enable control lines).
The register operations allow the MSR to be loaded from the bi-directional Y port, or the $\mu$SR. Additionally the MSR may be set, reset, or complemented (See Table 2-1). These three instructions, combined with the selective enables, allow any combination of MSR bits to be set, reset, or complemented.
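The interaction between the set, reset and complement instructions and the selective enables can be sketched as follows. The representation is an assumption made for illustration: flags are dictionary entries, `enabled` is the set of flags whose enable input is LOW, and the function name `msr_register_op` is hypothetical.

```python
FLAGS = ("Z", "C", "N", "OVR")


def msr_register_op(msr, op, enabled):
    """Apply 'set', 'reset' or 'invert' to the enabled MSR flags only.

    msr     -- dict of current flag values (0 or 1)
    enabled -- set of flag names whose active-LOW enable input is LOW
    """
    if op not in ("set", "reset", "invert"):
        raise ValueError("unknown operation: " + op)
    result = dict(msr)
    for flag in FLAGS:
        if flag not in enabled:
            continue  # a disabled flag keeps its old value
        if op == "set":
            result[flag] = 1
        elif op == "reset":
            result[flag] = 0
        else:
            result[flag] = msr[flag] ^ 1
    return result


msr = {"Z": 0, "C": 1, "N": 0, "OVR": 1}
print(msr_register_op(msr, "reset", {"C"}))         # clears carry only
print(msr_register_op(msr, "invert", {"Z", "N"}))   # complements Z and N
```

This is how the MSR obtains single-bit operations without dedicated bit instructions.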
The load operations allow the MSR to be loaded directly from the I inputs, from the I inputs with $\mathrm{I}_{\mathrm{C}}$ complemented, or from the I inputs for shift through overflow (See Table 2-2). The load with $\mathrm{I}_{\mathrm{C}}$ complemented can be used to produce a borrow. The load for shift through overflow loads the zero flag and the negative flag from the I inputs while swapping the overflow and carry flags. This allows the shift through overflow operation to be easily implemented.
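The load for shift through overflow is the only load that uses the old MSR contents. A minimal sketch, assuming all four flag enables are LOW and using a hypothetical function name and dictionary representation:

```python
def load_for_shift_through_overflow(msr, i_flags):
    """MSR load with zero and negative from the I inputs, carry and overflow swapped."""
    return {
        "Z": i_flags["Z"],    # zero flag from the I input
        "N": i_flags["N"],    # negative flag from the I input
        "C": msr["OVR"],      # old overflow becomes the new carry
        "OVR": msr["C"],      # old carry becomes the new overflow
    }


msr = {"Z": 0, "C": 1, "N": 0, "OVR": 0}
print(load_for_shift_through_overflow(msr, {"Z": 1, "C": 0, "N": 1, "OVR": 0}))
```

Note that the incoming carry and overflow inputs are ignored by this instruction.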

## SHIFT LINKAGE MULTIPLEXERS

The shift linkage multiplexers control the bi-directional shift lines $\mathrm{SIO}_{n}$, $\mathrm{SIO}_{0}$ (RAM shifter on the Am2903) and $\mathrm{QIO}_{n}$, $\mathrm{QIO}_{0}$ (Q register shifter on the Am2903). To enable the shift linkage multiplexers, the shift enable line $\overline{\mathrm{SE}}$ must be low. When $\overline{\mathrm{SE}}$ is low the
TABLE 1. MICRO STATUS REGISTER INSTRUCTION CODES.

Table 1-1. Bit Operations.

| $\mathbf{I}_{543210}$ <br> Octal | $\mu$ SR <br> Operation | Comments |
| :---: | :--- | :--- |
| 10 | $0 \rightarrow \mu_{\mathrm{Z}}$ | RESET ZERO BIT |
| 11 | $1 \rightarrow \mu_{\mathrm{Z}}$ | SET ZERO BIT |
| 12 | $0 \rightarrow \mu_{\mathrm{C}}$ | RESET CARRY BIT |
| 13 | $1 \rightarrow \mu_{\mathrm{C}}$ | SET CARRY BIT |
| 14 | $0 \rightarrow \mu_{\mathrm{N}}$ | RESET SIGN BIT |
| 15 | $1 \rightarrow \mu_{\mathrm{N}}$ | SET SIGN BIT |
| 16 | $0 \rightarrow \mu_{\mathrm{OVR}}$ | RESET OVERFLOW BIT |
| 17 | $1 \rightarrow \mu_{\text {OVR }}$ | SET OVERFLOW BIT |

Table 1-2. Register Operations.

| $\mathbf{I}_{543210}$ <br> Octal | $\mu \mathbf{S R}$ <br> Operation | Comments |
| :---: | :---: | :--- |
| 00 | $M_{\mathrm{X}} \rightarrow \mu_{\mathrm{X}}$ | LOAD MSR TO $\mu$ SR |
| 01 | $1 \rightarrow \mu_{\mathrm{X}}$ | SET $\mu$ SR |
| 02 | $M_{\mathrm{X}} \leftrightarrow \mu_{\mathrm{X}}$ | REGISTER SWAP |
| 03 | $0 \rightarrow \mu_{\mathrm{X}}$ | RESET $\mu$ SR |

Table 1-3. Load Operations.

| $I_{543210}$ Octal | $\mu$ SR <br> Operation | Comments |
| :---: | :---: | :---: |
| 06, 07 | $I_{Z} \rightarrow \mu_{Z}$ <br> $I_{C} \rightarrow \mu_{C}$ <br> $I_{N} \rightarrow \mu_{N}$ <br> $I_{\mathrm{OVR}}+\mu_{\mathrm{OVR}} \rightarrow \mu_{\mathrm{OVR}}$ | LOAD WITH OVERFLOW RETAIN |
| 30, 31 <br> 50, 51 <br> 70, 71 | $I_{Z} \rightarrow \mu_{Z}$ <br> $\bar{I}_{C} \rightarrow \mu_{C}$ <br> $I_{N} \rightarrow \mu_{N}$ <br> $I_{\mathrm{OVR}} \rightarrow \mu_{\mathrm{OVR}}$ | LOAD WITH CARRY INVERT |
| 04, 05 <br> 20-27 <br> 32-47 <br> 52-67 <br> 72-77 | $I_{Z} \rightarrow \mu_{Z}$ <br> $I_{C} \rightarrow \mu_{C}$ <br> $I_{N} \rightarrow \mu_{N}$ <br> $I_{\mathrm{OVR}} \rightarrow \mu_{\mathrm{OVR}}$ | LOAD DIRECTLY FROM $I_{Z}, I_{C}, I_{N}, I_{\mathrm{OVR}}$ |

Note: The above tables assume $\overline{\mathrm{CE}}_{\mu}$ is LOW.
shift linkage data path will be set up depending on the state of instruction lines $I_{10}$ through $I_{6}$ (See Table 3). These instructions allow single length or double length shifts/rotates either up or down. Additionally, shifts/rotates may be done through or around the MSR carry and negative flags. Special operations exist to provide support for add and shift (multiply) instructions. These instructions select the present carry $\mathrm{I}_{\mathrm{C}}$ (for unsigned multiply), or the Exclusive-OR of the sign flag $\mathrm{I}_{\mathrm{N}}$ with the overflow flag $\mathrm{I}_{\mathrm{OVR}}$ (for two's complement multiplication).
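To see why the present carry is the right bit to shift in for an unsigned multiply, consider the classic add-and-shift loop. The Python sketch below is a behavioral illustration only: the function name `unsigned_multiply`, the default word width, and the use of Python integers for the accumulator and Q register are assumptions. In each step the carry from the conditional add enters the most significant bit of the accumulator while the accumulator's least significant bit enters the most significant bit of Q, which is the double-length down shift with carry fill described in the text.

```python
def unsigned_multiply(multiplicand, multiplier, width=16):
    """Add-and-shift multiply; returns the double-length product."""
    mask = (1 << width) - 1
    acc, q = 0, multiplier & mask
    for _ in range(width):
        carry = 0
        if q & 1:                       # conditional add on the LSB of Q
            total = acc + (multiplicand & mask)
            carry = total >> width      # carry out of this add (I_C)
            acc = total & mask
        # Double-length shift down: acc LSB -> Q MSB, carry -> acc MSB.
        q = (q >> 1) | ((acc & 1) << (width - 1))
        acc = (acc >> 1) | (carry << (width - 1))
    return (acc << width) | q


print(unsigned_multiply(15, 15, width=4))
```

If a zero were shifted in instead of the carry, any partial sum that overflowed the word would lose its top bit. For two's complement operands the corresponding correct fill bit is the sign corrected for overflow, which is why the Exclusive-OR of $I_N$ and $I_{OVR}$ is offered.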

## CONDITION CODE MULTIPLEXER

The condition code multiplexer selects one of sixteen possible logical combinations of the $\mu$SR, MSR or I inputs, depending on the state of the $I_{5}-I_{0}$ input lines. These combinations include the true or complement form of any individual bit in the $\mu$SR, MSR or I inputs. Additionally, several more complicated logical operations may be performed to provide magnitude tests on both two's complement numbers and unsigned numbers. Table 5 lists the conditional test outputs (CT) corresponding to the state of the $\mathrm{I}_{5}-\mathrm{I}_{0}$ instruction lines. Table 6 lists the possible relations between two unsigned or two's complement numbers and the corresponding status and instruction codes. The three-state conditional test output CT is active only if $\overline{\mathrm{OE}}_{\mathrm{CT}}$ is low.

## CARRY IN MULTIPLEXER

The Carry output can be selected from one of seven different sources depending on the state of instruction input lines. The seven possible sources are: logical zero, logical one, the $\mu$ SR carry flag, the complement of the $\mu$ SR carry flag, the MSR carry flag, the complement of the MSR carry flag, or the external carry input $\mathrm{C}_{\mathrm{X}}$ (See Table 4).
TABLE 2. MACHINE STATUS REGISTER INSTRUCTION CODES.

Table 2-1. Register Operations.

| $I_{543210}$ <br> Octal | MSR <br> Operation | Comments |
| :---: | :---: | :--- |
| 00 | $Y_{X} \rightarrow M_{X}$ | LOAD $Y_{Z}, Y_{C}, Y_{N}, Y_{\text {OVR }}$ TO MSR |
| 01 | $1 \rightarrow M_{X}$ | SET MSR |
| 02 | $\mu_{X} \leftrightarrow M_{X}$ | REGISTER SWAP |
| 03 | $0 \rightarrow M_{X}$ | RESET MSR |
| 05 | $\bar{M}_{X} \rightarrow M_{X}$ | INVERT MSR |

Table 2-2. Load Operations.

| $I_{543210}$ Octal | MSR Operation | Comments |
| :---: | :---: | :---: |
| 04 | $I_{Z} \rightarrow M_{Z}$ <br> $M_{\mathrm{OVR}} \rightarrow M_{C}$ <br> $I_{N} \rightarrow M_{N}$ <br> $M_{C} \rightarrow M_{\mathrm{OVR}}$ | LOAD FOR SHIFT THROUGH OVERFLOW OPERATION |
| 10, 11 <br> 30, 31 <br> 50, 51 <br> 70, 71 | $I_{Z} \rightarrow M_{Z}$ <br> $\bar{I}_{C} \rightarrow M_{C}$ <br> $I_{N} \rightarrow M_{N}$ <br> $I_{\mathrm{OVR}} \rightarrow M_{\mathrm{OVR}}$ | LOAD WITH CARRY INVERT |
| 06, 07 <br> 12-17 <br> 20-27 <br> 32-37 <br> 40-47 <br> 52-67 <br> 72-77 | $I_{Z} \rightarrow M_{Z}$ <br> $I_{C} \rightarrow M_{C}$ <br> $I_{N} \rightarrow M_{N}$ <br> $I_{\mathrm{OVR}} \rightarrow M_{\mathrm{OVR}}$ | LOAD DIRECTLY FROM $I_{Z}, I_{C}, I_{N}, I_{\mathrm{OVR}}$ |

Note 1. The above tables assume $\overline{\mathrm{CE}_{\mathrm{M}}}, \overline{\mathrm{E}_{\mathbf{Z}}}, \overline{\mathrm{E}_{\mathrm{C}}}, \overline{\mathrm{E}_{\mathrm{N}}}, \overline{\mathrm{E}_{\mathrm{OVR}}}$ are LOW.

## Y INPUT/OUTPUT LINES

The bi-directional Y data lines may be used as extra data input lines when the $Y$ output buffer is disabled ($\overline{\mathrm{OE}}_{Y}$ high). Additionally, when $I_{5}-I_{0}$ are low, the $Y$ buffer is disabled, irrespective of the $\overline{\mathrm{OE}}_{Y}$ signal. When the $Y$ buffer is enabled ($\overline{\mathrm{OE}}_{Y}$ is low) the Y data lines are selected from the MSR, $\mu$SR, or I input lines depending on the state of instruction lines $\mathrm{I}_{5}$ and $\mathrm{I}_{4}$ (See Table 7).

TABLE 3. SHIFT LINKAGE MULTIPLEXER INSTRUCTION CODES.

(The shift-path diagrams shown in the $M_{C}$, RAM and Q columns of the original table are not reproduced here. Z = high impedance.)

| $I_{10}$ | $I_{9}$ | $I_{8}$ | $I_{7}$ | $I_{6}$ | $\mathrm{SIO}_{0}$ | $\mathrm{SIO}_{n}$ | $\mathrm{QIO}_{0}$ | $\mathrm{QIO}_{n}$ | Loaded into $M_{C}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | 0 | 0 | Z | 0 | Z | 0 |  |
| 0 | 0 | 0 | 0 | 1 | Z | 1 | Z | 1 |  |
| 0 | 0 | 0 | 1 | 0 | Z | 0 | Z | $M_{N}$ | $\mathrm{SIO}_{0}$ |
| 0 | 0 | 0 | 1 | 1 | Z | 1 | Z | $\mathrm{SIO}_{0}$ |  |
| 0 | 0 | 1 | 0 | 0 | Z | $M_{C}$ | Z | $\mathrm{SIO}_{0}$ |  |
| 0 | 0 | 1 | 0 | 1 | Z | $M_{N}$ | Z | $\mathrm{SIO}_{0}$ |  |
| 0 | 0 | 1 | 1 | 0 | Z | 0 | Z | $\mathrm{SIO}_{0}$ |  |
| 0 | 0 | 1 | 1 | 1 | Z | 0 | Z | $\mathrm{SIO}_{0}$ | $\mathrm{QIO}_{0}$ |
| 0 | 1 | 0 | 0 | 0 | Z | $\mathrm{SIO}_{0}$ | Z | $\mathrm{QIO}_{0}$ | $\mathrm{SIO}_{0}$ |
| 0 | 1 | 0 | 0 | 1 | Z | $M_{C}$ | Z | $\mathrm{QIO}_{0}$ | $\mathrm{SIO}_{0}$ |
| 0 | 1 | 0 | 1 | 0 | Z | $\mathrm{SIO}_{0}$ | Z | $\mathrm{QIO}_{0}$ |  |
| 0 | 1 | 0 | 1 | 1 | Z | $I_{C}$ | Z | $\mathrm{SIO}_{0}$ |  |
| 0 | 1 | 1 | 0 | 0 | Z | $M_{C}$ | Z | $\mathrm{SIO}_{0}$ | $\mathrm{QIO}_{0}$ |
| 0 | 1 | 1 | 0 | 1 | Z | $\mathrm{QIO}_{0}$ | Z | $\mathrm{SIO}_{0}$ | $\mathrm{QIO}_{0}$ |
| 0 | 1 | 1 | 1 | 0 | Z | $I_{N} \oplus I_{\mathrm{OVR}}$ | Z | $\mathrm{SIO}_{0}$ |  |
| 0 | 1 | 1 | 1 | 1 | Z | $\mathrm{QIO}_{0}$ | Z | $\mathrm{SIO}_{0}$ |  |
| 1 | 0 | 0 | 0 | 0 | 0 | Z | 0 | Z | $\mathrm{SIO}_{n}$ |
| 1 | 0 | 0 | 0 | 1 | 1 | Z | 1 | Z | $\mathrm{SIO}_{n}$ |
| 1 | 0 | 0 | 1 | 0 | 0 | Z | 0 | Z |  |
| 1 | 0 | 0 | 1 | 1 | 1 | Z | 1 | Z |  |
| 1 | 0 | 1 | 0 | 0 | $\mathrm{QIO}_{n}$ | Z | 0 | Z | $\mathrm{SIO}_{n}$ |
| 1 | 0 | 1 | 0 | 1 | $\mathrm{QIO}_{n}$ | Z | 1 | Z | $\mathrm{SIO}_{n}$ |
| 1 | 0 | 1 | 1 | 0 | $\mathrm{QIO}_{n}$ | Z | 0 | Z |  |
| 1 | 0 | 1 | 1 | 1 | $\mathrm{QIO}_{n}$ | Z | 1 | Z |  |
| 1 | 1 | 0 | 0 | 0 | $\mathrm{SIO}_{n}$ | Z | $\mathrm{QIO}_{n}$ | Z | $\mathrm{SIO}_{n}$ |
| 1 | 1 | 0 | 0 | 1 | $M_{C}$ | Z | $\mathrm{QIO}_{n}$ | Z | $\mathrm{SIO}_{n}$ |
| 1 | 1 | 0 | 1 | 0 | $\mathrm{SIO}_{n}$ | Z | $\mathrm{QIO}_{n}$ | Z |  |
| 1 | 1 | 0 | 1 | 1 | $M_{C}$ | Z | 0 | Z |  |
| 1 | 1 | 1 | 0 | 0 | $\mathrm{QIO}_{n}$ | Z | $M_{C}$ | Z | $\mathrm{SIO}_{n}$ |
| 1 | 1 | 1 | 0 | 1 | $\mathrm{QIO}_{n}$ | Z | $\mathrm{SIO}_{n}$ | Z | $\mathrm{SIO}_{n}$ |
| 1 | 1 | 1 | 1 | 0 | $\mathrm{QIO}_{n}$ | Z | $M_{C}$ | Z |  |
| 1 | 1 | 1 | 1 | 1 | $\mathrm{QIO}_{n}$ | Z | $\mathrm{SIO}_{n}$ | Z |  |

TABLE 4. CARRY-IN CONTROL MULTIPLEXER INSTRUCTION CODES.

| $I_{12}$ | $I_{11}$ | $I_{5}$ | $I_{3}$ | $I_{2}$ | $I_{1}$ | $C_{0}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0 | 0 | X | X | X | X | 0 |
| 0 | 1 | X | X | X | X | 1 |
| 1 | 0 | X | X | X | X | $C_{X}$ |
| 1 | 1 | 0 | 0 | X | X | $\mu_{C}$ |
| 1 | 1 | 0 | X | 1 | X | $\mu_{C}$ |
| 1 | 1 | 0 | X | X | 1 | $\mu_{C}$ |
| 1 | 1 | 0 | 1 | 0 | 0 | $\bar{\mu}_{C}$ |
| 1 | 1 | 1 | 0 | X | X | $M_{C}$ |
| 1 | 1 | 1 | X | 1 | X | $M_{C}$ |
| 1 | 1 | 1 | X | X | 1 | $M_{C}$ |
| 1 | 1 | 1 | 1 | 0 | 0 | $\bar{M}_{C}$ |
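Table 4 can be read as a small decision procedure: $I_{12}$ and $I_{11}$ choose between a constant, the external carry and a stored carry; $I_{5}$ chooses which register supplies the stored carry; and the pattern $I_{3} I_{2} I_{1} = 100$ selects the complement. The Python sketch below restates the table under that reading. The function name `carry_in` and its argument names are hypothetical.

```python
def carry_in(i12, i11, i5, i3, i2, i1, cx, u_c, m_c):
    """Carry output C0 selected according to Table 4.

    cx  -- external carry input
    u_c -- micro status register carry flag
    m_c -- machine status register carry flag
    """
    if (i12, i11) == (0, 0):
        return 0
    if (i12, i11) == (0, 1):
        return 1
    if (i12, i11) == (1, 0):
        return cx
    flag = m_c if i5 else u_c             # I5 picks MSR or micro status carry
    invert = (i3, i2, i1) == (1, 0, 0)    # only this pattern complements
    return flag ^ 1 if invert else flag


print(carry_in(1, 1, 0, 1, 0, 0, cx=0, u_c=1, m_c=0))  # complement of micro carry
print(carry_in(1, 1, 1, 0, 1, 1, cx=0, u_c=1, m_c=0))  # MSR carry, true form
```

The three "don't care" rows for each register collapse into the single test "not 100", which is why the table needs four rows per register but the code needs only one comparison.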

TABLE 5. CONDITION CODE OUTPUT (CT) INSTRUCTION CODES.

| $I_{3-0}$ HEX | $I_{3}$ | $I_{2}$ | $I_{1}$ | $I_{0}$ | $I_{5}=I_{4}=0$ | $I_{5}=0, I_{4}=1$ | $I_{5}=1, I_{4}=0$ | $I_{5}=I_{4}=1$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | 0 | 0 | $(\mu_{N} \oplus \mu_{\mathrm{OVR}})+\mu_{Z}$ | $(\mu_{N} \oplus \mu_{\mathrm{OVR}})+\mu_{Z}$ | $(M_{N} \oplus M_{\mathrm{OVR}})+M_{Z}$ | $(I_{N} \oplus I_{\mathrm{OVR}})+I_{Z}$ |
| 1 | 0 | 0 | 0 | 1 | $(\mu_{N} \odot \mu_{\mathrm{OVR}}) \cdot \bar{\mu}_{Z}$ | $(\mu_{N} \odot \mu_{\mathrm{OVR}}) \cdot \bar{\mu}_{Z}$ | $(M_{N} \odot M_{\mathrm{OVR}}) \cdot \bar{M}_{Z}$ | $(I_{N} \odot I_{\mathrm{OVR}}) \cdot \bar{I}_{Z}$ |
| 2 | 0 | 0 | 1 | 0 | $\mu_{N} \oplus \mu_{\mathrm{OVR}}$ | $\mu_{N} \oplus \mu_{\mathrm{OVR}}$ | $M_{N} \oplus M_{\mathrm{OVR}}$ | $I_{N} \oplus I_{\mathrm{OVR}}$ |
| 3 | 0 | 0 | 1 | 1 | $\mu_{N} \odot \mu_{\mathrm{OVR}}$ | $\mu_{N} \odot \mu_{\mathrm{OVR}}$ | $M_{N} \odot M_{\mathrm{OVR}}$ | $I_{N} \odot I_{\mathrm{OVR}}$ |
| 4 | 0 | 1 | 0 | 0 | $\mu_{Z}$ | $\mu_{Z}$ | $M_{Z}$ | $I_{Z}$ |
| 5 | 0 | 1 | 0 | 1 | $\bar{\mu}_{Z}$ | $\bar{\mu}_{Z}$ | $\bar{M}_{Z}$ | $\bar{I}_{Z}$ |
| 6 | 0 | 1 | 1 | 0 | $\mu_{\mathrm{OVR}}$ | $\mu_{\mathrm{OVR}}$ | $M_{\mathrm{OVR}}$ | $I_{\mathrm{OVR}}$ |
| 7 | 0 | 1 | 1 | 1 | $\bar{\mu}_{\mathrm{OVR}}$ | $\bar{\mu}_{\mathrm{OVR}}$ | $\bar{M}_{\mathrm{OVR}}$ | $\bar{I}_{\mathrm{OVR}}$ |
| 8 | 1 | 0 | 0 | 0 | $\mu_{C}+\mu_{Z}$ | $\mu_{C}+\mu_{Z}$ | $M_{C}+M_{Z}$ | $\bar{I}_{C}+I_{Z}$ |
| 9 | 1 | 0 | 0 | 1 | $\bar{\mu}_{C} \cdot \bar{\mu}_{Z}$ | $\bar{\mu}_{C} \cdot \bar{\mu}_{Z}$ | $\bar{M}_{C} \cdot \bar{M}_{Z}$ | $I_{C} \cdot \bar{I}_{Z}$ |
| A | 1 | 0 | 1 | 0 | $\mu_{C}$ | $\mu_{C}$ | $M_{C}$ | $I_{C}$ |
| B | 1 | 0 | 1 | 1 | $\bar{\mu}_{C}$ | $\bar{\mu}_{C}$ | $\bar{M}_{C}$ | $\bar{I}_{C}$ |
| C | 1 | 1 | 0 | 0 | $\bar{\mu}_{C}+\mu_{Z}$ | $\bar{\mu}_{C}+\mu_{Z}$ | $\bar{M}_{C}+M_{Z}$ | $\bar{I}_{C}+I_{Z}$ |
| D | 1 | 1 | 0 | 1 | $\mu_{C} \cdot \bar{\mu}_{Z}$ | $\mu_{C} \cdot \bar{\mu}_{Z}$ | $M_{C} \cdot \bar{M}_{Z}$ | $I_{C} \cdot \bar{I}_{Z}$ |
| E | 1 | 1 | 1 | 0 | $I_{N} \oplus M_{N}$ | $\mu_{N}$ | $M_{N}$ | $I_{N}$ |
| F | 1 | 1 | 1 | 1 | $I_{N} \odot M_{N}$ | $\bar{\mu}_{N}$ | $\bar{M}_{N}$ | $\bar{I}_{N}$ |

Notes: 1. $\oplus$ represents EXCLUSIVE-OR.
$\odot$ represents EXCLUSIVE-NOR or coincidence.

TABLE 6. CRITERIA FOR COMPARING TWO NUMBERS FOLLOWING "A MINUS B" OPERATIONS.

| Relation | For Unsigned Numbers |  |  | For 2's Complement Numbers |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | Status | $\mathrm{I}_{3-0}$ |  | Status | $\mathrm{I}_{3-0}$ |  |
|  |  | $\mathbf{C T}=\mathrm{H}$ | $\mathbf{C T}=\mathrm{L}$ |  | $\mathbf{C T}=\mathrm{H}$ | $\mathbf{C T}=\mathbf{L}$ |
| $A=B$ | $Z=1$ | 4 | 5 | $Z=1$ | 4 | 5 |
| $A \neq B$ | $Z=0$ | 5 | 4 | $Z=0$ | 5 | 4 |
| $A \geqslant B$ | $C=1$ | A | B | $N \odot \mathrm{OVR}=1$ | 3 | 2 |
| $A<B$ | $C=0$ | B | A | $N \oplus \mathrm{OVR}=1$ | 2 | 3 |
| $A>B$ | $C \cdot \bar{Z}=1$ | D | C | $(N \odot \mathrm{OVR}) \cdot \bar{Z}=1$ | 1 | 0 |
| $A \leqslant B$ | $\bar{C}+Z=1$ | C | D | $(N \oplus \mathrm{OVR})+Z=1$ | 0 | 1 |

$\oplus=$ Exclusive OR; $\odot=$ Exclusive NOR; H = HIGH; L = LOW.
Note: For the Am2910, the CC input is active LOW, so use the $I_{3-0}$ code that produces CT = L for the desired test.
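The criteria of Table 6 can be checked numerically. The Python sketch below is an illustration under stated assumptions: "A minus B" is modeled as A plus the complement of B plus one, so that a carry of 1 means no borrow; the word width is taken as 16 bits; and the function names `subtract_flags` and `ct` are hypothetical. Only the test codes that appear in Table 6 are modeled.

```python
def subtract_flags(a, b, width=16):
    """Status flags after A minus B, computed as A + (not B) + 1."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    total = (a & mask) + ((~b) & mask) + 1
    f = total & mask
    return {
        "Z": int(f == 0),
        "C": total >> width,                       # 1 means no borrow
        "N": int(bool(f & sign)),
        "OVR": int(bool((a ^ b) & (a ^ f) & sign)),
    }


def ct(code, s):
    """Conditional test output for the I3-0 codes used in Table 6."""
    xor = s["N"] ^ s["OVR"]
    tests = {
        0x0: xor | s["Z"],             0x1: (xor ^ 1) & (s["Z"] ^ 1),
        0x2: xor,                      0x3: xor ^ 1,
        0x4: s["Z"],                   0x5: s["Z"] ^ 1,
        0xA: s["C"],                   0xB: s["C"] ^ 1,
        0xC: (s["C"] ^ 1) | s["Z"],    0xD: s["C"] & (s["Z"] ^ 1),
    }
    return tests[code]


flags = subtract_flags(0xFFFF, 0x0001)
print(ct(0xD, flags))  # unsigned: is 65535 > 1 ?
print(ct(0x2, flags))  # two's complement: is -1 < 1 ?
```

The same bit pattern 0xFFFF compares as greater than 1 when treated as unsigned and as less than 1 when treated as a two's complement number, which is why Table 6 lists separate codes for the two interpretations.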

TABLE 7. Y OUTPUT INSTRUCTION CODES.

| $\overline{\mathrm{OE}}_{Y}$ | $I_{5}$ | $I_{4}$ | Y Output | Comment |
| :---: | :---: | :---: | :---: | :---: |
| 1 | X | X | Z | Output Off, High Impedance |
| 0 | 0 | X | $\mu_{i} \rightarrow Y_{i}$ | See Note 1 |
| 0 | 1 | 0 | $M_{i} \rightarrow Y_{i}$ |  |
| 0 | 1 | 1 | $I_{i} \rightarrow Y_{i}$ |  |

Notes: 1. For the condition where $I_{5}, I_{4}, I_{3}, I_{2}, I_{1}, I_{0}$ are all LOW, $Y$ is an input. $\overline{\mathrm{OE}}_{Y}$ is "Don't Care" for this condition.
2. $X$ is a "Don't Care" condition.
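Table 7 and its Note 1 can be combined into one selection function. The sketch below is illustrative only: the function name `y_output`, the use of `None` to stand for a disabled (high-impedance or input) Y port, and passing the six instruction bits as a single integer are assumptions.

```python
def y_output(oe_y_bar, i5_0, usr, msr, i_flags):
    """Source driven onto the Y lines, or None when the Y buffer is off.

    oe_y_bar -- state of the active-LOW Y output enable (0 or 1)
    i5_0     -- instruction bits I5..I0 as a 6-bit integer
    """
    if oe_y_bar == 1 or i5_0 == 0:
        return None            # buffer disabled; Y may be used as an input
    i5 = (i5_0 >> 5) & 1
    i4 = (i5_0 >> 4) & 1
    if i5 == 0:
        return usr             # micro status register
    return i_flags if i4 else msr


usr, msr, i_in = "uSR", "MSR", "I"
print(y_output(0, 0o20, usr, msr, i_in))
print(y_output(0, 0o40, usr, msr, i_in))
print(y_output(0, 0o60, usr, msr, i_in))
print(y_output(0, 0o00, usr, msr, i_in))
```

Checking the all-LOW instruction first reflects Note 1, which overrides the output enable.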

## TIMING ANALYSIS

In the previous chapter a timing analysis was presented with the shift-linkage, carry-control, and status registers implemented in SSI and MSI. This timing analysis will be repeated with the SSI and MSI logic replaced with the Am2904. Tables 8-1, 8-2, 8-4 and 8-5 list the typical AC characteristics of the registers, Am2902A, Am2901A, Am2903, and Am2904 used in these calculations. Table 8-3 lists the assumed AC characteristics for the set-up time of the Am2904.

Figure 2 illustrates the timing analysis for an Am2901A-based design. The analysis begins with the LOW-to-HIGH transition of the system clock. All signals must be valid by the next LOW-to-HIGH transition of the system clock, i.e., one microcycle later.
Figure 3 illustrates a similar timing analysis for the Am2903. The results of both analyses are listed in Table 9.

## USING THE Am2904 IN A 16-BIT DESIGN

Perhaps the best way to understand the Am2904 is to compare 16-bit ALU designs with and without it. The first design, Figure 4a, is a 16-bit CPU that uses SSI/MSI parts instead of the Am2904. In the second design, Figure 4b, the Am2904 replaces the SSI/MSI parts, substituting for the shift matrix control and the status registers. A more detailed comparison may be obtained by referring to the 16-bit ALU designs in Chapter III and the one in Appendix C of this chapter. To explain the Am2904 further, its usage is described first through the microprogram bits of the microprogram structure and later in the actual microprograms.

TABLE 8-1. STANDARD DEVICE SCHOTTKY SPEEDS.

| Device and Path | Min. | Typ. | Max. |
| :--- | :---: | :---: | :---: |
| S-REGISTER |  |  |  |
| Clock to Output |  | 9 | 15 |
| $\overline{O E}$ to Output | 5 | 13 | 20 |
| Set-up |  | 2 |  |
| Am2902A |  | 7 |  |
| Cn to $C n+x, Y, Z$ |  | 7 | 11 |
| G, $P$ to $G, P$ | 10 |  |  |
| G, $P$ to $C n+x, Y, Z$ |  | 5 | 7 |

TABLE 8-2.
PRELIMINARY SWITCHING CHARACTERISTICS.
Combinational Delays (ns)

| From (Input) | To (Output) | $\mathrm{t}_{\mathrm{pd}}$ |
| :---: | :---: | :---: |
| $I_{Z}$ <br> $I_{C}$ <br> $I_{N}$ <br> $I_{OVR}$ | $Y_{Z}$ <br> $Y_{C}$ <br> $Y_{N}$ <br> $Y_{OVR}$ | 20 |
| CP | $Y_{Z}, Y_{C}, Y_{N}, Y_{OVR}$ | 30 |
| $I_{4}, I_{5}$ | $Y_{Z}, Y_{C}, Y_{N}, Y_{OVR}$ | 23 |
| $I_{Z}, I_{C}, I_{N}, I_{OVR}$ | CT | 30 |
| CP | CT | 30 |
| $I_{0}-I_{5}$ | CT | 30 |
| $C_{X}$ | $C_{0}$ | 12 |
| CP | $C_{0}$ | 20 |
| $I_{1,2,3,5,11,12}$ | $C_{0}$ | 24 |
| $SIO_{n}, QIO_{n}$ | $SIO_{0}$ | 16 |
| $SIO_{0}, QIO_{0}$ | $SIO_{n}$ | 16 |
| $I_{C}, I_{N}, I_{OVR}$ | $SIO_{n}$ | 20 |
| $SIO_{n}, QIO_{n}$ | $QIO_{0}$ | 16 |
| $SIO_{0}, QIO_{0}$ | $QIO_{n}$ | 16 |
| CP | $SIO_{0}, SIO_{n}$ <br> $QIO_{0}, QIO_{n}$ | 21 |
| $I_{6}-I_{10}$ | $SIO_{0}, SIO_{n}$ <br> $QIO_{0}, QIO_{n}$ | 19 |

TABLE 8-3. ASSUMED SET-UP TIME.*

| Input | TS |
| :---: | :---: |
| IOVR, IZ, IN, IC | 20ns |

[^1]Am2901A - (MAY 18, 1978)
ROOM TEMPERATURE SWITCHING CHARACTERISTICS
Tables I, II, and III below define the timing characteristics of the Am2901A at $25^{\circ} \mathrm{C}$. The tables are divided into three types of parameters: clock characteristics, combinational delays from inputs to outputs, and set-up and hold time requirements. The latter table defines the time prior to the end of the cycle (i.e., clock LOW-to-HIGH transition) that each input must be stable to guarantee that the correct data is written into one of the internal registers.
All values are at $25^{\circ} \mathrm{C}$ and 5.0 V. Measurements are made at 1.5 V with $V_{IL}=0 \mathrm{~V}$ and $V_{IH}=3.0 \mathrm{~V}$. For three-state disable tests, $C_{L}=5.0 \mathrm{pF}$ and measurement is to 0.5 V change on output voltage level. All outputs fully loaded.

TABLE 8-4.
TABLE I
CYCLE TIME AND CLOCK CHARACTERISTICS

| TIME | TYPICAL | GUARANTEED |
| :--- | :---: | :---: |
| Read-Modify-Write Cycle <br> (time from selection of <br> A, B registers to end of <br> cycle) | 55 ns | 93 ns |
| Maximum Clock Frequency to <br> Shift Q Register (50\% duty <br> cycle) | 40 MHz | 20 MHz |
| Minimum Clock LOW Time | 30 ns | 30 ns |
| Minimum Clock HIGH Time | 30 ns | 30 ns |
| Minimum Clock Period | 75 ns | 93 ns |

TABLE II
COMBINATIONAL PROPAGATION DELAYS (all in ns, $C_{L}=50 \mathrm{pF}$ (except output disable tests))

|  | TYPICAL $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  |  |  |  |  |  |  | GUARANTEED $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| From Input | $Y$ | $F_{3}$ | $C_{n+4}$ | G, P | $F=0$ <br> $R_{L}=270$ | OVR | Shift Outputs <br> $RAM_{0}$, $RAM_{3}$ | Shift Outputs <br> $Q_{0}$, $Q_{3}$ | $Y$ | $F_{3}$ | $C_{n+4}$ | G, P | $F=0$ <br> $R_{L}=270$ | OVR | Shift Outputs <br> $RAM_{0}$, $RAM_{3}$ | Shift Outputs <br> $Q_{0}$, $Q_{3}$ |
| A, B | 45 | 45 | 45 | 40 | 65 | 50 | 60 | - | 75 | 75 | 70 | 59 | 85 | 76 | 90 | - |
| D (arithmetic mode) | 30 | 30 | 30 | 25 | 45 | 30 | 40 | - | 39 | 37 | 41 | 31 | 55 | 45 | 59 | - |
| D (I = X37) (Note 5) | 30 | 30 | - | - | 45 | - | 40 | - | 36 | 34 | - | - | 51 | - | 53 | - |
| $C_{n}$ | 20 | 20 | 10 | - | 35 | 20 | 30 | - | 27 | 24 | 20 | - | 46 | 26 | 45 | - |
| $I_{012}$ | 35 | 35 | 35 | 25 | 50 | 40 | 45 | - | 50 | 50 | 46 | 41 | 65 | 57 | 70 | - |
| $I_{345}$ | 35 | 35 | 35 | 25 | 45 | 35 | 45 | - | 50 | 50 | 50 | 42 | 65 | 59 | 70 | - |
| $I_{678}$ | 15 | - | - | - | - | - | 20 | 20 | 26 | - | - | - | - | - | 26 | 26 |
| $\overline{\mathrm{OE}}$ Enable/Disable | 20/20 | - | - | - | - | - | - | - | 30/33 | - | - | - | - | - | - | - |
| A bypassing ALU (I = 2XX) | 30 | - | - | - | - | - | - | - | 35 | - | - | - | - | - | - | - |
| Clock (Note 6) | 40 | 40 | 40 | 30 | 55 | 40 | 55 | 20 | 52 | 52 | 52 | 41 | 70 | 57 | 71 | 30 |

TABLE III
SET-UP AND HOLD TIMES (all in ns) (Note 1)

| From Input | Notes | TYPICAL $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  | GUARANTEED $25^{\circ} \mathrm{C}, 5.0 \mathrm{~V}$ |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | Set-Up Time | Hold Time | Set-Up Time | Hold Time |
| A, B Source | 2, 4 <br> 3, 5 | 40 <br> $t_{pw}L+15$ | 0 | 93 <br> $t_{pw}L+25$ | 0 |
| B Dest. | 2, 4 | $t_{pw}L+15$ | 0 | $t_{pw}L+15$ | 0 |
| D (arithmetic mode) |  | 25 | 0 | 70 | 0 |
| D (I = X37) (Note 5) |  | 25 | 0 | 60 | 0 |
| $C_{n}$ |  | 40 | 0 | 55 | 0 |
| $I_{012}$ |  | 30 | 0 | 64 | 0 |
| $I_{345}$ |  | 30 | 0 | 70 | 0 |
| $I_{678}$ | 4 | $t_{pw}L+15$ | 0 | $t_{pw}L+25$ | 0 |
| $RAM_{0,3}$, $Q_{0,3}$ |  | 15 | 0 | 20 | 0 |

Notes: 1. See next page.
2. If the B address is used as a source operand, allow for the "A, B source" set-up time; if it is used only for the destination address, use the "B dest." set-up time.
3. Where two numbers are shown, both must be met.
4. "$t_{pw}L$" is the clock LOW time.
5. $D \vee 0$ is the fastest way to load the RAM from the D inputs. This function is obtained with I = 337.
6. Using Q register as source operand in arithmetic mode. Clock is not normally in critical speed path when Q is not a source.

TABLE 8-5.

## A. Am2903 SWITCHING CHARACTERISTICS (TYPICAL ROOM TEMPERATURE PERFORMANCE) - (MAY 18, 1978)

Tables IA, IIA, and IIIA define the nominal timing characteristics of the Am2903 at $25^{\circ} \mathrm{C}$ and 5.0 V. The tables divide the parameters into three types: pulse characteristics for the clock and write enable, combinational delays from input to output, and set-up and hold times relative to the clock and write pulse.
Measurements are made at 1.5 V with $V_{IL}=0 \mathrm{~V}$ and $V_{IH}=3.0 \mathrm{~V}$. For three-state disable tests, $C_{L}=5.0 \mathrm{pF}$ and measurement is to 0.5 V change on output voltage level.

TABLE IA - Write Pulse and Clock Characteristics

| Time |  |
| :--- | :---: |
| Minimum Time CP and WE both LOW <br> to write | 15 ns |
| Minimum Clock LOW Time | 15 ns |
| Minimum Clock HIGH Time | 35 ns |

TABLE IIA - Combinational Propagation Delays (All in ns)
Outputs Fully Loaded. CL $=50 \mathrm{pF}$ (except output disable tests)

| To Output <br> From Input | Y | $C_{n+4}$ | $\overline{G}, \overline{P}$ | Z (3) | N | OVR | DB | $\overline{\text{WRITE}}$ | $QIO_{0}, QIO_{3}$ | $SIO_{0}$ | $SIO_{3}$ | $SIO_{0}$ <br> (Parity) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| A, B Addresses (Arith. Mode) | 65 | 60 | 56 | - | 64 | 70 | 33 | - | - | 65 | 69 | 87 |
| A, B Addresses (Logic Mode) | 56 | - | 46 | - | 56 | - | 33 | - | - | 55 | 64 | 81 |
| DA, DB Inputs | 39 | 38 | 30 | - | 40 | 56 | - | - | - | 39 | 47 | 60 |
| $\overline{EA}$ | 38 | 33 | 26 | - | 36 | 41 | - | - | - | 36 | 41 | 58 |
| $C_{n}$ | 25 | 21 | - | - | 20 | 38 | - | - | - | 21 | 25 | 48 |
| $I_{0}$ | 40 | 31 | 24 | - | 37 | 42 | - | 15(1) | - | 41 | 39 | 63 |
| $I_{4321}$ | 45 | 45 | 32 | - | 44 | 52 | - | 17(1) | - | 45 | 51 | 68 |
| $I_{8765}$ | 25 | - | - | - | - | - | - | 21 | 22/29(2) | 24/17(2) | 27/17(2) | 24/17(2) |
| $\overline{IEN}$ | - | - | - | - | - | - | - | 10 | - | - | - | - |
| $\overline{OEB}$ Enable/Disable | - | - | - | - | - | - | 12/15(2) | - | - | - | - | - |
| $\overline{OEY}$ Enable/Disable | 14/14(2) | - | - | - | - | - | - | - | - | - | - | - |
| $SIO_{0}, SIO_{3}$ | 13 | - | - | - | - | - | - | - | - | - | 19 | 20 |
| Clock | 58 | 57 | 40 | - | 56 | 72 | 24 | - | 28 | 56 | 63 | 76 |
| Y | - | - | - | 16 | - | - | - | - | - | - | - | - |
| $\overline{MSS}$ | 25 | - | 25 | - | 25 | 25 | - | - | - | 24 | 27 | 24 |

Notes: 1. Applies only when leaving special functions.
2. Enable/Disable. Enable is defined as output active and correct. Disable is a three-state output turning off.
3. For delay from any input to Z, use input to Y plus Y to Z.

TABLE IIIA - Set-Up and Hold Times (All in ns)
CAUTION: READ NOTES TO TABLE III. NA = Not Applicable; no timing constraint.

| Input | With Respect to this Signal | HIGH-to-LOW |  | LOW-to-HIGH |  | Comment |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | Set-up | Hold | Set-up | Hold |  |
| Y | Clock | NA | NA | 9 | -3 | To store Y in RAM or Q |
| $\overline{WE}$ HIGH | Clock | 5 | Note 2 | Note 2 | 0 | To Prevent Writing |
| $\overline{WE}$ LOW | Clock | NA | NA | 15 | 0 | To Write into RAM |
| A, B as Sources | Clock | 19 | -3 | NA | NA | See Note 3 |
| B as a Destination | Clock and $\overline{WE}$ both LOW | -4 | Note 4 | Note 4 | -3 | To Write Data only into the Correct B Address |
| $QIO_{0}, QIO_{3}$ | Clock | NA | NA | 10 | -4 | To Shift Q |
| $I_{8765}$ | Clock | 2 | Note 5 | Note 5 | -18 |  |
| $\overline{IEN}$ HIGH | Clock | 10 | Note 2 | Note 2 | 0 | To Prevent Writing into Q |
| $\overline{IEN}$ LOW | Clock | NA | NA | 10 | -5 | To Write into Q |



Figure 2-1.


Figure 2-2.


Figure 2-3.


Figure 2-4.


Figure 2-5.


Figure 3-1.


Figure 3-2.


Figure 3-3.


Figure 3-4.


Figure 3-5.


Figure 4a.


Figure 4b.

TABLE 9. TIMING ANALYSIS SUMMARY (ns).

| Operation | Am2901A | Am2903 |
| :--- | :---: | :---: |
| Logic | 94 | 101 |
| Arithmetic | 109 | 131 |
| Logic w/Shift | 113 | 138 |
| Two's Complement <br> Arithmetic with <br> Shift Down | 109 | 161 |
| Magnitude only <br> Arithmetic with <br> Shift Down | 142 |  |

## THE MICROPROGRAM STRUCTURE

The functions of the pipelined (PL) microprogram bits are illustrated in Figure 5 and as follows:

Bits PL0 through PL11: This is a shared control field. The field is used for branching to a microprogram address, for loading the CCU counter, or as control bits for I/O.

Bit PL12: Determines the use of the shared control field: LOW for branching and counting, HIGH for I/O control.

Bit PL13: When LOW, enables the WRITE output and allows the Q Register and Sign Compare flip-flop to be written into.

Bits PL14 and PL15: The $\overline{\mathrm{CE}}_{\mu}$ and $\overline{\mathrm{SE}}$ control inputs of the Am2904, respectively. $\overline{\mathrm{CE}}_{\mu}$ enables the Micro Status Register. $\overline{\mathrm{SE}}$ enables the Am2904 shift operations.

Bits PL16 through PL19: CCU Next Address.

Bits PL20 through PL23: CCU Multiplex test select.

Bit PL24: This bit determines the polarity of the incoming test signal to the CCU.

Bit PL25: Active LOW Instruction Register enable.

Bits PL26 through PL29: CCU multi-way branching select.

Bits PL30 through PL32: Selects the ALU operand sources.

| PL30 | PL31 | PL32 | ALU Operand R | ALU Operand S |
| :--- | :---: | :---: | :---: | :--- |
| L | L | L | RAM Output A | RAM Output B |
| L | L | H | RAM Output A | $DB_{0-3}$ |
| L | H | X | RAM Output A | Q Register |
| H | L | L | $DA_{0-3}$ | RAM Output B |
| H | L | H | $DA_{0-3}$ | $DB_{0-3}$ |
| H | H | X | $DA_{0-3}$ | Q Register |

L = LOW
H = HIGH
X = Don't Care

Bits PL33 through PL36: Selects the ALU functions.

| $\mathrm{I}_{4}$ | $\mathrm{I}_{3}$ | $\mathrm{I}_{2}$ | $\mathrm{I}_{1}$ | Hex Code | ALU Functions |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| L | L | L | L | 0 | $\mathrm{I}_{0}=\mathrm{L}$ | Special Functions |
|  |  |  |  |  | $\mathrm{I}_{0}=\mathrm{H}$ | $\mathrm{F}_{\mathrm{i}}=\mathrm{HIGH}$ |
| L | L | L | H | 1 | $\mathrm{F}=\mathrm{S}$ Minus R Minus 1 Plus $\mathrm{C}_{\mathrm{n}}$ |  |
| L | L | H | L | 2 | $F=R$ Minus $S$ Minus 1 Plus $\mathrm{C}_{\mathrm{n}}$ |  |
| L | L | H | H | 3 | $F=R$ Plus $S$ Plus $\mathrm{C}_{\mathrm{n}}$ |  |
| L | H | L | L | 4 | $F=S$ Plus $C_{n}$ |  |
| L | H | L | H | 5 | $F=\bar{S}$ Plus $\mathrm{C}_{\mathrm{n}}$ |  |
| L | H | H | L | 6 | $F=R$ Plus $C_{n}$ |  |
| L | H | H | H | 7 | $\mathrm{F}=\overline{\mathrm{R}}$ Plus $\mathrm{C}_{\mathrm{n}}$ |  |
| H | L | L | L | 8 | $\mathrm{F}_{\mathrm{i}}$ = LOW |  |
| H | L | L | H | 9 | $F_{i}=\overline{R}_{i}$ AND $S_{i}$ |  |
| H | L | H | L | A | $F_{i}=R_{i}$ EXCLUSIVE NOR $S_{i}$ |  |
| H | L | H | H | B | $F_{i}=R_{i}$ EXCLUSIVE OR $S_{i}$ |  |
| H | H | L | L | C | $F_{i}=R_{i}$ AND $S_{i}$ |  |
| H | H | L | H | D | $F_{i}=R_{i}$ NOR $S_{i}$ |  |
| H | H | H | L | E | $F_{i}=R_{i}$ NAND $S_{i}$ |  |
| H | H | H | H | F | $F_{i}=R_{i}$ OR $S_{i}$ |  |
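The ALU function table can be exercised with a small model. The sketch below is a behavioral Python illustration for a single 4-bit slice, not a description of the Am2903 internals. The function name `alu` is hypothetical, code 0 is modeled only for $I_0$ = HIGH (all outputs HIGH), and the special functions are deliberately left out. "Minus 1" forms are written as an addition of the one's complement, which is equivalent modulo 16.

```python
WIDTH = 4                      # one slice, assumed for this sketch
MASK = (1 << WIDTH) - 1


def alu(code, r, s, cn=0):
    """Return the 4-bit F output for function code I4..I1 (hex 0 to F)."""
    nr, ns = ~r & MASK, ~s & MASK      # one's complements of R and S
    table = {
        0x0: MASK,            # Fi = HIGH (I0 = HIGH case only)
        0x1: s + nr + cn,     # S minus R minus 1 plus Cn
        0x2: r + ns + cn,     # R minus S minus 1 plus Cn
        0x3: r + s + cn,      # R plus S plus Cn
        0x4: s + cn,          # S plus Cn
        0x5: ns + cn,         # complement of S, plus Cn
        0x6: r + cn,          # R plus Cn
        0x7: nr + cn,         # complement of R, plus Cn
        0x8: 0,               # Fi = LOW
        0x9: nr & s,          # (NOT Ri) AND Si
        0xA: ~(r ^ s),        # Ri EXCLUSIVE NOR Si
        0xB: r ^ s,           # Ri EXCLUSIVE OR Si
        0xC: r & s,           # Ri AND Si
        0xD: ~(r | s),        # Ri NOR Si
        0xE: ~(r & s),        # Ri NAND Si
        0xF: r | s,           # Ri OR Si
    }
    return table[code] & MASK
```

With $C_n$ = 1, codes 1 and 2 give true two's complement subtraction; with $C_n$ = 0 they give the difference minus one, which is how a borrow propagates between slices.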

Bits PL37 through PL40: Selects the ALU destination controls.

| $\mathrm{I}_{8}$ | $\mathrm{I}_{7}$ | $\mathrm{I}_{6}$ | $\mathrm{I}_{5}$ | Hex Code | Special Function |
| :---: | :---: | :---: | :---: | :---: | :---: |
| L | L | L | L | 0 | Unsigned Multiply |
| L | L | H | L | 2 | Two's Complement Multiply |
| L | H | L | L | 4 | Increment by One or Two |
| L | H | L | H | 5 | Sign/Magnitude-Two's Complement |
| L | H | H | L | 6 | Two's Complement Multiply, Last Cycle |
| H | L | L | L | 8 | Single Length Normalize |
| H | L | H | L | A | Double Length Normalize and First Divide Op |
| H | H | L | L | C | Two's Complement Divide |
| H | H | H | L | E | Two's Complement Divide, Correction and Remainder |

Bits PL41 through PL44: This 4-bit wide field is used for the A-address source.

Bits PL45 through PL48: This 4-bit wide field is used for the B-address source.

Bits PL49 through PL52: This 4-bit wide field is the B destination address into which new data is written.

Bit PL53: Am2903 control input $\overline{\mathrm{OE}}_{Y}$. When LOW, enables the ALU shifter output data onto the Y bus.

Bits PL54 through PL59: Am2904 instruction code field.

Bits PL60 through PL63: Am2904 shift linkage multiplexer instruction.

Bits PL64 and PL65: Am2904 "carry-in" control multiplexer field.

Bits PL66 through PL68: The $\overline{\mathrm{CE}}_{M}$, $\overline{\mathrm{OE}}_{CT}$, $\overline{\mathrm{OE}}_{Y}$ control inputs of the Am2904, respectively.

Bit PL69: When LOW, enables bits PL74 through PL89 onto the Am2903 DA bus.

Bit PL70: When LOW, zeros the carry-ins to the Am2903 slices.

Bit PL71: When HIGH, enables a status register used in BCD calculations.

Bit PL72: When LOW, clears the status register.

Bit PL73: When LOW, enables Am2909/11 registers.

Bits PL74 through PL89: This field contains a 16-bit constant from microcode that is passed to the Am2903s via the DA bus. The constant is enabled by PL69.
$I_{0}$ OR $I_{1}$ OR $I_{2}$ OR $I_{3}$ OR $I_{4}$ = HIGH, $\overline{IEN}$ = LOW

| $I_{8}$ | $I_{7}$ | $I_{6}$ | $I_{5}$ | Hex Code | ALU Shifter Function | $SIO_{3}$ (Most Sig. Slice) | $SIO_{3}$ (Other Slices) | $Y_{3}$ (Most Sig. Slice) | $Y_{3}$ (Other Slices) | $Y_{2}$ (Most Sig. Slice) | $Y_{2}$ (Other Slices) | $Y_{1}$ | $Y_{0}$ | $SIO_{0}$ | $\overline{\text{WRITE}}$ | Q Reg & Shifter Function | $QIO_{3}$ | $QIO_{0}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| L | L | L | L | 0 | Arith. $F/2 \rightarrow Y$ | Input | Input | $F_{3}$ | $SIO_{3}$ | $SIO_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Hold | Hi-Z | Hi-Z |
| L | L | L | H | 1 | Log. $F/2 \rightarrow Y$ | Input | Input | $SIO_{3}$ | $SIO_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Hold | Hi-Z | Hi-Z |
| L | L | H | L | 2 | Arith. $F/2 \rightarrow Y$ | Input | Input | $F_{3}$ | $SIO_{3}$ | $SIO_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Log. $Q/2 \rightarrow Q$ | Input | $Q_{0}$ |
| L | L | H | H | 3 | Log. $F/2 \rightarrow Y$ | Input | Input | $SIO_{3}$ | $SIO_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | L | Log. $Q/2 \rightarrow Q$ | Input | $Q_{0}$ |
| L | H | L | L | 4 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | L | Hold | Hi-Z | Hi-Z |
| L | H | L | H | 5 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | H | Log. $Q/2 \rightarrow Q$ | Input | $Q_{0}$ |
| L | H | H | L | 6 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | H | $F \rightarrow Q$ | Hi-Z | Hi-Z |
| L | H | H | H | 7 | $F \rightarrow Y$ | Input | Input | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Parity | L | $F \rightarrow Q$ | Hi-Z | Hi-Z |
| H | L | L | L | 8 | Arith. $2F \rightarrow Y$ | $F_{2}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $SIO_{0}$ | Input | L | Hold | Hi-Z | Hi-Z |
| H | L | L | H | 9 | Log. $2F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $SIO_{0}$ | Input | L | Hold | Hi-Z | Hi-Z |
| H | L | H | L | A | Arith. $2F \rightarrow Y$ | $F_{2}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $SIO_{0}$ | Input | L | Log. $2Q \rightarrow Q$ | $Q_{3}$ | Input |
| H | L | H | H | B | Log. $2F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{1}$ | $F_{0}$ | $SIO_{0}$ | Input | L | Log. $2Q \rightarrow Q$ | $Q_{3}$ | Input |
| H | H | L | L | C | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Hi-Z | H | Hold | Hi-Z | Hi-Z |
| H | H | L | H | D | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Hi-Z | H | Log. $2Q \rightarrow Q$ | $Q_{3}$ | Input |
| H | H | H | L | E | $SIO_{0} \rightarrow Y_{0}, Y_{1}, Y_{2}, Y_{3}$ | $SIO_{0}$ | $SIO_{0}$ | $SIO_{0}$ | $SIO_{0}$ | $SIO_{0}$ | $SIO_{0}$ | $SIO_{0}$ | $SIO_{0}$ | Input | L | Hold | Hi-Z | Hi-Z |
| H | H | H | H | F | $F \rightarrow Y$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{3}$ | $F_{2}$ | $F_{2}$ | $F_{1}$ | $F_{0}$ | Hi-Z | L | Hold | Hi-Z | Hi-Z |

The Am2903 special functions can be selected by the following conditions: $I_{0}=I_{1}=I_{2}=I_{3}=I_{4}=$ LOW, $\overline{IEN}=$ LOW.


Figure 5.
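The field positions listed above lend themselves to a small table-driven helper for packing and unpacking microwords. The following Python sketch covers a subset of the fields and rests on assumptions that the text does not state: that the microword is held as one integer with PL0 as the least significant bit, and that the field names (`alu_function`, `constant`, and so on) are hypothetical labels chosen here for illustration. AMDASM itself is not modeled.

```python
# (first bit, last bit) of each field, numbered as in the PL0..PL89 listing.
# Assumption for this sketch: PL0 is the least significant bit of the integer.
FIELDS = {
    "shared_control": (0, 11),
    "shared_select": (12, 12),
    "ccu_next_address": (16, 19),
    "ccu_test_select": (20, 23),
    "alu_operand_source": (30, 32),
    "alu_function": (33, 36),
    "alu_destination": (37, 40),
    "a_address": (41, 44),
    "b_address": (45, 48),
    "b_destination": (49, 52),
    "am2904_instruction": (54, 59),
    "shift_linkage": (60, 63),
    "constant": (74, 89),
}


def get_field(word, name):
    """Extract one named field from a microword."""
    first, last = FIELDS[name]
    width = last - first + 1
    return (word >> first) & ((1 << width) - 1)


def set_field(word, name, value):
    """Return a copy of the microword with one named field replaced."""
    first, last = FIELDS[name]
    width = last - first + 1
    if value < 0 or value >> width:
        raise ValueError(f"{name} does not fit in {width} bits")
    mask = ((1 << width) - 1) << first
    return (word & ~mask) | (value << first)


# Example: select F = R plus S plus Cn with A = R0, B = R1 and a constant.
word = 0
word = set_field(word, "alu_function", 0x3)
word = set_field(word, "a_address", 0)
word = set_field(word, "b_address", 1)
word = set_field(word, "constant", 0x00FF)
```

Keeping the layout in one table means a change to the microword format touches a single place, which is the same motivation behind defining the fields once in the assembler's definition phase.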

## SOME SAMPLE MICROROUTINES

The following algorithms are implemented using the Am2903 Superslices™ and the Am2904 status and shift control unit. The algorithms were developed with the aid of AMDASM on System 29. All algorithms assume that values and constants have been initialized before the algorithm is entered. Appendix A relates the actual microcode to the microword fields. Appendix B contains the AMDASM Phase 1 and Phase 2 listings of the microprograms and the definitions of mnemonics. Figure 4b is a block diagram of the CPU hardware, including the Am2904 Status and Shift Control Unit, from which the microroutines were developed. A detailed diagram of the CPU hardware is in Appendix C.

## Normalization, Single- and Double-Length

Normalization is used as a means of referencing a number to a fixed radix point. Normalization strips out all leading sign bits such that the two bits immediately adjacent to the radix point are of opposite polarity.
Normalization is commonly used in such operations as fixed-to-floating-point conversion and division. The Am2903 provides for normalization through the Single-Length and Double-Length Normalize commands. Figure 6a represents the Q Register of a 16-bit processor which contains a positive number. When the Single-Length Normalize command is applied, each positive edge of the clock causes the bits to shift toward the most significant bit (bit 15) of the Q Register. Zeros are shifted in via the $QIO_{0}$ port. When the bits on either side of the radix point (bits 14 and 15) are of opposite value, the number is considered to be normalized, as shown in Figure 6b. Normalization is indicated externally by a HIGH level on the $C_{n+4}$ pin of the most significant slice ($C_{n+4}$ MSS $= Q_{3}$ MSS $\oplus\ Q_{2}$ MSS).
There is also provision for a normalization indication via the OVR pin one microcycle before the same indication is available on the $C_{n+4}$ pin (OVR $= Q_{2}$ MSS $\oplus\ Q_{1}$ MSS). This is for use in applications that require a stage of register buffering of the normalization indication.
Since a number composed of all zeros cannot be normalized, the Am2903 indicates when such a condition arises. If the Q Register is zero and the Single-Length Normalize command is given, a HIGH level will be present on the Z line.


Figure 6.

The sign output, N, indicates the sign of the number stored in the Q Register, $Q_{3}$ MSS. An unnormalized negative number (Figure 7a) is normalized in the same manner as a positive number. The results of single-length normalization are shown in Figure 7b. The device interconnection for single-length normalization is outlined in Figure 8. During single-length normalization, the number of shifts performed to achieve normalization can be counted and stored in one of the working registers. This is achieved by forcing a HIGH at the $C_{n}$ input of the least significant slice, since during this special function the ALU performs the function $[B]+C_{n}$ and the result is stored in B. Figure 9 illustrates the single-length normalize flowchart, and the microcode is shown in Figure 10. Microcode for both single- and double-length normalization can be reduced by one step by testing for zero while the number is passed into Q.
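The single-length normalize operation described above can be illustrated with a short behavioral model. This Python sketch is an assumption-laden illustration rather than a cycle-accurate model of the Am2903: the function name `single_length_normalize` is hypothetical, each loop iteration stands for one clock of the Single-Length Normalize command, and the returned shift count plays the role of the working register that is incremented through $C_n$.

```python
def single_length_normalize(q, width=16):
    """Shift q toward the MSB until its two most significant bits differ.

    Returns (normalized value, number of shifts). A zero input cannot be
    normalized; it is returned unchanged with a shift count of 0, which
    corresponds to the case the hardware flags with Z = HIGH.
    """
    mask = (1 << width) - 1
    q &= mask
    if q == 0:
        return 0, 0
    shifts = 0
    # Normalized when the top two bits differ (their XOR is 1).
    while (((q >> (width - 1)) ^ (q >> (width - 2))) & 1) == 0:
        q = (q << 1) & mask        # a zero enters at the least significant bit
        shifts += 1
    return q, shifts
```

A positive number ends with bit 15 = 0 and bit 14 = 1, while a negative number ends with bit 15 = 1 and bit 14 = 0; the same loop handles both because only the inequality of the two bits is tested.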
Normalizing a double-length word can be done with the Double-Length Normalize command, which assumes that a user-selected RAM Register contains the most significant portion of the word to be normalized while the Q Register holds the least significant half (Figure 11). The device interconnection for double-length normalization is shown in Figure 12. The $C_{n+4}$, OVR, N, and Z outputs of the most significant slice perform the same functions in double-length normalization as they did in single-length normalization, except that $C_{n+4}$, OVR, and N are derived from the output of the ALU of the most significant slice in the case of double-length normalization, instead of the Q Register of the most significant slice as in single-length normalization. A HIGH level on the Z line in double-length normalization reveals that the outputs of the ALU and Q Register are both zero, hence indicating that the double-length word is zero.
When double-length normalization is being performed, shift counting is done either with an extra microcycle or with an external counter. Figure 13 illustrates the double-length normalize flowchart and Figure 14 shows the microcode.
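The double-length case can be sketched in the same way by treating the RAM register and the Q Register as one long word. The Python below is a behavioral illustration under stated assumptions: `double_length_normalize` is a hypothetical name, `high` stands for the user-selected RAM register, `low` stands for the Q Register, and the shift count is kept in a plain variable standing in for the extra microcycle or external counter mentioned above.

```python
def double_length_normalize(high, low, width=16):
    """Normalize the double-length word high:low.

    Returns (high, low, shifts). An all-zero word is returned unchanged
    with a shift count of 0 (the Z = HIGH case).
    """
    half_mask = (1 << width) - 1
    total = 2 * width
    mask = (1 << total) - 1
    value = ((high & half_mask) << width) | (low & half_mask)
    if value == 0:
        return 0, 0, 0
    shifts = 0
    # Shift up until the two most significant bits of the long word differ.
    while (((value >> (total - 1)) ^ (value >> (total - 2))) & 1) == 0:
        value = (value << 1) & mask    # MSB of low moves into LSB of high
        shifts += 1
    return value >> width, value & half_mask, shifts
```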

a) Unnormalized Negative Single Length Number.

b) Normalized Negative Single Length Number.

Figure 7.


Figure 8. Single Length Normalize.

## Unsigned Multiply

This Special Function allows for easy implementation of unsigned multiplication. Figure 15 is the unsigned multiply flow chart. The algorithm requires that initially the RAM word addressed by Address port B be zero, that the multiplier be in the Q Register, and that the multiplicand be in the register addressed by Address port A. The initial conditions for the execution of the algorithm are that: 1) register $R_{1}$ be reset to zero; 2) the multiplicand be in $R_{0}$; and 3) the multiplier be in $R_{15}$. The first operation transfers the multiplier, $R_{15}$, to the Q Register. The Unsigned Multiply instruction is then executed 16 times. During the Unsigned Multiply instruction, $R_{1}$ is addressed by RAM address port B and the multiplicand is addressed by RAM address port A.
When the Unsigned Multiply command is given, the Z pin of device 1 becomes an output while the Z pins of the remaining devices are specified as inputs, as shown in Figure 18. The Z output of device 1 is the same state as the least significant bit of the multiplier in the Q Register. The Z output of device 1 informs


Figure 9. Single Length Normalize.


Figure 10.


Figure 11. Double Length Word.


Figure 12. Double Length Normalize.


Figure 13. Double Length Normalize.

|  |  |
| :--- | :--- |
| 0148 | DLN R15,R15,0FF \& CONT \& SHOLD |
| 0149 | MAZ \& T \& CJP \& GOTO ABORT |
| 014A | LOW R2 \& MAC \& T \& CJP \& GOTO END2 |
| 014B | DLN R15,R15 \& SDUL \& MAO \& T \& CJP \& GOTO JUMP1 |
| 014C | LOOP4 |
| 014D | DLN R15,R15 \& SDUL \& MIO \& T \& CJP \& GOTO JUMP1 |
| 014E | JUMP1 |
| 014F |  |
|  | PAR R2,R2 \& JP ONE \& GOTO LOOP4 |
|  | SDRQ R15, R15 \& SDMS \& END |

Figure 14.


Figure 15. Unsigned $16 \times 16$ Multiply.
the ALUs of all the slices, via their Z pins, to add the partial product (referenced by the B address port) to the multiplicand (referenced by the A address port) if $Z=1$. If $Z=0$, the output of the ALU is simply the partial product (referenced by the B address port). Since $C_{n}$ is held LOW, it is not a factor in the computation. Each positive-going edge of the clock will internally shift the ALU outputs toward the least significant bit and simultaneously store the shifted results in the register selected by the B address port, thus becoming the new partial sum. During the down shifting process, the $C_{n+4}$ generated in device 4 is internally shifted into the $Y_{3}$ position of device 4. At this time, one bit of the multiplier will down shift out of the $QIO_{0}$ port of each device into the $QIO_{3}$ port of the next less significant slice. The partial product is shifted down between chips in a like manner, between the $SIO_{0}$ and $SIO_{3}$ ports, with $SIO_{0}$ of device 1 being connected to $QIO_{3}$ of device 4 for purposes of constructing a 32-bit long register to hold the 32-bit product. Shifting of the partial product between the B address and Q registers is accomplished via the Am2904. At the finish of the $16 \times 16$ multiply, the most significant 16 bits of the product will be found in the register referenced by the B address lines while the least significant 16 bits are stored in the Q Register. Using a typical Computer Control Unit (CCU), as shown in Appendix C, the unsigned multiply operation requires only two lines of microcode, as shown in Figure 16, and is executed in 17 microcycles.
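The add-and-shift sequence described above can be followed step by step in a short behavioral model. This Python sketch is an illustration only: `unsigned_multiply` is a hypothetical name, the variables `a`, `b`, and `q` stand for the register at the A address, the register at the B address, and the Q Register, and the slice-to-slice wiring through $SIO$, $QIO$, and the Am2904 is collapsed into ordinary integer shifts.

```python
def unsigned_multiply(multiplicand, multiplier, width=16):
    """Shift-and-add multiply; returns (high half in B, low half in Q)."""
    mask = (1 << width) - 1
    a = multiplicand & mask      # register addressed by the A port
    b = 0                        # partial product, addressed by the B port
    q = multiplier & mask        # multiplier in the Q Register
    for _ in range(width):       # Unsigned Multiply executed 'width' times
        # Z reflects the least significant bit of Q: add only when it is 1.
        total = (b + a) if (q & 1) else b
        carry = total >> width   # Cn+4 of the most significant slice
        total &= mask
        # Shift the chain (carry, partial product, Q) down by one place.
        q = (q >> 1) | ((total & 1) << (width - 1))
        b = (total >> 1) | (carry << (width - 1))
    return b, q
```

After the final iteration every multiplier bit has been shifted out of Q and replaced by a product bit, so the pair (B, Q) holds the full double-length product.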

LQPT R15 \& F \& GRD \& PUSH \& COUNT 00E

## Two's Complement Multiplication

The algorithm for two's complement multiplication is illustrated by Figure 17. The initial conditions for two's complement multiplication are the same as for the unsigned multiply operation. The Two's Complement Multiply command is applied for 15 clock cycles in the case of a $16 \times 16$ multiply. During the down shifting process, the term $N \oplus \mathrm{OVR}$ generated in device 4 is internally shifted into the $Y_{3}$ position of device 4. The data flow shown in Figure 18a is still valid. After 15 cycles, the sign bit of the multiplier is present at the Z output of device 1. At this time, the user must place the Two's Complement Multiply, Last Cycle command on the instruction lines. The interconnection for this instruction is shown in Figure 18b. On the next positive edge of the clock, the Am2903 will adjust the partial product, if the sign of the multiplier is negative, by subtracting out the two's complement representation of the multiplicand. If the sign bit is positive, the partial product is not adjusted. At this point, two's complement multiplication is completed. Using a typical CCU, as shown in Appendix C, the two's complement multiply operation requires only three lines of microcode, as shown in Figure 19, and is executed in 17 microcycles.
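The difference between the first 15 cycles and the last cycle can be seen in a behavioral model. The Python sketch below is an illustration under stated assumptions, not a gate-level description: the names `to_signed` and `twos_complement_multiply` are hypothetical, and the partial product is kept as an ordinary signed integer so that Python's arithmetic right shift supplies the correct sign bit, which is the role played in hardware by $N \oplus \mathrm{OVR}$.

```python
def to_signed(x, bits):
    """Interpret the low 'bits' bits of x as a two's complement number."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def twos_complement_multiply(multiplicand, multiplier, width=16):
    """Return (high half, low half) of the two's complement product."""
    mask = (1 << width) - 1
    a = to_signed(multiplicand, width)   # register addressed by the A port
    b = 0                                # signed partial product (B port)
    q = multiplier & mask                # multiplier in the Q Register
    for cycle in range(width):
        last = cycle == width - 1        # Two's Complement Multiply, Last Cycle
        if q & 1:
            # The final bit examined is the multiplier's sign bit, which
            # carries negative weight, so the multiplicand is subtracted.
            b = b - a if last else b + a
        # Shift the partial product and Q down together by one place.
        q = (q >> 1) | ((b & 1) << (width - 1))
        b >>= 1                          # arithmetic shift keeps the true sign
    return b & mask, q
```

The sum before each shift can need one bit more than the word width, which is why the hardware shifts in $N \oplus \mathrm{OVR}$ rather than N alone: when overflow occurs, N is the complement of the true sign.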

## TWO'S COMPLEMENT DIVISION

The division process is accomplished using a four-quadrant nonrestoring algorithm which yields an algebraically correct answer such that the divisor times the quotient plus the remainder equals the dividend. The algorithm works for both single precision and multi-precision divide operations. The only condition that needs to be met is that the absolute magnitude of the divisor be greater than the absolute magnitude of the dividend. For multi-precision divide operations the least significant bit of the dividend is truncated. This is necessary if the answer is to be algebraically correct. Bias correction is automatically provided by forcing the least significant bit of the quotient to a one, yet an algebraically correct answer is still maintained. Once the algorithm is completed, the answer may be modified to meet the user's format requirements, such as rounding off or converting the remainder so that its sign is the same as the dividend. These format modifications are accomplished using the standard Am2903 instructions.

Figure 17. 2's Complement $16 \times 16$ Multiply.

Figure 18.

Figure 19.

The true value of the remainder is equal to the value stored in the working register times $2^{n-1}$, where $n$ is the number of quotient digits.

The following paragraphs describe a double precision divide operation.

Referring to the flow chart outlined in Figure 20, we begin the algorithm with the assumption that the divisor is contained in $R_{0}$, while the most significant and least significant halves of the dividend reside in $R_{1}$ and $R_{4}$ respectively. The first step is to duplicate the divisor by copying the contents of $R_{0}$ into $R_{3}$. Next the most significant half of the dividend is copied by transferring the contents of $R_{1}$ into $R_{2}$ while simultaneously checking to ascertain if the divisor $\left(R_{0}\right)$ is zero. If the divisor is zero then division is aborted. If the divisor is not zero, the copy of the most significant half of the dividend in $R_{2}$ is converted from its two's complement to its sign magnitude representation. The divisor in $R_{3}$ is converted in like manner in the next step, while testing to see if the results of the dividend conversion yielded an indication on the overflow pin of the Am2903. If the output of the overflow pin is a one then the dividend is $-2^{n}$ and hence is the largest possible number, meaning that it cannot be less than the divisor. What must be done in this case is to scale the dividend by down shifting the upper and lower halves stored in $R_{1}$ and $R_{4}$ respectively. After scaling, the routine requires that the algorithm be reinitiated at the beginning.

Conversely, if the output of the overflow pin is not a one, the sign magnitude representation of the divisor $\left(R_{3}\right)$ is shifted up in the Am2903, removing the sign while at the same time testing the results of the two's complement to sign magnitude conversion of the divisor in the Am2910. If the results of the test indicate that the divisor is $-2^{n}$, i.e., overflow equals one, then the lower half of the dividend is placed in the Q register and division may proceed. This is possible because the divisor is now guaranteed to be greater than the dividend. If overflow is not a one then we must proceed by shifting out the sign of the sign magnitude representation of the dividend stored in $R_{2}$. At this point we are able to check if the divisor is greater than the dividend by subtracting the absolute value of the divisor $\left(R_{3}\right)$ from the absolute value of the upper half of the dividend $\left(R_{2}\right)$ and storing the results in $R_{3}$. Next, the least significant half of the dividend is transferred from $R_{4}$ to the Q register while simultaneously testing the carry from the result of the divisor/dividend subtraction. If the carry $\left(C_{n+4}\right)$ is one, indicating the divisor is not greater than the dividend, then a scaling operation must occur. This involves either shifting up the divisor or shifting down the dividend. If the carry is not one then the divisor is greater than the dividend and division may now begin.

Figure 20. Two's Complement Division.

The first divide operation is used to ascertain the sign bit of the quotient. The two's complement divide instruction is then executed repetitively, fourteen times in the case of a sixteen bit divisor and a thirty-two bit dividend. The final step is the two's complement correction command which adjusts the quotient by allowing the least significant bit of the quotient to be set to one. At the end of the division algorithm the sixteen bit quotient is found in the Q register while the remainder now replaces the most significant half of the dividend in $R_{1}$. It should be noted that the remainder must be shifted down fifteen places to represent its true value. The interconnections for these instructions are shown in Figures 21, 22 and 23. Using a typical CCU as shown in Appendix C, the double precision divide operation microcode is shown in Figure 24.
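The general idea of four-quadrant nonrestoring division can be sketched on plain integers. The following Python model is a generic textbook form, not the exact Am2903 instruction sequence: the function name, the integer (rather than fractional) scaling, and the parameter `n` for the number of quotient digits are all assumptions made for illustration. Each quotient digit is +1 or -1, so the quotient always comes out odd, which corresponds to the least significant quotient bit being forced to a one.

```python
def nonrestoring_divide(dividend, divisor, n):
    """Return (quotient, remainder) built from n quotient digits of +1 or -1."""
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero; division is aborted")
    if abs(dividend) >= abs(divisor) << n:
        raise ValueError("dividend too large; operands must be scaled first")
    remainder = dividend
    quotient = 0
    for i in range(n - 1, -1, -1):
        # Compare signs (zero is treated as positive, like a clear sign bit).
        if (remainder < 0) == (divisor < 0):
            remainder -= divisor << i     # signs agree: subtract, digit +1
            quotient += 1 << i
        else:
            remainder += divisor << i     # signs differ: add, digit -1
            quotient -= 1 << i
    # The remainder is never restored, so its sign may differ from the dividend.
    return quotient, remainder
```

With `nonrestoring_divide(100, 7, 4)` the result is a quotient of 15 and a remainder of -5; the divisor times the quotient plus the remainder still equals the dividend, even though the remainder sign differs from that of the dividend.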

For those applications that require truncation instead of bias correction, the same algorithm as above should be implemented except one additional Two's Complement Divide instruction should be used in lieu of the Two's Complement Divide Correction and Remainder instruction. However, this technique results in an invalid remainder.


Figure 21. Double Length Normalize/First Divide Operation.


Figure 22. 2's Complement Divide.


Figure 23. 2's Complement Divide Correction.

| 0119 | DIV | LOW R10 \& JSR \& GOTO INP |
| :---: | :---: | :---: |
| 011A |  | PAR R7,R15 \& JSR \& GOTO INP |
| 011B |  | PAR R1,R15, \& JSR \& GOTO INP |
| 011C |  | PAR R4,R15 \& CONT |
| 011D | LOOP1 | PAR R3,R7 \& CONT |
| 011E |  | PAR R2,R1 \& T \& MIZ \& CJP \& GOTO ABORT |
| 011F |  | SMTC R2,R2 \& CONT Z |
| 0120 |  | SMTC R3,R3 \& T \& MIO \& CJP CZ \& GOTO SCALE1 |
| 0121 |  | ALUOFF \& T \& MIO \& CJP \& GOTO SKIP6 |
| 0122 |  | SURL R3,R3 \& SUL \& CONT |
| 0123 |  | SURL R2,R2 \& SUL \& CONT |
| 0124 |  | ALUOFF \& JP \& GOTO LOOP2 |
| 0125 | SCALE1 | LQPT R4 \& JSR \& GOTO SDIVD |
| 0126 |  | ALUOFF \& JP LOOP1 |
| 0127 | LOOP2 | SSR R3,R2,YBUS \& CONT ONE |
| 0128 | SKIP6 | LQPT R4 \& F \& MIC \& CJP \& GOTO SKIP3 |
| 0129 |  | ALUOFF \& JSR \& GOTO SDIVD |
| 012A |  | SURL R2,R2 \& SDL \& CONT |
| 012B |  | ALUOFF \& JP \& GOTO LOOP2 |
| 012C | SKIP3 | ALUOFF \& F \& GRD \& LDCT \& COUNT 00C |
| 012D |  | DLN R1,R1,R7 \& T \& GRD \& SDUL \& PUSH |
| 012E |  | TDIV R1,R1,R7 \& F \& CNT- \& SDUL \& RFCT CZ |
| 012F |  | TDC R1,R1,R7 \& SUH \& CONT CZ |
| 0130 |  | QMOV R15 \& JSR \& GOTO OUTP |
| 0131 |  | PAR R15,R1 \& JSR \& GOTO OUTP |
| 0132 |  | ALUOFF \& JP \& GOTO DIV |
| 0133 | SDIVD | PAR R1,R1 \& CONT |
| 0134 |  | ALUOFF \& T \& MIS \& CJP \& GOTO NEG |
| 0135 |  | PAR R1,R1,ADRQ \& SDDL \& CONT |
| 0136 |  | ALUOFF \& JP \& GOTO RET |
| 0137 | NEG | PAR R1,R1,ADRQ \& SDDL \& CONT |
| 0138 | RET | QMOV R4 \& CONT |
| 0139 |  | PAR R10,R10 \& RTN ONE |

Figure 24.

## NON-RESTORING BINARY ROOTS

The algorithm for Non-Restoring Binary Roots is illustrated in Figure 25. The initial conditions required are: 1) the non-negative number to be rooted in the radicand register, $R_{1}$; 2) $R_{2}$ has the positive append bits $101_{B}$; 3) $R_{3}$ has the negative append bits $011_{B}$; 4) $R_{4}$ is the mask register with $\mathrm{BFFF}_{H}$; 5) $R_{5}$ is the partial register with $4000_{H}$; and 6) the counter register, $R_{6}$, with the value $08_{H}$.
An example of the Non-Restoring Binary Root algorithm is shown in Figure 26. Starting at the binary point, the number to be rooted is partitioned into pairs. The partial value is subtracted from the first pair. An intermediate remainder and sign are then produced.


Figure 25. Non-Restoring Binary Root.


Figure 26. Non-Restoring Binary Root Example.

If the remainder is positive, a 1 is entered in the corresponding root bit. Then a 01 is appended to the partial, shifted and subtracted from the present remainder to produce the next remainder. When the remainder becomes negative, the present remainder is not restored. A 0 is entered in the next corresponding root bit. Then an 11 is appended to the partial, shifted and added to the present remainder. The entire process is repeated until the partial root has developed into 8 bits or the remainder is zero.
Referring to Figure 26, the same method of finding the root applies. A starting partial value, $R_{5}$, is subtracted from the radicand, $R_{1}$, which produces the intermediate remainder $R_{0}$. During this time, the sign of the remainder is stored within the Am2904. Then $R_{5}$ is masked by $R_{4}$ to obtain the next partial value and $R_{4}$ is shifted to obtain a new mask for the next cycle. Status is obtained from the Am2904 and tested. If the remainder is positive, a root bit of 1 is developed and bits 01 appended by $R_{2}$. When negative, a root bit of 0 is developed and bits 11 appended by $R_{3}$. At this point $R_{6}$ is decremented and tested for zero. If $R_{6} \neq 0$, then addition or subtraction is performed on the remainder depending on the sign bit stored in the Am2904. A new remainder is produced and cycled through the procedure again. Figure 27 illustrates the microcode.
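The append-01-and-subtract, append-11-and-add rule can be sketched in Python as follows. This is a minimal integer model, not a transcription of the microcode: the function name is hypothetical, the mask and append registers are folded into ordinary arithmetic, and the loop always runs all eight iterations instead of exiting early on a zero remainder. A final restore step is added only so that the returned remainder is non-negative.

```python
def nonrestoring_sqrt(radicand, bits=16):
    """Return (root, remainder) with root*root + remainder == radicand."""
    root = 0
    remainder = 0
    for i in range(bits // 2 - 1, -1, -1):
        pair = (radicand >> (2 * i)) & 0b11   # next pair of radicand bits
        if remainder >= 0:
            # Positive remainder: append 01 to the partial root and subtract.
            remainder = remainder * 4 + pair - ((root << 2) | 0b01)
        else:
            # Negative remainder: do not restore; append 11 and add.
            remainder = remainder * 4 + pair + ((root << 2) | 0b11)
        # The sign of the new remainder selects the next root bit.
        root = (root << 1) | (1 if remainder >= 0 else 0)
    if remainder < 0:
        remainder += (root << 1) | 1          # single restore at the very end
    return root, remainder
```

For a 16-bit radicand the loop develops an 8-bit root, matching the counter value of 8 used in the initial conditions.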

## BCD HARDWARE ADDITIONS

In applications where fast BCD operations are needed the designer has the option of using a slight amount of additional hardware to dramatically increase the performance of these operations. These firmware/hardware trade-offs are very application sensitive. The hardware-firmware examples given below are specifically for an intensive BCD system with a large fraction of conventional logic-arithmetic operations. The designer is assumed to be willing to lengthen the cycle time slightly to increase BCD throughput. Small hardware additions are acceptable as long as flexibility is retained.

| 0152 | SQRT | LOW R10 \& CONT |
| :--- | :--- | :--- |
| 0153 |  | LOW R0 \& CONT |
| 0154 |  | PAR R1,R15 \& CONT |
| 0155 |  | PAR R2,R0,,DARB \& CONST 0005 \& CONT |
| 0156 |  | PAR R3,R0,,DARB \& CONST 0003 \& CONT |
| 0157 |  | PAR R4,R0,,DARB \& CONST H \# BFFF \& CONT |
| 0158 |  | PAR R5,R0,,DARB \& CONST 4000 \& CONT |
| 0159 |  | PAR R6,R0,,DARB \& CONST 0008 \& CONT |
| 015A |  | SRS R0,R1,R5 \& CONT \& SHOLD |
| 015B | CYCLE | AND R5,R5,R4 \& CONT |
| 015C |  | SDRL R4,R4 \& MAS \& CJP \& GOTO END3 |
| 015D |  | SDRL R0,R0, \& T \& MAS \& CJP \& GOTO POS |
| 015E |  | OR R5,R3 \& JP \& GOTO CNT |
| 015F | POS | OR R5,R2 \& CONT |
| 0160 | CNT | SRS R6,R6,R10 \& CONT |
| 0161 |  | SDRL R2,R2, \& T \& MIZ \& CJP \& GOTO END3 |
| 0162 |  | SDRL R3,R3 \& T \& MAS \& CJP \& GOTO SUB |
| 0163 |  | ADD R0,R0,R5 \& JP \& GOTO CYCLE \& SHOLD |
| 0164 | SUB | SRS R0,R0,R5 \& JP \& GOTO CYCLE \& SHOLD |
| 0165 | END3 | JP \& GOTO SQRT |

Figure 27.

The hardware additions finally decided on were chosen to increase the performance of BCD to binary conversion, binary to BCD conversion and BCD addition. The performance increases were approximately an order of magnitude in the first two cases, and a factor of 4 or 5 in the last case. A diagram of the additions (3 1/4 ICs) is given in Figure 28.
The 74S08 AND gates normally pass the carry from the Am2902A to the Am2903s. When microbit CZER is low the carries-in are forced to zero. This is used to "disconnect" the carry so that a test may be done on each slice simultaneously. For example, if a test for 5 or greater is desired, a hex B is added and the carry out of each slice will indicate the result of the test. This allows simultaneous tests on each individual slice and greatly increases throughput. This addition increases the performance of BCD to binary conversion and binary to BCD conversion by at least an order of magnitude. The drawback to this addition is that the AND gates introduce an extra gate delay in a critical path. The machine cycle time may be increased by about 8 ns. The increase in BCD performance will more than offset this delay for BCD intensive systems.

Figure 28.
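The per-slice test with the carries disconnected is easy to picture in software. The short Python sketch below is illustrative only; the function name and the list-of-carries return format are assumptions, not part of the hardware description. It adds a constant to each 4-bit slice independently and reports the carry out of each slice.

```python
def slice_carries(word, constant, slices=4):
    """Carry out of each 4-bit slice when constant is added with carries-in = 0.

    Element 0 of the result belongs to the least significant slice.
    """
    carries = []
    for i in range(slices):
        digit = (word >> (4 * i)) & 0xF
        addend = (constant >> (4 * i)) & 0xF
        carries.append((digit + addend) >> 4)   # 1 if the slice sum exceeds 15
    return carries
```

Adding hex BBBB tests every digit for 5 or greater at once, so `slice_carries(0x4590, 0xBBBB)` flags only the digits 9 and 5. Adding 0888 tests the 8-bit position of the three low digits in the same way.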

Another hardware addition is the Am25LS241 three-state buffer. This buffer allows the Am2904 to be used to store the carry-out status bits via the bi-directional Y bus.

The 25LS195A is wired as a 4-bit register with clear and enable. This register is used to store the carry-out bits from a test cycle. The outputs of the 25LS195A are ORed with the output of the Am2904 Y-bus and connected to the Am2909 OR inputs in the CCU. This allows a multi-way branch on the OR of two test cycles, greatly increasing the performance of BCD addition.

## BCD TO BINARY CONVERSION

The usual method of BCD to binary conversion is to divide the BCD number by 2. The 1-bit remainder will indicate if a 1 existed in the BCD number. The previous division result is divided by 2 again and the remainder will indicate if a 2 existed in the BCD number. In general the remainder from a division by $2^{n}$ will indicate if a $2^{n-1}$ existed in the BCD number.

These remainders can be used to construct the binary representation, $b_{n} 2^{n}+b_{n-1} 2^{n-1}+b_{n-2} 2^{n-2}+\ldots+b_{1} 2^{1}+b_{0} 2^{0}$. The $b_{n}$ bit is thus the remainder from division step $n+1$. The binary representation may thus be created by shifting the remainders down until the $m$-bit BCD number has been divided by 2 $m$ times.
To divide a BCD number by 2 a down shift is executed. The 4, 2 and 1-bit positions will contain the correct result, but the 8-bit position is incorrect. Its value has changed from 10 to 8 instead of from 10 to 5. This means the resulting BCD number will have a value 3 greater than it should for the division by 2 to be correct. A 3 must be subtracted from any digit in which a 1 entered its 8-bit position.
A sample conversion is given in Table 10. The BCD number is gradually shifted down and corrected when necessary. The binary number is finally correct after 16 cycles.

A flow diagram for the algorithm is given in Figure 29. The BCD input, A, is shifted down into the binary output, B, to start the loop. The constant 0888 is added to A with the carries-in forced to zero. The resulting carries-out will indicate if A contained a 1 in any of the 8-bit positions. These carries are saved in status register SR1. A multi-way branch is then executed to enter the adjust table. The digits are adjusted depending on the previous test. At the same time a shift can be executed to prepare for the next test instruction. A test for end of loop is also done in this cycle to provide an exit if 16 iterations of the loop are complete. Finally a shift up of B is needed to cancel the extra right shift when the loop is exited. The microcode for this algorithm is given in Figure 30.
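The shift-down and subtract-3 procedure can be modeled directly. The Python sketch below is a software illustration under simple assumptions: the function name is hypothetical, the digit test is written as an explicit bit test instead of the add-0888 carry test, and the multi-way branch into the adjust table is replaced by a loop over the digits.

```python
def bcd_to_binary(bcd, digits=4):
    """Convert a packed BCD word to binary by repeated halving."""
    binary = 0
    for i in range(4 * digits):
        binary |= (bcd & 1) << i        # remainder of this division by 2
        bcd >>= 1                       # shift down: divide the BCD word by 2
        for d in range(digits):
            if bcd & (0x8 << (4 * d)):  # a 1 entered this digit's 8-bit position
                bcd -= 0x3 << (4 * d)   # it should be worth 5, not 8
    return binary
```

Running it on the BCD word 2904 passes through the same intermediate values as Table 10 and ends with the binary value 0000101101011000.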

TABLE 10.

| Digit 3 | Digit 2 | Digit 1 | Digit 0 | BCD $\rightarrow$ Binary Result | Operation | Digits Adjusted |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0010 | 1001 | 0000 | 0100 |  |  |  |
| 0001 | 0100 | 1000 | 0010 | 0 | SHIFT |  |
| 0001 | 0100 | 0101 | 0010 |  | ADJUST | DIGIT 1 |
| 0000 | 1010 | 0010 | 1001 | 00 | SHIFT |  |
| 0000 | 0111 | 0010 | 0110 |  | ADJUST | DIGITS 2, 0 |
| 0000 | 0011 | 1001 | 0011 | 000 | SHIFT |  |
| 0000 | 0011 | 0110 | 0011 |  | ADJUST | DIGIT 1 |
| 0000 | 0001 | 1011 | 0001 | 1000 | SHIFT |  |
| 0000 | 0001 | 1000 | 0001 |  | ADJUST | DIGIT 1 |
| 0000 | 0000 | 1100 | 0000 | 11000 | SHIFT |  |
| 0000 | 0000 | 1001 | 0000 |  | ADJUST | DIGIT 1 |
| 0000 | 0000 | 0100 | 1000 | 011000 | SHIFT |  |
| 0000 | 0000 | 0100 | 0101 |  | ADJUST | DIGIT 0 |
| 0000 | 0000 | 0010 | 0010 | 1011000 | SHIFT |  |
| 0000 | 0000 | 0010 | 0010 |  | ADJUST | NONE |
| 0000 | 0000 | 0001 | 0001 | 01011000 | SHIFT |  |
| 0000 | 0000 | 0001 | 0001 |  | ADJUST | NONE |
| 0000 | 0000 | 0000 | 1000 | 101011000 | SHIFT |  |
| 0000 | 0000 | 0000 | 0101 |  | ADJUST | DIGIT 0 |
| 0000 | 0000 | 0000 | 0010 | 1101011000 | SHIFT |  |
| 0000 | 0000 | 0000 | 0010 |  | ADJUST | NONE |
| 0000 | 0000 | 0000 | 0001 | 01101011000 | SHIFT |  |
|  |  |  | 0001 |  | ADJUST | NONE |
|  |  |  | 0000 | 101101011000 | SHIFT |  |
|  |  |  | 0000 |  | ADJUST | NONE |
|  |  |  | 000 | 0101101011000 | SHIFT |  |
|  |  |  | 000 |  | ADJUST | NONE |
|  |  |  | 00 | 00101101011000 | SHIFT |  |
|  |  |  | 00 |  | ADJUST | NONE |
|  |  |  | 0 | 000101101011000 | SHIFT |  |
|  |  |  | 0 |  | ADJUST | NONE |
|  |  |  |  | 0000101101011000 | SHIFT |  |
|  |  |  |  |  | ADJUST | NONE |



Figure 29. BCD to Binary Conversion (16 Bits to 14 Bits).

## BINARY TO BCD CONVERSION

A method very similar to the one used for BCD to binary conversion may be used for binary to BCD conversion. The BCD number is created by shifting the binary number up, into a partial BCD result. The BCD number is adjusted to provide a multiplication by 2. The shift-adjust process continues until the least significant binary bit is shifted into the BCD result.

The adjustment is needed when a 1 is shifted from the 8-bit position to the 1-bit position of the next digit. The value has increased from 8 to 10, instead of from 8 to 16. To correct this a 6 must be added to any digit that has a 1 shifted out of its 8-bit position. Alternately a 3 could be added before the shift to any digit that has a 1 in its 8-bit position.

Another correction is needed whenever an invalid BCD digit is encountered. If a number greater than 9 is detected in any digit a 10 must be subtracted from that digit and a 1 added to the next highest digit. The same correction can be accomplished if a 6 is added to the invalid digit after the shift. To correct before the shift a 3 is added to any digit which contains a 5, 6 or 7. These adjustments are summarized in Table 11. Both adjustments may be accomplished by adding a 3 to any digit which is greater than 4.

Table 12 shows an example conversion. The binary number is gradually shifted up and the BCD partial result adjusted. After 14 iterations the conversion is complete. A flow diagram for the algorithm is given in Figure 31.

A := R0

B := Q

LOOP:

EXIT:

ENR \& COUNT LOOP \& CONT<br>PAS R0,R0 LDRQ \& SDDL \& LDCT \& CONST 15

Figure 30.

TABLE 11.
| Present \# | Adjustment Before Shift | Reason |
| :---: | :---: | :---: |
| 0000 | NONE | - |
| 0001 | NONE | - |
| 0010 | NONE | - |
| 0011 | NONE | - |
| 0100 | NONE | - |
| 0101 | +3 |  |
| 0110 | +3 |  |
| 0111 | +3 |  |
| 1000 | +3 |  |
| 1001 | +3 |  |
| 1010 | +3 | Illegal BCD |
| 1011 | +3 | Illegal BCD |
| 1100 | +3 | Illegal BCD |
| 1101 | +3 | Illegal BCD |
| 1110 | +3 | Illegal BCD |
| 1111 | +3 | Illegal BCD |

Initially the 14-bit binary number is left justified by two shift up operations. To start the loop the binary input, B, is shifted up, into the partial BCD result, A. The constant BBBB is added to $A$, with the carries-in forced to zero. The resulting carries-out are stored in status register SR1. A multi-way branch is used to enter the adjust table. The digits are adjusted depending on the result of the previous test. In the same instruction a shift is executed to prepare for the next test cycle. Additionally an end of loop test is used to provide an exit if 16 iterations of the loop are complete. Before the exit a fix-up cycle is used to cancel the extra shift executed in the loop. The microcode for this algorithm is given in Figure 32.
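The add-3-then-shift rule summarized in Table 11 can be sketched in Python. As before, this is an illustration under stated assumptions: the function name is hypothetical, the greater-than-4 test is written as a comparison in place of the add-BBBB carry test, and the left justification and fix-up cycles of the microcode are replaced by simply consuming the binary bits from the most significant end.

```python
def binary_to_bcd(value, bits=14, digits=4):
    """Convert a binary number to packed BCD by adjust-and-shift-up."""
    if not 0 <= value < 10 ** digits or value >> bits:
        raise ValueError("value does not fit the chosen bit and digit widths")
    bcd = 0
    for i in range(bits - 1, -1, -1):
        for d in range(digits):
            if ((bcd >> (4 * d)) & 0xF) > 4:   # digit is 5 or greater
                bcd += 0x3 << (4 * d)          # add 3 before the shift
        # Shift the BCD word up one place and bring in the next binary bit.
        bcd = (bcd << 1) | ((value >> i) & 1)
    return bcd
```

Converting the 14-bit value 00101101011000 reproduces the digit patterns of Table 12 and finishes with the BCD digits 2, 9, 0 and 4.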

Figure 31. Binary to BCD Conversion (14 Bits to 16 Bits).

Q := Binary Input

R0 := BCD Result

Figure 32. Binary to BCD Conversion Microcode (14 Bits to 16 Bits).

TABLE 12.

| Result Digit 3 | Result Digit 2 | Result Digit 1 | Result Digit 0 | Binary $\rightarrow$ BCD Conversion | Operation | Digits Adjusted |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  | 00101101011000 |  |  |
|  |  |  | 0 | 0101101011000 | SHIFT |  |
|  |  |  | 0 |  | ADJUST | NONE |
|  |  |  | 00 | 101101011000 | SHIFT |  |
|  |  |  | 00 |  | ADJUST | NONE |
|  |  |  | 001 | 01101011000 | SHIFT |  |
|  |  |  | 001 |  | ADJUST | NONE |
|  |  |  | 0010 | 1101011000 | SHIFT |  |
|  |  |  | 0010 |  | ADJUST | NONE |
|  |  | 0 | 0101 | 101011000 | SHIFT |  |
|  |  | 0 | 1000 |  | ADJUST | DIGIT 0 |
|  |  | 01 | 0001 | 01011000 | SHIFT |  |
|  |  | 01 | 0001 |  | ADJUST | NONE |
|  |  | 010 | 0010 | 1011000 | SHIFT |  |
|  |  | 010 | 0010 |  | ADJUST | NONE |
|  |  | 0100 | 0101 | 011000 | SHIFT |  |
|  |  | 0100 | 1000 |  | ADJUST | DIGIT 0 |
|  | 0 | 1001 | 0000 | 11000 | SHIFT |  |
|  | 0 | 1100 | 0000 |  | ADJUST | DIGIT 1 |
|  | 01 | 1000 | 0001 | 1000 | SHIFT |  |
|  | 01 | 1011 | 0001 |  | ADJUST | DIGIT 1 |
|  | 011 | 0110 | 0011 | 000 | SHIFT |  |
|  | 011 | 1001 | 0011 |  | ADJUST | DIGIT 1 |
|  | 0111 | 0010 | 0110 | 00 | SHIFT |  |
|  | 1010 | 0010 | 1001 |  | ADJUST | DIGITS 2, 0 |
| 1 | 0100 | 0101 | 0010 | 0 | SHIFT |  |
| 1 | 0100 | 1000 | 0010 |  | ADJUST | DIGIT 1 |
| 10 | 1001 | 0000 | 0100 |  | SHIFT |  |
| 10 | 1001 | 0000 | 0100 |  | ADJUST | NONE |
| 2 | 9 | 0 | 4 |  |  |  |

## BCD ADD

One method of performing a 4-digit BCD add is to do a 16-bit binary add, with the carries-in forced to zero, and adjust the resulting sum. The adjustments are necessary to change invalid BCD digits to valid BCD digits. When an invalid digit is modified a carry to the next highest digit is generated. This could cause a previously valid digit to become invalid. The word must be checked and modified until all digits are valid (up to four modification cycles could be necessary).

Initially the two BCD numbers are added with the carries-in to each digit forced to zero. The carries out are saved. Next the hex number 6666 is added to the sum, with the carries-in forced to zero, and the resulting carries out are saved. This tests each digit for validity, a carry-out indicating an invalid BCD digit (greater than 9). If a carry was generated in either cycle a 6 is added to the invalid digit, with carries-in forced to zero, to create the valid BCD digit. Additionally a 1 must be added to the next highest digit to provide the BCD carry-out. Each time a digit is adjusted the carry-out may invalidate the next highest digit. Thus adjustment cycles must be followed by validity tests until all digits are valid. A flow diagram for this algorithm is given in Figure 33. The microcode for this algorithm is given in Figure 34.

Figure 33. BCD Add.


Figure 34. BCD Add Microcode.
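The test-and-adjust loop of the BCD add can be followed in a short Python model. This is a sketch of the procedure described in the text, not of the Figure 34 microcode: the helper names are hypothetical, each 4-bit slice is added separately to imitate the forced-zero carries-in, and the saved carry bits are kept in ordinary lists instead of the status registers.

```python
def _split(word):
    """Four BCD digits of a 16-bit word, least significant first."""
    return [(word >> (4 * i)) & 0xF for i in range(4)]

def _join(digits):
    return sum(d << (4 * i) for i, d in enumerate(digits))

def slice_add(a, b):
    """Add digit by digit with every carry-in forced to zero."""
    sums = [x + y for x, y in zip(_split(a), _split(b))]
    return _join([s & 0xF for s in sums]), [s >> 4 for s in sums]

def bcd_add(a, b):
    """Return (4-digit BCD sum, carry out of the top digit)."""
    total, first = slice_add(a, b)          # binary add, carries saved
    carry_out = 0
    while True:
        _, test = slice_add(total, 0x6666)  # validity test: digit > 9 carries
        bad = [x | y for x, y in zip(first, test)]
        if not any(bad):
            return total, carry_out
        carry_out |= bad[3]
        total, _ = slice_add(total, _join([6 if f else 0 for f in bad]))
        # Add 1 to the digit above each adjusted digit; this may create a
        # new invalid digit, which the next pass of the loop will catch.
        total, first = slice_add(total, _join([0] + bad[:3]))
```

The worst case is a carry that ripples through every digit, as in `bcd_add(0x9999, 0x0001)`, which needs four adjustment passes before all digits are valid.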

## SUMMARY

In this chapter, a detailed description of the Am2904 was presented, along with an example timing analysis. Several microcode algorithms were discussed to show how the Am2904 operates in an Am2903-based CPU. As can be seen, the Am2904 provides a powerful, single-chip LSI solution to the shift multiplexer, status register, and carry multiplexer design portion of a CPU using either the Am2901B or the Am2903.

The Appendix includes a full microcode listing. The interested reader is encouraged to study these listings to gain a better understanding of the hardware organization (Appendix C). An additional microcode listing (Appendix B) gives the AMDASM ${ }^{\text {TM }}$ definition file and source file for the microcode. The reader should study these listings while referring to the AMDASM Manual. (The Am2900 Family Data Book contains an AMDASM Reference Manual, document AM-PUB003, 4-78 FRODO.)










## APPENDIX B

AMDOS /29 AMDASM MICRO ASSEMBLER, V1.1 CPUII DEFINITIONS

```
; ADVANCED MICRO DEVICES
; AM2903 AND AM2904 DEFINITION FILE FOR CPUII
;
; REV. OCTOBER 17, 1978
```

```
WORD 90
```

; equates

```
MEM: EQU H#F
SPF: EQU H#0
OFF: EQU H#1
```

; 2903 DESTINATION MODIFIERS
ADR: EQU H\#0
LDR: EQU H\#1
ADRQ: EQU H\#2
LDRQ: EQU H\#3
RPT: EQU H\#4
LDQP: EQU H\#5
QPT: EQU H\#6
RQPT: EQU H\#7
AUR: EQU H\#8
LUR: EQU H\#9
AURQ: EQU H\#A
LURQ: EQU H\#B
YBUS: EQU H\#C
LUQ: EQU H\#D
SINX: EQU H\#E
; CONSTANTS

| R0: | EQU H\#0 |
| :---: | :---: |
| R1: | EQU H\#1 |
| R2: | EQU H\#2 |
| R3: | EQU H\#3 |
| R4: | EQU H\#4 |
| R5: | EQU H\#5 |
| R6: | EQU H\#6 |
| R7: | EQU H\#7 |
| R8: | EQU H\#8 |
| R9: | EQU H\#9 |
| R10: | EQU H\#A |
| R11: | EQU H\#B |
| R12: | EQU H\#C |
| R13: | EQU H\#D |
| R14: | EQU H\#E |
| R15: | EQU H\#F |


; 2903 SOURCE MODIFIERS

RADB: EQU 3B\#001
RAQ: EQU 3B\#010
DARB: EQU 3B\#100
DADB: EQU 3B\#101
DAQ: EQU 3B\#110
; I/O

| ICIN: | EQU 12H\#01 |
| :---: | :---: |
| BIN: | EQU 12H\#10 |
| BOUT: | EQU 12H\#08 |
| LMAR: | EQU 12H\#10 |
| YREG: | EQU 12H\#02 |
| AOUT: | EQU 12H\#40 |
| IOUT: | EQU 12H\#04 |

; CARRY SELECT

```
ONE: EQU 2B#01
CZ: EQU 2B#10
```

;SUB DEFINITIONS

```
SUB0: SUB 36X,1B#0,4VX,4VX,4VX
SUB1: SUB 36X,1B#0,4VX,4VX,4VX,4VH#F
SUB2: SUB 36X,1B#0,4VX,4VX,4X,4VH#F
SUB3: SUB 3VB#000,16X,1B#0,13X
SUB4: SUB 36X,1B#0,12X
SUB5: SUB 44X,1B#0,15X
SUB6: SUB 44X,1B#0,15X
SUB7: SUB 26X
SUB8: SUB 36X,1B#0,4VX,8X,4VH#F
SUB9: SUB 36X,1B#0,4VX,4X,4VX,4VH#F
SUB10: SUB 36X,1B#0,4VX,4VX,4X
SUB11: SUB 24X,2VB#00,34X,4B#0000,1B#1,5X
SUB12: SUB 77X,1B#1,12VXH#0%
SUB13: SUB SPF,3VB#000,16X,1B#0,13X
SUB14: SUB 24X,2VB#00,34X,4B#0000,2B#10
SUB15: SUB 23X,1B#0,6X
SUB16: SUB SPF,3B#000,16X,1VB#0,13X
SUB17: SUB 54X
SUB18: SUB 22X,1B#0,7X
SUB19: SUB 16X,1B#0,13X
SUB20: SUB 1X,1VB#0,14X
SUB21: SUB 30X,H#B,20X
```

```
ACK: DEF 66X,H#9,20X
OBF: DEF 66X,H#A,20X
CNT: DEF 66X,H#F,20X
GRD: DEF 66X,H#0,20X
JZ: DEF SUB11,H#0,SUB20
CJS: DEF SUB11,H#1,SUB20
JMAP: DEF SUB11,H#2,SUB20
CJP: DEF SUB11,H#3,SUB20
PUSH: DEF SUB11,H#4,SUB20
JSRP: DEF SUB11,H#5,SUB20
CJV: DEF SUB11,H#6,SUB20
JRP: DEF SUB11,H#7,SUB20
RFCT: DEF SUB11,H#8,SUB20
RPCT: DEF SUB11,H#9,SUB20
CRTN: DEF SUB11,H#A,SUB20
CJPP: DEF SUB11,H#B,SUB20
LDCT: DEF SUB11,H#C,SUB20
LOOP: DEF SUB11,H#D,SUB20
CONT: DEF SUB11,H#E,SUB20
JP: DEF SUB11,H#F,SUB20
JSR: DEF SUB14,H#01,SUB20
RTN: DEF SUB14,H#0A,SUB20
; SHARED CONTROL FIELD
GOTO: DEF SUB12
COUNT: DEF SUB12
PUT: DEF 77X,1B#0,12VXH#0%
; POLARITY CONTROL
T: DEF 65X,1B#1,24X
F: DEF 65X,1B#0,24X
; 2903 CONTROL/FUNCTIONS
IN: DEF 36X,1B#1,H#F,8X,H#F,H#0,19X,1B#0,13X
OUT: DEF 36X,1B#0,8X,H#F,H#C,H#6,SUB3
YOFF: DEF 36X,1B#1,53X
HIGH: DEF SUB8,H#0,3B#010,SUB19
SRS: DEF SUB1,H#1,SUB3
SSR: DEF SUB1,H#2,SUB3
ADD: DEF SUB1,H#3,SUB3
PAS: DEF SUB2,H#4,SUB3
COMS: DEF SUB2,H#5,SUB3
PAR: DEF SUB9,H#6,SUB3
COMR: DEF SUB9,H#7,SUB3
LOW: DEF SUB8,H#8,3X,SUB19
CRAS: DEF SUB1,H#9,SUB3
XNRS: DEF SUB1,H#A,SUB3
XOR: DEF SUB1,H#B,SUB3
AND: DEF SUB1,H#C,SUB3
NOR: DEF SUB1,H#D,SUB3
NAND: DEF SUB1,H#E,SUB3
OR: DEF SUB1,H#F,SUB3
; 2903 SPECIAL FUNCTIONS
```


UMUL: DEF SUB0,H\#0,SUB16
TCM: DEF SUB0,H\#2,SUB16
SMTC: DEF SUB10,H\#5,SUB16
TCMC: DEF SUB0,H\#6,SUB16
SLN: DEF SUB10,H\#8,SUB16
DLN: DEF SUB0,H\#A,SUB16
TDIV: DEF SUB0,H\#C,SUB16
TDC: DEF SUB0,H\#E,SUB16
INC: DEF SUB10,H\#4,SUB16
SDQP: DEF SUB4,H\#5,4X,SUB3
SUQP: DEF SUB4,H\#D,4X,SUB3
LQPT: DEF 36X,1B\#0,8X,4VX,H\#6,H\#6,SUB3
RMOV: DEF SUB2,H\#4,SUB3
QMOV: DEF 36X,1B\#0,4VX,8X,MEM,H\#4,3B\#010,SUB19
SDRL: DEF SUB10,H\#1,H\#4,SUB3
SURL: DEF SUB10,H\#9,H\#4,SUB3

```
; 2904 SHIFT CONTROL
SDDH: DEF SUB7,H#3,SUB6
SDUH: DEF SUB7,H#7,SUB5
SDDL: DEF SUB7,H#6,SUB6
SDUL: DEF SUB7,H#6,SUB5
FDD: DEF SUB7,H#F,SUB6
FDU: DEF SUB7,H#F,SUB5
SSXO: DEF SUB7,H#E,SUB6
RSD: DEF SUB7,H#A,SUB6
RSU: DEF SUB7,H#A,SUB5
SUL: DEF SUB7,H#2,SUB5
SUH: DEF SUB7,H#3,SUB5
SDL: DEF SUB7,H#0,SUB6
SDH: DEF SUB7,H#1,SUB6
SDMS: DEF SUB7,H#5,SUB6
SMS: DEF SUB7,H#2,SUB6
SDDC: DEF SUB7,H#7,SUB6
SDUC: DEF SUB7,H#4,SUB5
; 2904 MICRO INSTRUCTION CODES
RSTI: DEF 30X,6B#000011,SUB17
SWAP: DEF 30X,6B#000010,SUB17
SHLD: EQU 1B#1
; 2904 MACHINE INSTRUCTION CODES
LMA: DEF SUB15,6B#000000,SUB17
RSTA: DEF SUB15,6B#000011,SUB17
SHOLD: DEF 23X,1B#0,66X
```

```
MIZ: DEF SUB18,6B#010100,SUB21
MIO: DEF SUB18,6B#010110,SUB21
MIC: DEF SUB18,6B#011010,SUB21
MIS: DEF SUB18,6B#011110,SUB21
;2904 MACHINE STATUS SELECT
MAZ: DEF SUB18,6B#100100,SUB21
MAO: DEF SUB18,6B#100110,SUB21
MAC: DEF SUB18,6B#101010,SUB21
MAS: DEF SUB18,6B#101110,SUB21
;DEVICE DISABLE
ALUOFF: DEF 76X,1B#1,13X
ALLOFF: DEF 74X,3B#111,13X
;LOAD CONSTANT
CONST: DEF 16VXH#0%,4X,1B#0,69X
;BCD STATUS REGISTER CONTROL
ENR: DEF 16X,1B#0,73X
CLSR2: DEF 17X,1B#0,72X
ENSR1: DEF 18X,1B#1,71X
CZERO: DEF 19X,1B#0,70X

END
TOTAL PHASE 1 ERRORS = 0
```

```
;ADVANCED MICRO DEVICES
; AM2903 AND AM2904 CPUII SOURCE FILE
```
\begin{tabular}{|c|c|c|}
\hline 0100 & & ORG H\#100 \\
\hline 0100 & INP: & ALUOFF \& T \& OBF \& CJP \& GOTO INP \\
\hline 0101 & & ALUOFF \& PUSH \\
\hline 0102 & & IN \& T \& OBF \& LOOP \& PUT IOIN \\
\hline 0103 & & ALUOFF \& RTN \\
\hline 0104 & OUTP: & OUT \& CONT \& PUT YREG \\
\hline 0105 & & ALUOFF \& PUSH \\
\hline 0106 & & ALUOFF \& F \& ACK \& LOOP \& PUT IOUT \\
\hline 0107 & & ALUOFF \& PUSH \\
\hline 0108 & & ALUOFF \& T \& ACK \& LOOP \\
\hline 0109 & & ALUOFF \& RTN \\
\hline 010A & USM: & LOW R1 \& JSR \& GOTO INP \\
\hline 010B & & PAR R0,R15 \& JSR \& GOTO INP \\
\hline 010C & & LQPT R15 \& F \& GRD \& PUSH \& COUNT 00E \\
\hline 010D & & UMUL R1,R1,R0 \& F \& CNT \& SDDL \& RFCT \\
\hline 010E & & PAR R15,R1 \& JSR \& GOTO OUTP \\
\hline 010F & & QMOV R15 \& JSR \& GOTO OUTP \\
\hline 0110 & & JP \& GOTO USM \\
\hline 0111 & SM: & LOW R1 \& JSR \& GOTO INP \\
\hline 0112 & & PAR R0,R15 \& JSR \& GOTO INP \\
\hline 0113 & & LQPT R15 \& F \& GRD \& PUSH \& COUNT 00D \\
\hline 0114 & & TCM R1,R1,R0 \& F \& CNT \& SDDL \& RFCT \\
\hline 0115 & & TCMC R1,R1,R0 \& SDDL \& CONT CZ \\
\hline 0116 & & PAR R15,R1 \& JSR \& GOTO OUTP \\
\hline 0117 & & QMOV R15 \& JSR \& GOTO OUTP \\
\hline 0118 & & ALUOFF \& JP \& GOTO SM \\
\hline 0119 & DIV: & LOW R10 \& JSR \& GOTO INP \\
\hline 011A & & PAR R7,R15 \& JSR \& GOTO INP \\
\hline 011B & & PAR R1,R15 \& JSR \& GOTO INP \\
\hline 011C & & PAR R4,R15 \& CONT \\
\hline 011D & LOOP1: & PAR R3,R7 \& CONT \\
\hline 011E & & PAR R2,R1 \& T \& MIZ \& CJP \& GOTO ABORT \\
\hline 011F & & SMTC R2,R2 \& CONT CZ \\
\hline 0120 & & SMTC R3,R3 \& T \& MIC \& CJP CZ \& GOTO SCALE1 \\
\hline 0121 & & ALUOFF \& T \& MIO \& CJP \& GOTO SKIP6 \\
\hline 0122 & & SURL R3,R3 \& SUL \& CONT \\
\hline 0123 & & SURL R2,R2 \& SUL \& CONT \\
\hline 0124 & & ALUOFF \& JP \& GOTO LOOP2 \\
\hline 0125 & SCALE1: & LQPT R4 \& JSR \& GOTO SDIVD \\
\hline 0126 & & ALUOFF \& JP LOOP1 \\
\hline 0127 & LOOP2: & SSR R15,R3,R2,YBUS \& CONT ONE \\
\hline 0128 & SKIP6: & LQPT R4 \& F \& MIC \& CJP \& GOTO SKIP3 \\
\hline 0129 & & ALUOFF \& JSR \& GOTO SDIVD \\
\hline 012A & & SDRL R2,R2 \& SDL \& CONT \\
\hline 012B & & ALUOFF \& JP \& GOTO LOOP2 \\
\hline 012C & SKIP3: & ALUOFF \& F \& GRD \& LDCT \& COUNT 00C \\
\hline 012D & & DLN R1,R1,R7 \& T \& GRD \& RDU \& PUSH \\
\hline 012E & & TDIV R1,R1,R7 \& F \& CNT \& RDU \& RFCT CZ \\
\hline 012F & & TDC R1,R1,R7 \& SUH \& CONT CZ \\
\hline 0130 & & QMOV R15 \& JSR \& GOTO OUTP \\
\hline
\end{tabular}

\begin{tabular}{|c|c|c|}
\hline 0131 & & PAR R15,R1 \& JSR \& GOTO OUTP \\
\hline 0132 & & ALUOFF \& JP \& GOTO DIV \\
\hline 0133 & SDIVD: & PAR R1,R1 \& CONT \\
\hline 0134 & & ALUOFF \& T \& MIS \& CJP \& GOTO NEG \\
\hline 0135 & & PAR R1,R1,ADRQ \& SDDL \& CONT \\
\hline 0136 & & ALUOFF \& JP \& GOTO RET \\
\hline 0137 & NEG: & PAR R1,R1,ADRQ \& SDDL \& CONT \\
\hline 0138 & RET: & QMOV R4 \& CONT \\
\hline 0139 & & PAR R10,R10 \& RTN ONE \\
\hline 013A & SLNORM: & JSR \& GOTO INP \\
\hline 013B & & LQPT R15 \& CONT \\
\hline 013C & & SLN R2,R2,OFF \& CONT \& SHOLD \\
\hline 013D & & MAZ \& T \& CJP \& GOTO ABORT \\
\hline 013E & & MAC \& T \& LOW R0 \& CJP \& GOTO END \\
\hline 013F & & SLN R2,R2 \& MAC \& T \& CJP ONE \& GOTO END \& SUL \\
\hline 0140 & AGAIN: & SLN R2,R2 \& MIO \& F \& CJP ONE \& GOTO AGAIN \& SUL \\
\hline 0141 & & SDQP \& SMS \& CONT \\
\hline 0142 & & SRS R2,R2,R6 \& CONT \\
\hline 0143 & & QMOV R15 \& JSR \& GOTO OUTP \\
\hline 0144 & & PAR R15,R2 \& JSR \& GOTO OUTP \\
\hline 0145 & END: & JP \& GOTO SLNORM \\
\hline 0146 & DLNORM: & JSR \& GOTO INP \\
\hline 0147 & & LQPT R15 \& JSR \& GOTO INP \\
\hline 0148 & & DLN R15,R15,R15,OFF \& CONT \& SHOLD \\
\hline 0149 & & MAZ \& T \& CJP \& GOTO ABORT \\
\hline 014A & & LOW R2 \& MAC \& T \& CJP \& GOTO END2 \\
\hline 014B & & DLN R15,R15,R15 \& SDUL \& MAO \& T \& CJP \& GOTO JUMP1 \\
\hline 014C & LOOP4: & DLN R15,R15,R15 \& SDUL \& MIO \& T \& CJP \& GOTO JUMP1 \\
\hline 014D & & PAR R2,R2 \& JP ONE \& GOTO LOOP4 \\
\hline 014E & JUMP1: & PAR R2,R2 \& CONT ONE \\
\hline 014F & & SDRL R15,R15 \& SDMS \& JSR \& GOTO OUTP \\
\hline 0150 & & QMOV R15 \& JSR \& GOTO OUTP \\
\hline 0151 & END2: & JP \& GOTO DLNORM \\
\hline 0152 & SQRT: & LOW R10 \& CONT \\
\hline 0153 & & LOW R0 \& JSR \& GOTO INP \\
\hline 0154 & & PAR R1,R15 \& CONT \\
\hline 0155 & & PAR R2,R0,DARB \& CONST DOE 5 \& CONT \\
\hline 0156 & & \\
\hline 0157 & & PAR R4,R0,DARB \& CONST H 4 BFFF \& CONT \\
\hline 0158 & & PAR R5,R0,DARB \& CONST 408 D \& CONT \\
\hline 0159 & & PAR R6,R0,DARB \& CONST OUC8 \& CONT \\
\hline 015A & & SRS R0,R1,R5 \& CONT \& SHOLD \\
\hline 015B & CYCLE: & ADD R5,R5,R4 \& CONT \\
\hline 015C & & SDRL R4,R4 \& MAS \& CJP \& GOTO END3 \\
\hline 015D & & \\
\hline 015E & & OR R5,R3 \& JP \& GOTO CNT \\
\hline 015F & POS: & OR R5,R2 \& CONT \\
\hline 0160 & CNT: & SRS R0,R0,R10 \& CONT \\
\hline 0161 & & SDRL R2,R2 \& T \& MIZ \& CJP ,SHLD \& GOTO END3 \\
\hline 0162 & & SDRL R3,R3 \& T \& MAS \& CJP \& GOTO SUB \\
\hline 0163 & & ADD R0,R0,R5 \& JP \& \\
\hline 0164 & SUB: & SRS R0,R0,R5 \& JP \& GOTO CYCLE \& SHOLD \\
\hline 0165 & END3: & JP \& GOTO SQRT \\
\hline 0166 & ABORT: & ALUOFF \& JP \& GOTO ABORT \\
\hline 0167 & & JP \& GOTO DIV \\
\hline
\end{tabular}

AMDOS／29 AMDASM MICRO ASSEMBLER，V1．1
0100 XXXXXXXXXXXXXXXX XXXXXXXAQXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0000 \(1110100011 \mathrm{X01100} 0100000000\)
0101 XXXXXXXXXXXXXXXX XXXXXXYXDØXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXøDめD 1XXXXX0100X01XXX XXXXXXXXXX
D102 XXXXXXXXXXXXXXXX XXXXXXXXD0XXXXXX XXXX11111XXXXXXX X1111DDDDXXX0DØD \(1110101101 \mathrm{X00000} 0000000001\)
ø103 XXXXXXXXXXXXXXXX XXXXXXXYø日XXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXə000 \(1000001010 \mathrm{XD1XXX}\) XXXXXXXXXX
 1XXXXX1110X0C060 0000000610
0105 XXXXXXXXXXXXXXXX XXXXXXXX00XXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXZDD日 1XXXXXø100Xø1XXX XXXXXXXXXX
 \(1010011101 \times 010002000000100\)
Q107 XXXXXXXXXXXXXXXX XXXXXXXXXZXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXEXDO 1XXXXXV160XE1XXX XXXXXXXXXX
 1110011101 Xe 1 XYX XXXXXXXXX
Ø1øS YXXXXXXXXXXXXXXX XXXXXXXXDOXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXQ日DD \(1000001010 \times 0.1 \mathrm{XXX} \mathrm{XXXXXXXXXX}\)
 \(1000 \mathrm{coDe} \mathrm{\ell} 1 \mathrm{x00140}\)
－108060 0
D1DE XXXXXXXXXXXXXXXX XXXXXXXX00XXXXXX XXXXDD00DXXXX1111111101100000000 100060001800180012000602
E10C XXXXXZXXXXXXXXXX XXXXXXXXDEXXXXXX XXXXEXXXXXXXX1111011081100000800 1000000140 X 001000000001110
Q18D XXXXXXXXXXXXXXXX XXXXXXXX000110XX XXXX22．2010001000 0000002000000020 10111110e0000XXX XXXXXXXXX
010 E XXXXXXXXXXXXXXXX XXXXXXXXZDXXXXXX XXXXD1111XXXXDOD 1111101102000000

Q 10 F XXXXXXXXXXXXXXXX XXXXXXXXXDXXXXXX XXXXD1111XXXXXXX X111101000100000 \(1200000001 \times 001000100000100\)
Q116 XXXXXXXXXXXXXXXX XXXXXXXXXDXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXQDQD 1YXXXX1111X0X100 D10QLR1210
 10cedebey 1xeb1e0 0102002000
Q112 XXXXXXXXXXAXXXXX XXXXXXYXQDXXXXXX XYXXD2000XXXX111 1111101102002000 10000000018001000100000000
 100000 180 Xe．6100 D020民C1101
 10111118060e0 KXX XXXXXXXXXX
2115 XXXXXXXXXXXXXXXX XXXXXXXY100110YX XXXX000012001000 0011000000200000 1XXXXX1110DDDXXX XXXXXXXXXX
 \(1000000001 \mathrm{XeC100} 010000 \mathrm{Cl} 100\)
D117 XXXXXXXXXXXXXXXX XXXXXXXXDDXXXXXX XYXXX1111XXXXXXX X111121200100DD0

 1XXXXX1111XD1100 ： 10021001
Q119 XXXXXXXXXXXXXXXX XXXXXXXXDOXXXXXX XXXXD1Q10XXXXXXX X111112Q0XXXQQ\＆D \(1202000261 \times 021008100000008\)
©11A XXXXXXXXXXXXXXXX XXXXXXXXZDXXXXXX YXXXDD111XXXX111 1111101100800080 \(1000200801 \mathrm{Xe0100} 010000008\)
 1000000001 XDO 100 1002000002
 1XXXXX1118

\＆11I XXXXXXXXXXXXXXXX XXXXXXXXD：OXXXXXX XXXXD0011XXXXD111111101100000000 1 \(\mathrm{XXXXY} 1110 \mathrm{X} \varnothing 2 \mathrm{XXX} \mathrm{XXXXXXXXXX}\)
 \(1110110011 \times 0.01002101100110\)
\(011 F \mathrm{YXXXXXXXXXXXXXXX} \mathrm{XXXXXXXX10XXXXXX} \mathrm{XXXX000120012XXX} \mathrm{X01} \mathrm{\ell 100000000000}\) 1XXXXX111eXXeXXX XXXXXXXXXX
0120 XXXXXXXXXXXXXXXX XXXXXX0X10XXXX刃1 \(0110000110011 \mathrm{XXX} \mathrm{XD10100000000000}\) \(1110110011 \mathrm{X00100} 0100100101\)
 \(1110110011 \mathrm{X} 01120 \quad 100101000\)
 1XXXXX1110000XXX XXXXXXXXXX
12 2 xXXXXXXXXXKXXXXX XXXXXXXXD00010XX XXXX000100010YXX X100101000000000 1 \(\mathrm{XXXXX} 1112000 \mathrm{XXX} \times \mathrm{XXXXXXXXX}\)
 \(1 \mathrm{XXXXX1111} \mathrm{\times 01100} \mathrm{\quad 0100100111}\)
 \(1000000001 \times 00100 \quad 0100110011\)
 1 \(\mathrm{XYXXX} 1111 \mathrm{X} \varnothing 1 \mathrm{XXX}\) XXXXXXXXXX
0127 XXXXXXXXXXXXXXXX XXXXXXXX01XXXXXX XXXX011110011001 0110000100000000 1 XXXXX 1110 XD 0 XXX XXXXXXXXXX
\(012 \varepsilon\) XXXXXXXXXXXXXXXX XXXXXXQXe0XXXXD1 \(10100 X X X X X X X X \varrho 100011001100000000\) \(1010118011 \times 00100 \quad 0100101100\)
 \(1000000601 \times 011000100110011\)
Q12A XXXXXXXXXXXXXXXX XXXXXXYX000000XX XXXX000100010XXX X080181000000000 1 XXXXX 1110000 XXX XXXXXXXXXX


Ø12C XXXXXXXXXXXXXXXX XXXXXXXXD0XXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXX0ロ0日 \(1000601100 \times 611000000021100\)
 \(1100000100000 \times \mathrm{XXX} \times \mathrm{XXXXXXXXX}\)
 \(1011111000.00 \times X X \quad X X X X X X X X X X\)
\(012 \mathrm{~F} \quad \mathrm{XXXXXXXXXXXXXXXX} \mathrm{XXXXXXXX100011XX} \mathrm{XXXXD000100010111111000000000000}\) 1 \(\mathrm{XXXXX1110000XXX} \mathrm{XXXXXXXXXX}\)
 \(1000000601 \times 0.1000100000100\)
\＆1 21 XXXXXXXXXXXXXXXX XXXXXXXXø0XXXXXX XXXXD1111XXXX0001111101100000000 \(1000000001 \times 601000100000100\)
 1XXXXX1111X01100 0100011001
 1 XXXXX1110Xロ0XXX XXXXXXXXXX
 \(1110110011 \times 011000100110111\)
0135 XXXXXXXXXXXXXXXX XXXXXXXX0ø0110XX XXXXD0001XXXXø001001001100000000 1 XXXXX 1110002 XXX XXXXXXXXXX
 1XXXXX1111 X01100 0100111000
0137 XXXXXXXXXXXXXXXX XXXXXXXX000110XX XXXX00001XXXX0001001001100000000 1 XXXXX11102e0XXX．XXXXXXXXX
0138 XXXXXXXXXXXXXXXX XXXXXXXXøøXXXXXX XXXXø0100XXXXXXX X11110100ø1000ฎ0 1 XXXXX 1110 X 00 XXX XXXXXXYXXX
0139 XXXXXXXXXXXXXXXX XXXXXXXX01XXXXXX XXXX01010XXXX101 0111101100000000 \(1000001010 \times 06 \times X X \quad X X X X X X X X X X\)


C13A XXXXXXXXXXXXXXXX XXXXXXXXQXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXX2Q®®

013 F XXXXXXXXXXXXXXXX XXXXXXXXZ0XXXXXX XXXXDXXXXXXYX1111011001100000000 \(1 \times X X X \times 1110 \times 06 \mathrm{XXX}\) XXXXXXXXXX
 1 XXXXX 1110 DV 1 XXX XXXXXXXXXX
E13I XXXXXXXXXXXXXXXX XXXXXXQXQQXXXX10 \(010 \ell X X X X X X X X X X X X ~ X X X X X X X X X X X X \varnothing \varnothing 0 \ell\)

毛13E XXXXXXXXXXXXXXXX XXXXXXXXDDXXXX10 \(1 \times 1000000 X X X X X X X ~ X 11111000 X X X 0000\) \(1110110011 \times 001200101000101\)
 1110110011000100010100.101
 10101100110001008101000002
0141 XXXXXAXXXXAXXXXX XXXXXXXX0DDD1XXX XXXXDXXXXXXXXXXX XQ1Q1XXXXQQQQD日Q 1 XXXYX 1110000 XXX XXXXXXXXXX
 \(1 \mathrm{XXXXX} 1110 \mathrm{X} \triangle \mathrm{XXX}\) XXXXXXXXXX
Q143 XXXXXXXXXXXXXXXX XXXXXXXXQQXXXXXX XXXXø1111XXXXXXX X11110100010200Q



\＆ 145 XXXXXXXXXXXXXXXX XXXXXXXXQOXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXDODQ \(1 \times \times \times X \times 1111 \times 0 \times 1008100111010\)
Q146 XXXXXXXXXXXXXXXX XXXXXXXXQ冃XXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXø冃冃冃 1860068も01：0X100 8100000000
 \(1200600081 \times 001008180000000\)
 1XXXXX1110XD1XXX XXXXXXXXXX
 \(1110110011 \times 0 \times 1062101100110\)
Х14A XXXXXXXXXXXXXXXX XXXXXX0XDDXXXX1D 101000010XXXXXXX X11111000XXXD000 \(1110110011 \times 001002101010001\)
 11101100110001000101001110
014 C XXXXXXXXXXXXXXXX XXXXXX0X0QQ11001 \(011001111111111111810800000000 \ell 0\) 11101100110001000101001110
 1XXXXX1111Xe01ष0 \(\quad 101001100\)
 1 XXXXX1110X60XXX XXXXXXXXXX
 \(1000000001000180.10 \mathrm{Cib0} \mathrm{\ell 100}\)
\(\oslash 150\) XXXXXXXXXXAXXXXX XXXXXXXXQ0XXXXXX XXXXZ1111XXXXXXX X1111®1000100000 1000060001X0610D 0106000100
6151 גXXXXXXXXXAXXYXX XXXXXXXXDøXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXQQ\＆D 1XXXXX1111XQX1ष0 101000110


8153 XXXXXXXXXXXXXXXX XXXXXXXXøDXXXXXX XXXXQQD0日XXXXXXX X11111000XXXø000 \(1000000001 \times 001008100000000\)
 1 XXXXX 1110 XD 0 XXX XXXXXXXXXX
 \(1 \mathrm{XXXXX1} 110 \mathrm{Xe日XXX} \mathrm{XXXXXXXXXX}\)
 1 XXXXX 1110 X 00 XXX XXXXXXXXXX


\author{
 \(1 \times X X X X 1110 \times 63 \times X X X X X X X X X X X X\) \\  \(1 \mathrm{XXXXX1110} \mathrm{\times 00XXX} \mathrm{XXXXXXXXXX}\) \\ \(315=\leftarrow 000000000001020\) XXXX0XXX00XXXXXX XXXXD0110XXXX000 5111101101000000 \(1 \mathrm{XXXXX} 1110 \times 0.0 \mathrm{XXX} \mathrm{XXXXXXXXXX}\) \\  1 XXXXX111~XD0XXX XXXXXXXXXX \\  1 XXXX .1116 XCOXXX XXXXXXXXXX \\ 0150 XXXXXXXXXXXXXXXX XXXXXX0X00XXXX10 1110001000100XXX X000101000000000
 \\  \(1110110011 \times 0.100 \quad 2101011111\) \\  \(1 \mathrm{XXXXX} 1111 \mathrm{X00100} 0181100000\) \\ 015 F XXXXXXXXXXXXXXXX XXXXXXXXD0XXXXYX XXXX001010010XXX X111111110000000 1XXXXX1110 \(000 \times X X \quad X X X X X X X X X X\) \\  1XXXXX1110X0.DXXX XXXXXXXXXX \\ 2161 xXXXXXXYXXXXXXYX XXXXXX0X00XXYXQ1 \(01000 \downarrow 0100010 \times X X X 000101000000000\) \(1110110011 \times 101008101100101\) \\ \(016 \approx \mathrm{XXXXXXXXXXXXXXXX} \mathrm{XXXXXXDXDOXXXX10} 1110000110011 X X X \quad X 002101200000000\) \(1110110811 \times 00102 \times 101100100\) \\  \(1 \times X X X X 1111 \times 60100 \quad 2101011011\) \\ Q16 \(4 \times X X X X X X X X X \triangle X X X X X ~ X X X X X X X 022 X X X X X X ~ X X X X 0000000000101111100010000000\)
 \\ 0165 XXXXXXXXXXXXXXXX XXXXXXXXDOXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXD \(0 \varnothing \varnothing\) \(1 \mathrm{XXXXX} 1111 \times 0 \times 100 \quad 101010010\) \\  1 XXXXX1111X义1100 010110飞110 \\  \(1 \times \times \times X X 1111 \times 0 \times 100 \varnothing 100011001\)
}

\section*{Am2903 MNEMONICS}

\section*{\(I_{0}\) FUNCTION}
\begin{tabular}{ll} 
RAMB & RAM B - OUTPUT \\
Q & Q REGISTER \\
SPF & SPECIAL FUNCTIONS
\end{tabular}

\section*{ALU Functions}
\begin{tabular}{|c|c|c|}
\hline SPF & Special Functions & \\
\hline HIGH & \(F_{i}=\) HIGH & HIGHS \\
\hline SRS & Subtract R from S & \(S-R-1+C_{n}\) \\
\hline SSR & Subtract S from R & \(R-S-1+C_{n}\) \\
\hline ADD & Add R and S & \(R+S+C_{n}\) \\
\hline PAS & Pass S & \(S+C_{n}\) \\
\hline COMS & 2's Complement S & \(\bar{S}+C_{n}\) \\
\hline PAR & Pass R & \(R+C_{n}\) \\
\hline COMR & 2's Complement R & \(\bar{R}+C_{n}\) \\
\hline LOW & \(F_{i}=\) LOW & LOW'S \\
\hline CRAS & Complement R AND with S & \(\bar{R} \wedge S\) \\
\hline XNRS & Exclusive NOR R with S & \(\overline{R \oplus S}\) \\
\hline XOR & Exclusive OR R with S & \(R \oplus S\) \\
\hline AND & AND R with S & \(R \wedge S\) \\
\hline NOR & NOR R with S & \(\overline{R \vee S}\) \\
\hline NAND & NAND R with S & \(\overline{R \wedge S}\) \\
\hline OR & OR R with S & \(R \vee S\) \\
\hline
\end{tabular}

\section*{ALU Destination Control}

\begin{tabular}{lll}
 & ADR & Arithmetic Shift Down, Results Into RAM \\
 & LDR & Logical Shift Down, Results Into RAM \\
 & ADRQ & Arithmetic Shift Down, Results Into RAM and Q Register \\
* & LDQP & Logical Shift Down Contents of Q Register, Generate Parity \\
* & QPT & Results Into Q Register, Generate Parity \\
 & RQPT & Results Into RAM and Q Register, Generate Parity \\
 & AUR & Arithmetic Shift Up, Results Into RAM \\
 & LUR & Logical Shift Up, Results Into RAM \\
 & AURQ & Arithmetic Shift Up, Results Into RAM and Q Register \\
 & LURQ & Logical Shift Up, Results Into RAM and Q Register \\
* & YBUS & Results to Y BUS Only \\
* & LUQ & Logical Shift Up Contents of Q Register \\
 & SINX & \\
 & REG & Results to RAM, Sign Extend
\end{tabular}

\(*=\overline{\text{WRITE}}=H\)

\section*{Special Functions}

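\begin{tabular}{ll}
UMUL & Unsigned Multiply \\
TCM & Two's Complement Multiply \\
INC & Increment by One or Two \\
SMTC & Sign/Magnitude to Two's Complement \\
TCMC & Two's Complement Multiply, Last Cycle \\
SLN & Single Length Normalize \\
DLN & Double Length Normalize and First Divide Operation \\
TDIV & Two's Complement Divide \\
TDC & Two's Complement Division Correction
\end{tabular}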






Chapter V
Program Control Unit

\section*{Introduction}

To access instructions and data in an orderly manner, a computer usually relies on a Program Control Unit, which provides an efficient mechanism for program control. A program is a set of instructions that directs the processor to perform a specific task. Ordinarily, program instructions are stored in sequential memory locations. During the normal processing of a program, an instruction is fetched from the location specified by the program counter, the instruction is executed, the program counter is incremented, and another fetch-and-execute cycle begins. Such a control unit may employ a wide variety of addressing mechanisms; indeed, some machines use dozens of addressing modes to fetch instructions and data. This discussion of program control units covers several of these addressing modes and their common implementation techniques. The addressing modes commonly used in today's machines include register, immediate, direct, indirect, indexed, and relative addressing, and various combinations thereof.

\section*{Data Formats}

Technically, an instruction set manipulates data words of various lengths. Generally speaking, most 16-bit minicomputers can manipulate data of three different lengths: 8-bit bytes, 16-bit words, and 32-bit double words. This data may represent fixed point numbers, floating point numbers, or logical data. The data serves as operands for the instructions and is manipulated as indicated by the particular instruction being executed.
Typically, fixed point data is treated as signed 15-bit integers in the 16-bit representation or as signed 31-bit integers in the 32-bit double-length notation. Positive and negative numbers are represented in ordinary 2's complement notation, with the sign bit carrying negative weight. Positive numbers have a sign bit of zero and negative numbers have a sign bit of one. The numerical value zero is always represented with all bits LOW.
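The negative weight of the sign bit can be illustrated with a short sketch. The Python below is not part of the original design; the function names `to_signed` and `to_pattern` are hypothetical and chosen only for this illustration.

```python
def to_signed(value: int, bits: int = 16) -> int:
    """Interpret an unsigned bit pattern as a 2's complement integer."""
    value &= (1 << bits) - 1          # keep only the low `bits` bits
    sign_weight = 1 << (bits - 1)     # magnitude of the sign bit's weight
    # The sign bit contributes -2**(bits-1); all other bits are positive.
    return (value & (sign_weight - 1)) - (value & sign_weight)


def to_pattern(number: int, bits: int = 16) -> int:
    """Return the 2's complement bit pattern of a signed integer."""
    return number & ((1 << bits) - 1)
```

With 16 bits, the pattern 0x8000 has only the sign bit set and therefore decodes to the most negative value, while 0x0000 is the unique representation of zero.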
Floating point numbers consist of a signed exponent and a signed fraction. Manufacturers use many different formats to express floating point data, and these variations will not be described here. It suffices to say that a floating point number represents a quantity expressed as the product of a fraction and the number 2 raised to the power of the exponent. In some cases, the number 16 is raised to the power of the exponent instead. Typically, all floating point numbers are assumed to be normalized prior to their use as operands. No pre-normalization is performed and all results are post-normalized. Usually, the floating point instruction set will normalize un-normalized floating point numbers.
Logical operations are used to manipulate 8-bit bytes, 16-bit words, or 32-bit double words. All bits participate in the logical operations.

\section*{Instruction Formats}

Various minicomputers use different types of instruction formats, ranging from very simple, straightforward formats to more complicated formats that are difficult to decode. For example, a register-to-register format can consist of a simple 8-bit opcode and two 4-bit source operand specifiers. On the other hand, it may consist of a byte or word specifier, an opcode specifier, source and destination register specifiers, and mode specifiers for each of the source and destination register selections. Again, it is not the purpose of this application note to describe all of the trade-offs in selecting instruction formats, but rather to select a simple format so that the student of bipolar microprogrammed microprocessors can understand the techniques used by instructions for operating the machine.

Thus, we will use a few 16-bit and 32-bit formats in this application note to demonstrate the function of the program control unit in various types of instruction execution.

\section*{Instruction Types}

For purposes of this application note, we will define nine different instruction types using various addressing modes. As we define these instruction types, we will use the basic ADD instruction as the example in all cases. The instructions operate similarly for all arithmetic and logical operations; using the ADD instruction simply makes each one easier to describe than a fully general treatment would. Figure 1 shows all nine instruction types with their names. As is seen, four of the instruction types are single 16-bit word instructions, while five are double-word, or 32-bit, instructions. The advantage of the double-word instructions is that the second word can be used either as an address, to be combined with an index value, or as data serving as an immediate value.

\section*{Register-to-Register Instructions}

Executing a register-to-register (RR) instruction is simply a matter of selecting two of the machine's internal working registers and performing the desired operation on them. The instruction is fetched from memory and placed in the instruction register, and the registers specified by the R1 and R2 fields are selected as the two source operands for the ALU. In the instruction format shown in Figure 1 for the register-to-register instruction, the 8-bit opcode field specifies the machine operation to be performed. The next 4-bit field, R1, specifies the address of the first operand; in most machines this is the address of a general register. The 4-bit R2 field specifies the address of the second operand, which is also normally the address of a general register. In most machines, the R1 field selects the destination general register as well as a source operand, so the result of the ALU operation is stored in the register selected by the R1 field.
The RR instructions are used for operations between registers. We are assuming in this discussion that the machine contains 16 general registers which function as accumulators or index registers in all arithmetic and logical operations. Each general register contains a 16-bit word consisting of two 8-bit bytes. For arithmetic operations, the most significant bit is considered the sign bit using 2's complement representation. The general registers of the machine are usually numbered from 0 to 15 (decimal) and written in hexadecimal notation as 0 through F. In this example, the general registers have not been given specific functional assignments. However, in some machines certain registers are assumed to perform specific functions. These can include specific stack pointer registers and program counter registers. Figure 2 depicts the typical signal path for executing the RR instruction in a bit-slice system.
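The field layout of the RR format can be sketched in a few lines of Python. This is an illustration only: it assumes bit 0 in Figure 1 is the most significant bit, so the opcode occupies the upper byte, and the opcode value `ADD_RR` and the function names are hypothetical.

```python
from typing import List, NamedTuple

MASK16 = 0xFFFF
ADD_RR = 0x0A  # hypothetical opcode value, chosen only for this sketch


class RRFields(NamedTuple):
    opcode: int  # 8-bit operation code
    r1: int      # 4-bit first operand and destination register address
    r2: int      # 4-bit second operand register address


def decode_rr(word: int) -> RRFields:
    """Split a 16-bit RR instruction word into its three fields."""
    word &= MASK16
    return RRFields(opcode=word >> 8, r1=(word >> 4) & 0xF, r2=word & 0xF)


def execute_add_rr(word: int, regs: List[int]) -> None:
    """Perform (R1) <- (R1) + (R2) on a 16-entry register file."""
    fields = decode_rr(word)
    if fields.opcode != ADD_RR:
        raise ValueError("not the ADD opcode assumed by this sketch")
    regs[fields.r1] = (regs[fields.r1] + regs[fields.r2]) & MASK16
```

For example, the word 0x0A37 would select register 3 as the first operand and destination and register 7 as the second operand.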
The actual operation of the register-to-register instruction is as follows. First, the instruction is fetched and placed in the instruction register as shown in Figure 2. This is part of the fetch routine. Next, the instruction is decoded via the mapping PROM, and the appropriate microinstruction in the microprogram memory is selected and placed in the pipeline register. Then the instruction is executed: the two registers in the general purpose register array of the Am2903 are selected by the contents of the R1 and R2 fields of the instruction register. The actual microcode required to execute this instruction is shown in Figure 3. Here, we assume the Program Counter (PC) value is contained in one of the general registers and can be selected by microcode as well as by the R1 and R2 fields. This was shown in Chapter 3.

\begin{tabular}{|l|l|l|l|}
\hline Instruction Type & Word 1 (Bits 0-15) & Word 2 (Bits 16-31) & ADD Instruction \\
\hline Register-to-Register & OP, R1, R2 & & \((\mathrm{R1}) \leftarrow(\mathrm{R1})+(\mathrm{R2})\) \\
\hline Register-to-Memory Reference & OP, R1, X2 & & \((\mathrm{R1}) \leftarrow(\mathrm{R1})+[(\mathrm{X2})]\) \\
\hline Memory-to-Memory & OP, X1, X2 & & \([(\mathrm{X1})] \leftarrow[(\mathrm{X1})]+[(\mathrm{X2})]\) \\
\hline Register Short Immediate & OP, R1, DATA & & \((\mathrm{R1}) \leftarrow(\mathrm{R1})+\) DATA \\
\hline Register-to-Indexed Memory & OP, R1, X2 & ADDRESS & \((\mathrm{R1}) \leftarrow(\mathrm{R1})+[(\mathrm{X2})+\mathrm{A}]\) \\
\hline Register-to-Memory Immediate & OP, R1, X2 & DATA & \((\mathrm{R1}) \leftarrow(\mathrm{R1})+\) DATA \(+[(\mathrm{X2})]\) \\
\hline Memory-to-Memory Indexed & OP, X1, X2 & ADDRESS & \([(\mathrm{X1})] \leftarrow[(\mathrm{X1})]+[(\mathrm{X2})+\mathrm{A}]\) \\
\hline Register Immediate & OP, R1 & DATA & \((\mathrm{R1}) \leftarrow(\mathrm{R1})+\) DATA \\
\hline Memory Immediate & OP, X1 & DATA & \([(\mathrm{X1})] \leftarrow[(\mathrm{X1})]+\) DATA \\
\hline
\end{tabular}

Note: (R1) means the contents of register R1. \([(\mathrm{X1})]\) means the contents of the word whose address is in register X1.

Figure 1. Various Instruction Types for the ADD operation.

Figure 2. Register-to-Register Instructions Select Two Registers in the Am2903 Array for Instruction Execution.

\section*{Register-to-Memory-Reference}

In the register-to-memory-reference instruction, the contents of the memory location pointed to by the register identified in the X2 field are fetched from memory and then added to the value in the register specified by the R1 field. The result of this operation is placed in the register specified by the R1 field.

Figure 4 shows a general block diagram of the hardware used to implement the instruction types described in the first part of this application note. As shown, the memory address register can be driven by either the Y outputs or the DB outputs of the Am2903s.

In addition, the Y outputs of the Am2903s can be placed onto the memory data bus by means of a three-state buffer. The computer control unit is intended to be representative of that described in Chapter 2 of this application note series. For purposes of this discussion, we assume the program counter (PC) is one of the general purpose registers within the Am2903 register stack. Later, we will change this concept and use a PC external to the Am2903.

The operation of the register-to-memory-reference instruction as depicted in Figure 1 can best be described by referring to Figure 5. Here, we see the first three microinstructions that represent the fetch routine for the currently described machine. First, the contents of the program counter are placed in the memory address register, and the incremented value is returned to the PC register.


Figure 3. Register-to-Register Instruction Microcode.


Figure 4. Simple Memory Addressing Scheme with PC in the ALU.


Figure 5. Register to Memory Reference Instruction Microcode.

Next, the instruction is fetched from memory and placed in the instruction register within the CCU. Third, the instruction is decoded via the mapping PROM and the appropriate microinstruction is selected and placed in the pipeline register. To execute this particular register-to-memory-reference instruction, it is necessary to place the contents of the register specified by the X2 field into the memory address register. Then the contents of memory can be fetched and the operand added to the value currently contained in the register specified by the R1 field. The result of this operation is placed in the register specified by the R1 field. In total, the execution of this register-to-memory-reference instruction requires five microcycles, as depicted in this example.
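The five microcycles can be traced with a small behavioral model. The sketch below is not the microcode of Figure 5; it assumes a word-addressed memory held in a dictionary, the PC kept in general register 15, and R1 and X2 in the two low 4-bit fields of the instruction word. All names are hypothetical.

```python
from typing import Dict, List

MASK16 = 0xFFFF
PC = 15  # assumption: the PC lives in general register 15


def add_register_to_memory(regs: List[int], mem: Dict[int, int]) -> List[str]:
    """Model (R1) <- (R1) + [(X2)], returning one label per microcycle."""
    steps = []
    mar = regs[PC]                        # memory address register
    regs[PC] = (regs[PC] + 1) & MASK16
    steps.append("PC -> MAR; PC + 1 -> PC")
    ir = mem[mar]                         # instruction register
    steps.append("Fetch Inst to IR")
    r1, x2 = (ir >> 4) & 0xF, ir & 0xF    # stands in for the mapping PROM
    steps.append("Decode")
    mar = regs[x2]
    steps.append("(X2) -> MAR")
    regs[r1] = (regs[r1] + mem[mar]) & MASK16
    steps.append("MEM + (R1) -> (R1)")
    return steps
```

The length of the returned list corresponds to the five microcycles described above: three for the fetch routine and two for execution.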

\section*{Memory to Memory}

In this instruction, the operand in the memory location pointed to by the register specified in the X2 field is fetched, the operand in the memory location pointed to by the register specified in the X1 field is fetched, and the two operands are added together. At the completion of the instruction, the result is placed in the memory location addressed by the contents of the register specified in the X1 field.
The Memory to Memory Instruction operation is also depicted by the block diagram shown in Figure 4. In fact, all of the next six instructions to be defined utilize the block diagram of Figure 4 to represent the hardware required for implementing these instructions.
The microcode required for the memory-to-memory instruction is detailed in Figure 6. The first three microinstructions represent the fetch routine. In the fourth microinstruction, the contents of the register specified by the X2 field are placed in the memory address register. Then, in the fifth microinstruction, the contents of this memory location are loaded into the Q register within the Am2903. This value is temporarily held for later use. In the sixth microinstruction, the contents of the register specified by the X1 field in the instruction are placed in the memory address register. In the seventh microinstruction, this operand is fetched from memory and added to the contents of the Q register, with the result being placed in the Q register. In the eighth microinstruction, the current contents of the Q register are returned to memory. The memory location is the one addressed by the register specified in the X1 field, whose address is still in the memory address register. Thus, we have used the Q register as a temporary holding register for the data used in this instruction.
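The role of the Q register as a temporary holding register can be seen in a behavioral sketch of the eight microinstructions. As before, this is only an illustration under stated assumptions: word-addressed memory in a dictionary, the PC in general register 15, and X1 and X2 in the two low 4-bit fields of the instruction word. The function name is hypothetical.

```python
from typing import Dict, List

MASK16 = 0xFFFF
PC = 15  # assumption: the PC lives in general register 15


def add_memory_to_memory(regs: List[int], mem: Dict[int, int]) -> List[str]:
    """Model [(X1)] <- [(X1)] + [(X2)], returning one label per microcycle."""
    steps = []
    mar = regs[PC]
    regs[PC] = (regs[PC] + 1) & MASK16
    steps.append("PC -> MAR; PC + 1 -> PC")
    ir = mem[mar]
    steps.append("Fetch Inst to IR")
    x1, x2 = (ir >> 4) & 0xF, ir & 0xF
    steps.append("Decode")
    mar = regs[x2]
    steps.append("(X2) -> MAR")
    q = mem[mar]                  # first operand parked in Q
    steps.append("MEM -> Q")
    mar = regs[x1]
    steps.append("(X1) -> MAR")
    q = (mem[mar] + q) & MASK16   # second operand added to Q
    steps.append("MEM + Q -> Q")
    mem[mar] = q                  # MAR still holds the (X1) address
    steps.append("Q -> MEM")
    return steps
```

Note that the final write reuses the address left in the MAR by the sixth step, which is why no additional address cycle is needed.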

\section*{Register with Short-Immediate}

In this instruction, a 4-bit field is added to the contents of the register specified by the R1 field. Thus, short jumps or branches can be executed within a range of zero to fifteen memory locations. The more significant 12 bits of the word are zero filled.
The register with short-immediate instruction operates very similarly to the register-to-register instruction. The microcode for this instruction is shown in Figure 7. The only difference between the two is that, instead of adding operands specified by the R1 and R2 fields, we take the data value contained in a 4-bit field in the instruction, as depicted in Figure 1, and add it to the contents of the register specified in the R1 field. The result of the operation is returned to the register specified by the R1 field. This addition is performed by taking the 4-bit data value shown in Figure 1 as DATA and zero filling the twelve most significant bits. This gives a 16-bit word ranging in value between zero and fifteen. Thus, short jumps can be implemented using this technique.
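The zero fill amounts to masking off everything but the 4-bit DATA field before the addition. A minimal sketch, assuming DATA occupies the four least significant bits of the instruction word and R1 the next four (the function name is hypothetical):

```python
from typing import List

MASK16 = 0xFFFF


def add_short_immediate(word: int, regs: List[int]) -> int:
    """Model (R1) <- (R1) + DATA with a zero-filled 4-bit DATA field."""
    r1 = (word >> 4) & 0xF
    data = word & 0x000F          # upper twelve bits forced to zero
    regs[r1] = (regs[r1] + data) & MASK16
    return data
```

Because the operand is zero filled rather than sign extended, the value added is always in the range 0 through 15.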
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline PC \(\rightarrow\) MAR; PC + \(1 \rightarrow\) PC & X & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & X & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline (X2) \(\rightarrow\) MAR & & & & X & & & & & & & & & \\
\hline MEM \(\rightarrow\) Q & & & & & x & & & & & & & & \\
\hline \((\mathrm{X} 1) \rightarrow\) MAR & & & & & & X & & & & & & & \\
\hline \(\mathrm{MEM}+\mathrm{Q} \rightarrow \mathrm{Q}\) & & & & & & & X & & & & & & \\
\hline \(Q \rightarrow\) MEM & & & & & & & & X & & & & & \\
\hline
\end{tabular}

Figure 6. Memory to Memory Instruction Microcode.


Figure 7. Register Short Immediate Instruction Microcode.
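The zero fill used by the short-immediate instruction can be sketched as follows. The position of the DATA field in the low four bits of the instruction word is an assumption made here for illustration; Figure 1 defines the actual format.

```python
MASK = 0xFFFF  # assumed 16-bit data path

def short_immediate(instruction, r1_value):
    """Add the zero-filled 4-bit DATA field to the value held in R1."""
    data = instruction & 0x000F      # assumed position of the DATA field
    operand = data                   # upper 12 bits are zero, so 0..15
    return (r1_value + operand) & MASK

print(hex(short_immediate(0x12AF, 0x0100)))  # prints 0x10f
```

Because the upper twelve bits are always zero, the operand can only move the result forward by zero to fifteen locations.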

\section*{Register to Indexed Memory}

The 16-bit word in the register defined by X2 in the instruction is added to the address that is the second word of the instruction. Then, this address is used to fetch an operand from memory which is added to the contents of the register pointed to by R1. The results of this operation are then placed in R1. The instruction format for this instruction was shown in Figure 1.
The Register to Indexed Memory instruction is shown in Figure 8 and executed in the following manner. First, the current PC value is placed in the MAR and PC + 1 is returned to the PC register. Next, the instruction at this memory location is fetched and placed in the instruction register. On the third cycle this instruction is decoded and the contents of the microprogram memory are placed in the pipeline register. On the fourth microinstruction, the PC value is again placed in the MAR and PC + 1 is returned to the PC register. On the fifth microinstruction, the value at this location in memory is fetched and added to the contents of the X2 register, with the result being placed in the MAR. On the sixth microinstruction, the operand pointed to by this address is fetched and added to the contents of R1, with the result being placed in the register pointed to by the R1 field of the instruction.
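As a rough illustration of the indexed address calculation, the three execution steps of Figure 8 can be modeled as below. The dictionaries for registers and memory, the function name, and the sample values are hypothetical; `pc` is taken to point at the second word of the instruction, as it does after the fetch routine.

```python
MASK = 0xFFFF  # assumed 16-bit data path

def register_to_indexed_memory(regs, mem, pc, r1, x2):
    """Steps 4 through 6 of Figure 8; returns the updated PC."""
    mar, pc = pc, (pc + 1) & MASK              # T3: PC -> MAR; PC + 1 -> PC
    mar = (mem[mar] + regs[x2]) & MASK         # T4: MEM + X2 -> MAR
    regs[r1] = (mem[mar] + regs[r1]) & MASK    # T5: MEM + R1 -> R1
    return pc

regs = {1: 10, 2: 4}                           # R1 = 10, X2 = 4
mem = {0x0021: 0x0300, 0x0304: 32}             # base address word, operand
next_pc = register_to_indexed_memory(regs, mem, pc=0x0021, r1=1, x2=2)
print(regs[1], hex(next_pc))                   # prints 42 0x22
```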

\section*{Register to Memory Immediate}

In the register to memory immediate instruction, the contents of the memory location pointed to by the register specified in the X2 field is fetched from the memory and the data value which is in the second word of the instruction is also fetched from memory and added to it. This result is then added to the contents of the R1 register and the final result replaces the value currently in R1.

The register to memory immediate instruction as shown in Figure 1 is implemented using the microcode shown in Figure 9. Again, the first three microinstructions are the fetch routine. The fourth microinstruction is used to take the contents of the register specified by the X2 field and place it in the memory address register. Next, the operand at this memory location is brought into the Am2903's and added to the contents of the register specified by the R1 field with the results returned to that register. The sixth microinstruction is used to set up the memory address register to fetch the second word of the instruction. The seventh microinstruction brings this data value into the Am2903 ALU via the data bus and adds this value to the contents of the register specified by the R1 field. The result of the operation is placed into the register specified by the R1 field.

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & x & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & x & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & & & & x & & & & & & & & & \\
\hline \(\mathrm{MEM} \mathrm{+} \mathrm{X2} \rightarrow\) MAR & & & & & x & & & & & & & & \\
\hline \(\mathrm{MEM} \mathrm{+} \mathrm{R1} \rightarrow\) R1 & & & & & & x & & & & & & & \\
\hline
\end{tabular}

Figure 8. Register to Indexed Memory Instruction Microcode.

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline \(P C \rightarrow M A R ; P C+1 \rightarrow P C\) & X & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & X & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline (X2) \(\rightarrow\) MAR & & & & X & & & & & & & & & \\
\hline \(\mathrm{MEM}+\mathrm{R} 1 \rightarrow \mathrm{R} 1\) & & & & & x & & & & & & & & \\
\hline \(P C \rightarrow M A R ; P C+1 \rightarrow P C\) & & & & & & x & & & & & & & \\
\hline \(\mathrm{MEM} \mathrm{+} \mathrm{R1} \rightarrow\) R1 & & & & & & & X & & & & & & \\
\hline
\end{tabular}

Figure 9. Register to Memory Immediate Instruction Microcode.

\section*{Memory to Memory Indexed}

The memory to memory indexed instruction is one whereby the contents of the register specified in the X2 field are added to the second word of the instruction to form a new address. This address is then used to fetch an operand. A second operand is selected by taking the contents of the register specified in the X1 field and using that as a memory address. The two operands are added, and the result is stored back in the memory location pointed to by the contents of the register specified in the X1 field.

The memory to memory indexed instruction is probably the most complicated of the instruction formats described in this application note. In all, nine microinstructions are required for its implementation. The first three microinstructions are again the basic fetch routine: they fetch the instruction from memory, place it in the instruction register, and decode it for initial operation. Microinstruction number 4 sets up the memory address register to fetch the second word of the instruction, and microinstruction number 5 is used to bring this value from memory into the Am2903 ALU where it is added to the X2 register. The results of the addition are placed into the memory address register during this microinstruction. This value is used to fetch a value from memory which is placed in the Q register using microinstruction number 6. In the seventh microinstruction, the contents of the register pointed to by the X1 field are placed in the memory address register so that microinstruction eight can be utilized to bring this memory value into the Am2903s, where it is added to the contents of the Q register with the result being placed into the Q register. Microinstruction number 9 is used to place this value back into the memory location specified by the contents of the register pointed to by the X1 field. This memory address is still contained in the memory address register so that no updating is required. The total microcode required to implement this instruction routine is shown in Figure 10.
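The six execution steps just described (microinstructions 4 through 9) can be followed in a short Python trace. This is a sketch under assumed names: the register and memory dictionaries and the sample addresses are hypothetical, and `pc` is taken to point at the second word of the instruction on entry.

```python
MASK = 0xFFFF  # assumed 16-bit data path

def memory_to_memory_indexed(regs, mem, pc, x1, x2):
    """Microinstructions 4 through 9 of Figure 10; returns the updated PC."""
    mar, pc = pc, (pc + 1) & MASK         # 4: PC -> MAR; PC + 1 -> PC
    mar = (mem[mar] + regs[x2]) & MASK    # 5: MEM + X2 -> MAR
    q = mem[mar]                          # 6: MEM -> Q
    mar = regs[x1]                        # 7: (X1) -> MAR
    q = (mem[mar] + q) & MASK             # 8: MEM + Q -> Q
    mem[mar] = q                          # 9: Q -> MEM, MAR unchanged
    return pc

regs = {1: 0x0400, 2: 3}                  # X1 holds an address, X2 an index
mem = {0x0051: 0x0200, 0x0203: 9, 0x0400: 1}
memory_to_memory_indexed(regs, mem, pc=0x0051, x1=1, x2=2)
print(mem[0x0400])                        # prints 10
```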

\section*{Register Immediate}

The register immediate instruction is a very useful instruction which allows data to be added to the contents of the register. In this example, the second word of the instruction is fetched and added to the contents of the register specified in the R1 field.

Figure 11 depicts the microcode used to implement the register immediate instruction. Here, the first three microinstructions are the fetch routine for the instruction. The fourth microinstruction of this routine sets up the MAR to fetch the second word of the two word instruction. In the fifth microinstruction, the contents of this memory location are brought into the Am2903 ALU and added to the contents of the register specified by the R1 field. The result of this operation is placed in the register specified by the R1 field.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline \(P C \rightarrow M A R ; P C+1 \rightarrow P C\) & X & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & x & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & & & & x & & & & & & & & & \\
\hline MEM + X2 \(\rightarrow\) MAR & & & & & x & & & & & & & & \\
\hline MEM \(\rightarrow\) Q & & & & & & X & & & & & & & \\
\hline \((\mathrm{X} 1) \rightarrow\) MAR & & & & & & & x & & & & & & \\
\hline \(\mathrm{MEM}+\mathrm{Q} \rightarrow \mathrm{Q}\) & & & & & & & & x & & & & & \\
\hline Q \(\rightarrow\) MEM & & & & & & & & & X & & & & \\
\hline
\end{tabular}

Figure 10. Memory to Memory Indexed Instruction Microcode.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline \(\mathrm{PC} \rightarrow\) MAR; PC \(+1 \rightarrow \mathrm{PC}\) & X & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & x & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline \(\mathrm{PC} \rightarrow\) MAR; PC \(+1 \rightarrow \mathrm{PC}\) & & & & x & & & & & & & & & \\
\hline \(\mathrm{MEM} \mathrm{+} \mathrm{R1} \rightarrow\) R1 & & & & & x & & & & & & & & \\
\hline
\end{tabular}

Figure 11. Register Immediate Instruction Microcode.

\section*{Memory Immediate}

The memory immediate instruction is used to add immediate data contained in the second word of the instruction to a location in memory. The address of the memory location is contained in the register specified in the X1 field of the instruction.
The memory immediate instruction is similar to the register immediate instruction except that an indirect addressing scheme is used. Again, the first three microinstructions fetch and decode the memory immediate instruction. The fourth and fifth microinstructions are used to fetch the data value which is the second word of this memory immediate instruction. Microinstruction number 4 sets up the memory address register and microinstruction number 5 brings the data into the Am2903 Q register. Microinstruction number 6 places the contents of the register specified by the X1 field into the memory address register so that the contents of this memory location can be brought into the Am2903 during microinstruction number 7. During microinstruction 7, the contents of the Q register are added to this value and the sum is returned to the Q register. At microinstruction 8, the Q register is written back to the memory location specified by the contents of the register pointed to by the X1 field. This address was already in the memory address register because it was used to fetch the operand originally at this location. The microcode for this instruction is detailed in Figure 12.

\section*{Improving Program Control Unit Performance}

If we examine the microcode as shown for the various instruction types depicted in Figure 1, we find that all of these microroutines have several things in common. First, the very first microinstruction simply sets up the memory address register with the current value of the program counter. In addition, this microinstruction increments the current program counter value. The second microinstruction simply fetches the contents of memory and places it in the instruction register. The third microinstruction is used to decode the instruction, select the appropriate micromemory word and set it into the pipeline register. Finally, the fourth microinstruction begins actual execution of the desired instruction. In all of these examples, using the block diagram of Figure 4, we find that a bottleneck occurs in the ALU because program counter operations and operand operations are intermixed there. We can improve the performance of the program control unit by making the program counter an external register and using a multiplexer to select either the program counter or the Am2903 output to load the memory address register. This is depicted in block diagram form in Figure 13.

The first effect of implementing a program control unit with this architecture is that one of the instruction types is shortened by one microcycle. This is the register-to-memory-immediate instruction. The new microcode flowchart for this instruction is shown in Figure 14. In this case, we see that a PC value can be placed into the memory address register and the PC incremented while the ALU within the Am2903 is being used to perform either a pass or an addition. Thus, this architectural change has made some improvement in the through-put of our machine.

Figure 12. Memory Immediate Instruction Microcode.

Figure 13. Memory Addressing Scheme with PC Outside of the ALU.

The most important improvement in through-put realized by the architecture shown in Figure 13 can be seen by evaluating the timing for sequential instructions. That is, what happens when several instructions are executed sequentially?

To keep the examples simple, let's visualize the microcycle timing chart for three register-to-register instructions executed sequentially. The most obvious timing chart would simply be to take the register-to-register microinstruction flow shown in Figure 3 and concatenate three copies of it. If we do this, we will see that the final execution step, in which R1 + R2 is returned to R1, utilizes the ALU while the program counter is idle. However, the next microcycle requires placing the program counter into the memory address register. The architecture of Figure 13 allows us to do these two micro-operations during the same microinstruction. If we assume three register-to-register instructions in sequence in memory (let's call them instructions A, B and C), the timing chart of Figure 15 results. What we see in this diagram is that the execution of instruction A can be overlapped with setting up the program counter in the memory address register for fetching instruction B. Thus, instead of instruction B starting at time T4, it may be started at time T3. This can be accomplished by simply having the execution microinstruction also load the MAR with the current PC value and increment the PC. From this discussion, we can see that instead of twelve microcycle times being required to execute three register-to-register instructions, only nine microcycle times will be required. We should caution that if the reader counts the microcycles in Figure 15, he will arrive at 10 microcycle times being required. This leads us to our next point.

If we examine all of the instructions described earlier in this application note, we will find that in all cases, the execution of the instruction (the last microcycle) can be overlapped with the first microinstruction of the fetch routine. Thus, the architectural change shown in Figure 13 not only allows three of the instructions to execute faster during their total microcode, but in fact all instructions can be executed at least one microcycle faster because of the ability to overlap the first microcycle of the fetch routine with the execution of the preceding instruction. This architectural change therefore saves one or two microcycles depending on the instruction.

In Chapter 9 we will show how further overlapping at the machine instruction level can allow us to execute a register-to-register instruction during every microcycle, effectively; rather than every three microcycles as shown in Figure 15. At the present time, let us simply leave the discussion at this point.

\section*{Subroutining}

An implementation technique that is common to the different addressing modes is the subroutine (also called stack and link). The subroutine allows sections of the main program to access a common subsection of the program. The general effect is to allow fewer lines of machine code to be written for any given program that employs subroutines.

Figure 16 shows an example of a subroutine within the program. The main program executes instructions until it gets to instruction 52, which is a call to subroutine. This instruction puts address 80 in the program counter while saving address 53 in a separate register called the Return Register. The program continues on from address 80 to address 85, where it encounters the return from subroutine command. The return-from-subroutine command takes the value out of the return register and puts it into the program counter. At that point the program counter continues down in the main body of the program until it reaches address 57. At this time, another call to subroutine may occur, forcing the program counter back to the value of 80 while putting the value 58 into the return register. The subroutine is executed and at address 85 the return command is again encountered. At this point, the subroutine will return control of the program to address 58 of the instruction stream and the main program continues to sequence through its instructions.

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline PC \(\rightarrow\) MAR; PC + \(1 \rightarrow\) PC & x & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & X & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline \((\mathrm{X} 2) \rightarrow\) MAR & & & & x & & & & & & & & & \\
\hline \(\mathrm{MEM}+\mathrm{R} 1 \rightarrow \mathrm{R} 1\) & & & & & x & & & & & & & & \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & & & & & x & & & & & & & & \\
\hline \(\mathrm{MEM} \mathrm{+} \mathrm{R1} \rightarrow\) R1 & & & & & & X & & & & & & & \\
\hline
\end{tabular}

Figure 14. Register to Memory Immediate Instruction Improved Microcode.

Figure 15. Register to Register Instruction with Overlap of Execute and PC Control.

Figure 16. Subroutine Execution.

In many systems, one subroutine may very well call another subroutine which may in turn call yet another subroutine and so on. To accomplish this the return address linkage must now be "nested" using a last-in first-out (LIFO) stacking arrangement. Figure 17 illustrates subroutine nesting. In this example, the main program contains a subroutine call or jump-to-subroutine command (JSB) at address 53. Program control is passed to the first subroutine at address 88, while the return address 54 is placed in the stack. At address 89 of Subroutine 1 another JSB command is encountered, passing program control to Subroutine 2 at address 502. The return address value 90 is pushed onto the top of the stack. This continues in like fashion for calls to Subroutine 3 and 4 with return addresses 506 and 723 being placed on the stack. At address 785 of Subroutine 4, a Return from Subroutine (RTS) command is decoded, causing the return address 723 on the top of the stack to be placed in the program counter, and the contents of the stack are "popped" up one place.

At address 725 another RTS command is found, causing the top of the stack, address 506, to be placed in the program counter, and the stack is popped. The identical action occurs for the RTS commands at addresses 507 and 92 such that control is eventually returned to the main program and the stack is empty.
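The last-in first-out ordering of the nested example can be reproduced with an ordinary Python list used as the stack. Only the return addresses named in the text are modeled; the event list and function name are illustrative assumptions, and the starting addresses of Subroutines 3 and 4 are not needed.

```python
def run_linkage(events):
    """Replay JSB/RTS events; return the resume addresses and final stack."""
    stack = []
    resumed = []
    for kind, return_address in events:
        if kind == "JSB":
            stack.append(return_address)   # push the return address
        else:                              # "RTS"
            resumed.append(stack.pop())    # pop into the program counter
    return resumed, stack

# Four nested calls followed by four returns, as in the Figure 17 example.
events = [("JSB", 54), ("JSB", 90), ("JSB", 506), ("JSB", 723),
          ("RTS", None), ("RTS", None), ("RTS", None), ("RTS", None)]
resumed, stack = run_linkage(events)
print(resumed)  # prints [723, 506, 90, 54]
print(stack)    # prints []
```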

The LIFO or subroutine stack in the program control hardware is shown in Figure 18. When the call to subroutine command is decoded by the computer control unit, the pipeline register outputs cause the stack control to accept the output of the program counter register and place it at the top of the stack. Next the subroutine address is brought in from the memory, passed through the multiplexer and placed in the MAR. The subroutine address is also brought through the incrementer multiplexer, through the incrementer, and placed in the program counter register to be used as a possible next source of address. The subroutine return address is recovered from the stack when the pipeline register instructs the stack control logic to place the return address at the multiplexer. The return address is passed through the multiplexer and clocked into the MAR. The return address is also clocked into the PC register via the incrementer multiplexer and the incrementer, for use as the next sequential address.

Figure 19 shows the jump to subroutine instruction and Figure 20 shows the microcycles that are used in a typical call to subroutine command using the program control hardware shown in Figure 18. At T0 the program counter is placed into the MAR and updated. Time T1 finds the MAR accessing the subroutine call instruction, with the instruction being placed into the instruction register. At T2 the opcode is decoded by the CCU, and the first instruction microcode bits are clocked into the pipeline register. At time T3, the PC is placed in the MAR. At T4 the starting address of the subroutine is fetched and placed into the MAR; the stack pointer is incremented; the current program counter is placed on the LIFO stack; and the starting address of the subroutine plus one is placed into the program counter.
Figure 21 details the microcycle timing for a return-from-subroutine execution. At time zero the current program counter is placed into the MAR, then incremented by one. During time one the contents of the MAR are used to fetch the return from subroutine command, which is then clocked into the instruction register at the end of the microcycle. At time 2 the contents of the instruction register are decoded in the CCU with the control bits being clocked into the pipeline register. During time 3 the return address on the top of the LIFO stack is placed into the MAR, while that value plus one is stored into the program counter. The stack pointer is then decremented.

Figure 17. Nested Subroutine Example.

Figure 18. Subroutine Stack Architecture.

Figure 19. Jump to Subroutine (Branch and Stack) Instruction.

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & X & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & x & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & & & & x & & & & & & & & & \\
\hline \[
\left\{\begin{array}{l}
\text { MEM } \rightarrow \text { MAR; PC } \rightarrow \text { STACK } \\
\text { MEM }+1 \rightarrow \mathrm{PC} ; \mathrm{SP}+1 \rightarrow \mathrm{SP}
\end{array}\right\}
\] & & & & & X & & & & & & & & \\
\hline
\end{tabular}

Figure 20. Branch and Stack Instruction Microcode.

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{Microinstruction Operation} & \multicolumn{13}{|c|}{Microcycle Time} \\
\hline & T0 & T1 & T2 & T3 & T4 & T5 & T6 & T7 & T8 & T9 & T10 & T11 & T12 \\
\hline \(\mathrm{PC} \rightarrow \mathrm{MAR} ; \mathrm{PC}+1 \rightarrow \mathrm{PC}\) & x & & & & & & & & & & & & \\
\hline Fetch Inst to IR & & x & & & & & & & & & & & \\
\hline Decode & & & x & & & & & & & & & & \\
\hline \[
\left\{\begin{array}{l}
\text { Stack } \rightarrow \text { MAR; Stack }+1 \rightarrow \mathrm{PC} \\
\mathrm{SP}-1 \rightarrow \mathrm{SP}
\end{array}\right\}
\] & & & & X & & & & & & & & & \\
\hline
\end{tabular}

Figure 21. Return from Subroutine Instruction Microcode.
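The register transfers of the call and return sequences can be modeled behaviorally as shown below. This is a sketch only: the class and method names are hypothetical, a Python list stands in for the LIFO file and its stack pointer, and the addresses in the usage lines are invented for the example.

```python
MASK = 0xFFFF  # assumed 16-bit address path

class ProgramControl:
    """Behavioral model of the PC, MAR and LIFO stack (names are assumed)."""

    def __init__(self, pc=0):
        self.pc = pc
        self.mar = 0
        self.stack = []          # list length plays the role of SP

    def fetch_address(self):
        """PC -> MAR; PC + 1 -> PC."""
        self.mar = self.pc
        self.pc = (self.pc + 1) & MASK

    def jsb(self, mem):
        """Execution steps T3 and T4 of the call sequence."""
        self.fetch_address()               # T3: address the second word
        target = mem[self.mar]             # T4: MEM -> MAR
        self.stack.append(self.pc)         #     PC -> STACK; SP + 1 -> SP
        self.mar = target
        self.pc = (target + 1) & MASK      #     MEM + 1 -> PC

    def rts(self):
        """Execution step T3 of the return sequence."""
        return_address = self.stack.pop()  # SP - 1 -> SP
        self.mar = return_address          # Stack -> MAR
        self.pc = (return_address + 1) & MASK  # Stack + 1 -> PC

pcu = ProgramControl(pc=100)     # JSB opcode at 100, target word at 101
mem = {101: 0x0200}
pcu.fetch_address()              # T0 of the fetch routine
pcu.jsb(mem)
print(hex(pcu.mar), hex(pcu.pc), pcu.stack)  # prints 0x200 0x201 [102]
pcu.rts()
print(pcu.mar, pcu.pc, pcu.stack)            # prints 102 103 []
```

In both sequences the MAR receives the new address while the PC receives that address plus one, so the next fetch can begin without a separate increment cycle.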
The basic program control hardware thus developed, with some embellishments added, is contained within the Am2930 program control unit as shown in Figure 22. The Am2930 is a 4-bit slice of the program control unit. It therefore easily allows the address bus to be virtually independent of the data bus in terms of width. The Am2930 has a general purpose auxiliary register which has two sources and two destinations. One source is the D inputs, which flow through the R multiplexer and hence into the auxiliary register; the other source is the output of the full adder, which is the second input to the R multiplexer. The two outputs of the auxiliary register go to the A and B multiplexers, which in turn source the A and B inputs to the full adder. The register enable pin (\(\overline{\mathrm{RE}}\)) allows the auxiliary register to be unconditionally loaded from the D inputs of the Am2930. The A multiplexer selects as its sources a logical zero, the output of the auxiliary register, or the D inputs. The B multiplexer accepts the outputs of the auxiliary register, a logical zero, the output of the subroutine stack file, or the output of the program counter register as its sources.
In the Am2930 design the LIFO stack is 17 words deep, allowing up to seventeen levels of subroutine. The LIFO stack is controlled by the stack pointer logic, which gives a FULL indication when the stack is full and an EMPTY indication when the stack has emptied. The input to the LIFO stack is fed through a stack multiplexer whose inputs may be the D inputs or the output of the program counter. Thus, depending upon the application, the stack may be used as either a subroutine stack or a general purpose LIFO stack which resides on the D bus. The incrementer and the full adder are controlled by the Ci and Cn carry-in bits respectively. Figure 23 details the ripple carry connections between Am2930s in a 16-bit array. The Ci input of the least significant slice (LSS) is controlled from the pipeline register.

The Ci signal is internally propagated through the incrementer of each device using carry look-ahead logic. The microprogram memory, using the Ci input, may now cause the Am2930s to repeatedly access the same main memory instruction if so desired. The full adder has its Cn input tied to ground for the LSS device of the Am2930 array. The Cn signal is propagated in parallel through the Am2930s.
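The effect of the Ci input on a 16-bit program counter built from four 4-bit slices can be illustrated with a simple carry chain. The sketch below models only the arithmetic result of passing a carry from slice to slice; it does not model the look-ahead logic inside each device, and the function names are assumptions.

```python
def increment_slice(value4, ci):
    """One 4-bit incrementer slice: returns (4-bit result, carry out)."""
    total = value4 + ci
    return total & 0xF, total >> 4

def increment_16(pc, ci):
    """Chain four slices, least significant slice first."""
    result = 0
    carry = ci
    for i in range(4):
        nibble = (pc >> (4 * i)) & 0xF
        out, carry = increment_slice(nibble, carry)
        result |= out << (4 * i)
    return result, carry

print(increment_16(0x0FFF, 1))  # prints (4096, 0)
print(increment_16(0x1234, 0))  # prints (4660, 0)
```

With Ci held at zero the address is unchanged, which corresponds to the microprogram causing the same main memory location to be accessed again.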

For a faster propagation of the Cn signal the interconnection shown in Figure 24 should be employed. The generate and propagate pins (\(\bar{G}, \bar{P}\)) of each Am2930 are connected to the corresponding inputs of the Am2902A carry look-ahead generator. The look-ahead carries (\(C_{n+x}, C_{n+y}, C_{n+z}\)) are connected to the Cn inputs of their respective devices. The output of the Am2930 is three-state and is controlled by the output enable pin (\(\overline{\mathrm{OE}}\)).

Figure 22. Am2930 Block Diagram.

Figure 23. Ripple Expansion Scheme for Am2930's.

Figure 24. Parallel Look-Ahead Expansion Scheme for Am2930's.

Other features of the Am2930 include an Instruction Enable pin (IEN). This pin allows the Am2930 array to be taken off of the microprogram data bus, thus allowing the bits that were formerly committed to the Am2930 to be used in conjunction with other devices. The Am2930 also includes a condition code input (CC). The condition code input permits the conditional testing of a single bit. This makes possible such techniques as conditional branching at the macroprogram level. For a more detailed explanation of the Am2930, its instructions and its applications, see the Am2930 Data Sheet.

Figure 25 shows a typical system interconnection using the Am2930. The instruction lines, Ci, \(\overline{\mathrm{RE}}\) and the \(\overline{\mathrm{OE}}\) control pins are connected directly to the outputs of the combination microprogram memory and pipeline registers contained in the Am24775 devices. The condition code inputs are obtained from the Am2904 status and control device, thus allowing conditional jumps on status. Status from the Am2904 is also fed into the test MUX for use by the Am2910 as its condition code input. Likewise the FULL and EMPTY indications from the Am2930 are fed into the test MUX for use by the Am2910 to ascertain the current status of the stack. If the stack is full and the user wishes to push more data onto the stack, then the current data must be emptied from the stack under microprogram control, using additional hardware.
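The FULL and EMPTY indications can be pictured with a bounded stack model. This is a behavioral sketch only: the 17-word depth comes from the text, while the class name, the Python exceptions, and the decision to refuse a push when full are assumptions made for the example rather than documented device behavior.

```python
class SubroutineStack:
    """Bounded LIFO with FULL and EMPTY flags (behavioral sketch)."""

    DEPTH = 17  # stack depth stated for the Am2930

    def __init__(self):
        self._words = []

    @property
    def full(self):
        return len(self._words) == self.DEPTH

    @property
    def empty(self):
        return len(self._words) == 0

    def push(self, word):
        # Assumed policy: microcode tests FULL first, so a push on a
        # full stack is treated as an error in this model.
        if self.full:
            raise OverflowError("stack is FULL")
        self._words.append(word)

    def pop(self):
        if self.empty:
            raise IndexError("stack is EMPTY")
        return self._words.pop()

stack = SubroutineStack()
for return_address in range(17):
    stack.push(return_address)
print(stack.full, stack.empty)   # prints True False
print(stack.pop())               # prints 16
```

A microprogram would test the FULL flag through the test MUX before pushing, which is the role the `full` property plays here.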
Another feature of the Am2930 Program Control Unit as shown in Figure 22 is the full adder between the program counter and \(Y\) outputs. This allows for the execution of PC relative addressing types of instructions. While this can be an effective addressing scheme, it will not be covered in detail in this application note.
While the Am2930 offers advantages in small high performance systems requiring a small LIFO stack, it is not intended to be the solution for all program counter requirements.


Figure 25. System Interconnection Using the Am2930.

\section*{Using the Am2901A as a Program Control Unit}

Up to this point, the discussion has concerned a general architecture which includes 16 general registers in the ALU section and a LIFO stack in the program control section, as shown in Figure 18. An alternative architecture, and the one used by most general purpose machines, is to place the LIFO stack in main memory. The stack pointer for the main memory LIFO stack can be contained in the program control unit to be described in this section. If the program control unit is built using Am2901A's, it has the capability of using its internal registers as the program counter, stack pointer, upper stack bound pointer, lower stack bound pointer, and internal temporary registers. This provides considerable flexibility in the architecture and also allows a much greater repertoire of instructions to be executed. In particular, several stack instructions can be included in the instruction set, most of which will use the register-to-indexed-memory instruction format shown in Figure 1.

Another advantage of the architecture shown in Figure 26 is speed. The Am2901A's slightly surpass the Am2903 in speed.

Thus, a 16-bit Am2901A program control unit architecture can be implemented and it will perform well within the microcycle times budgeted for the system.
Looking at Figure 26, which shows the Am2901A used as a program control unit and the Am2903 used for the general register stack/ALU section, we see a three-state buffer on the Y outputs of the Am2903 connected to the data bus as well as a three-state buffer at the input of the Am2903's from the data bus. This provides isolation and buffering for the bus as well as allowing appropriate disconnects so that certain microcycles can be combined to improve the overall performance of the machine. In addition, a transfer register is used between the Am2903's and Am2901A's to allow a microcycle to be terminated if an ALU operation is taking place within the Am2903's. This provides higher performance operation for the machine. Finally, a bi-directional buffer (such as the Am8304B) is used between the Am2901A Y-outputs and the Am2903 Y-outputs. This gives the ability to push the program counter contained in the Am2901A on the stack for interrupt handling. It also allows values coming from the Am2903 to be placed in the memory address register.


Figure 26. PCU Architecture Using the Am2901A.

\section*{Summary}

The thrust of this discussion has been aimed at defining and implementing hardware to accomplish addressing of main memory. We have shown that a speed advantage is realized if the program counter is kept separate from the main general purpose register stack/ALU hardware. The most general purpose program control unit is the Am2901A. It offers several advantages in terms of program control, stack pointer control, and stack pointer boundary conditions. The Am2930 can be used in program control units occupying less space and including a built-in stack, but has some speed and performance limitations. Both devices can be used to implement the basic addressing modes associated with the instructions described in this application note.

Another purpose of this application note is to set the stage for Chapter 9, where we will overlap machine instructions so that register-to-register instructions can be executed in a single 200 ns microcycle and memory reference instructions can be executed in an effective time of 600 ns (3 microcycles). We will also expand on the use of the Am2901A as a Program Control Unit.


Chapter VI Interrupt

\section*{INTRODUCTION}

A digital computer can be viewed as a finite state machine that moves from state to state via the execution of a program. Interrupt mechanisms provide a well-defined way of altering the flow of states in response to outside asynchronous events (interrupts). There is a wide variety of ways of handling interrupts depending upon the system requirements. The choice of a particular interrupt mechanism can have a large impact on the through-put and flexibility of a system. Therefore, time should be spent carefully defining the interrupt mechanism of a new computer design.

\section*{POLLING VS. NON-POLLING}

One of the simplest ways to handle asynchronous events is the polling method. With each possible event there is an associated flag that can be accessed by the program. The processor then interrogates each flag in order to determine if service is required. This method trades simple hardware for additional software: the polling routine not only uses memory space but also spends time polling the flags when no service is required. The polling method has low system through-put, high real time overhead and slow response time.
In non-polling systems, the asynchronous event generates an interrupt request signal which is passed to the processor. The processor in turn suspends the execution of the current process and starts execution of an interrupt service routine. When the interrupt routine is completed, the processor resumes execution of the suspended process. This system is called an interrupt driven system because it executes interrupt service routines that are initiated by interrupt requests.
Although the non-polling method requires more hardware, it has many advantages. Because the execution of interrupt service routines is transparent to the current process, less thought and time is required of the programmer of the current process. The response time is faster because no time is spent interrogating the other non-active interrupts, which in turn increases the system throughput. There is less real time overhead and less memory space required because only the service routine exists in memory and no polling routine is required.

\section*{MACHINE VS. MICROPROGRAM LEVEL INTERRUPTS}

There are two levels on which interrupts may be handled. The first and most common is the machine level interrupt. In this method possible interrupt requests are checked for during the machine instruction fetch cycle. This guarantees that an interrupt can only happen when a machine instruction is complete and before a new instruction starts.
The second level of handling interrupts is on the microprogram level. In the machine level interrupt system, the microprogram has complete control of when to recognize an interrupt, but in the microprogram level system the microprogram can be interrupted at any time. This method has a shorter response time for servicing interrupt requests but requires that restrictions be placed on the microprogram and the interrupt mechanism. These restrictions come from setting aside space on the finite microprogram stack in the sequencer for possible interrupt requests. Special consideration may also have to be given to loop counters.

\section*{TYPES OF INTERRUPTS}

There are basically four types of interrupts based on the relationship of the source of the interrupt to the processor: within the processor, within the system, between software, and between processors. A multiprocessor has to be able to handle all four types of interrupts. Therefore, the interrupt structure that is picked will have these design tradeoffs to consider.
A. Intraprocessor interrupts are those asynchronous events that happen within the processor during the execution of a machine instruction. This group includes such things as zero divide, overflow, accessing restricted memory, execution of a privileged instruction, machine failure, etc.
B. Intrasystem interrupts are interrupts created by system peripherals such as disks, CRT's and printers that require service.
C. Executive interrupts are those interrupts caused by the current program that is executing. This provides a way for the current program to make a request of the executive (operating system) program. These requests might include such things as starting new tasks, allocating hardware resources (disks, line printers), communication with other tasks, etc. A good example would be the supervisor call (SVC) in the IBM 360/370 computers.
D. Interprocessor interrupts include those interrupts between two intelligent processors. For example, this class of interrupts would be used to initiate data and status transfer between a local processor and a processor at a remote site.

\section*{SEQUENCE OF EVENTS FOR INTERRUPT HANDLING}

When an interrupt occurs there is a sequence of six events that happen. These events, which can be implemented in microcode or machine code, integrated together with the hardware comprise the interrupt mechanism. The sequence of events describes the steps that occur to provide for a smooth transfer from the current process environment to an interrupt servicing environment and back again. The sequence ensures that the processor status will be the same immediately after an interrupt is serviced as immediately before the interrupt occurred. The events listed in the next few paragraphs may differ in order or overlap depending upon the machine design and application.

\section*{Interrupt Recognition}

This step consists of the recognition of an interrupt request by the processor via an interrupt request line. In this step the processor can determine which device made the request. The method that is used to determine which device to service is directly related to the interrupt structure of the machine. The different types of interrupt structures will be discussed in more detail below.

\section*{Save Status}

The goal of this step is to make the interrupt sequence transparent to the interrupted process. Therefore, the processor saves a minimum set of flags and registers that may be changed by the interrupt service routine, so that after the service routine is finished they may be restored.
The minimum set of flags and registers would be those which will be destroyed in the transfer of control from the current process to the interrupt service routine. It is then the responsibility of the service routine to save any other registers which it might change. The minimum set of flags and registers might include the Program Counter, Overflow Flag, Sign Flag, Interrupt Mask, etc. The minimum set also includes any register or flag that needs to be saved that the interrupt service routine cannot access.

\section*{Interrupt Masking}

This step can overlap some of the other steps. For the first few steps of the sequence all interrupts are masked out so that no interrupt may occur before the processor status is saved. The mask is then usually set to accept interrupts of higher priority.

Some machines allow the service routine to selectively enable or disable interrupts also. There may be different variations to this step depending upon the application.

\section*{Interrupt Acknowledge}

At some point the processor must acknowledge the interrupt being serviced so that the interrupting device knows that it is free to continue its task. The processor can acknowledge in several different ways. One way is to have a line devoted to interrupt acknowledge. Another method relies upon the interrupting device recognizing an acknowledge when the cause of the interrupt is serviced.
Some processor designs also use this signal as a request for the interrupting device to send an I.D. down the data bus. This aspect will be discussed in more detail below.

\section*{Interrupt Service Routine}

At this point the processor can call the interrupt service routine. The address of the routine can be obtained several ways depending upon the system architecture. The most trivial is when there is only one routine which polls each device to find out which one interrupted. Some designs require that the interrupting device put an address on the data bus so that the processor can store it in its program counter and branch to it. Other designs use an I.D. number derived from the priority of the interrupt and put it through a mapping PROM or look-up table in memory in order to obtain the address of the service routine.
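The look-up table approach can be sketched as follows. This is only an illustrative Python model, not the hardware described in the text; the table contents and the name `service_address` are hypothetical.

```python
# Hypothetical look-up table: interrupt I.D. number -> service routine address.
# In hardware this role could be played by a mapping PROM; the addresses
# below are invented for illustration.
VECTOR_TABLE = {
    0: 0x0100,
    1: 0x0140,
    2: 0x0180,
    3: 0x01C0,
}

def service_address(interrupt_id: int) -> int:
    """Return the starting address of the service routine for an I.D."""
    return VECTOR_TABLE[interrupt_id]

# The processor would load this value into its program counter and branch.
program_counter = service_address(2)
```

Changing a table entry relocates a service routine without touching the interrupting device, which is the main attraction of this scheme over having the device supply the address itself.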

\section*{Restore and Return}

After the interrupt service routine has returned via some variation of an Interrupt Return instruction, the processor should restore all the registers and flags that were saved previous to the interrupt routine. If this is done correctly, the processor should have the same status as before the interrupt was recognized.
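The six events described above can be walked through in a small Python sketch. It assumes a processor state held in a dictionary with hypothetical fields `pc`, `flags` and `threshold`; as the text notes, a real machine may reorder or overlap these steps.

```python
def handle_interrupt(cpu, level, routines):
    """Walk the six interrupt events once and return a log of the steps.

    cpu      -- dict holding the (hypothetical) minimum status set
    level    -- priority level of the recognized request
    routines -- dict mapping level -> service routine address
    """
    log = ["recognize"]              # 1. request seen and its level identified
    saved = dict(cpu)                # 2. save the minimum set of status
    log.append("save status")
    cpu["threshold"] = level + 1     # 3. mask out equal and lower levels
    log.append("mask")
    log.append("acknowledge")        # 4. tell the device it is being serviced
    cpu["pc"] = routines[level]      # 5. enter the interrupt service routine
    log.append("service")
    cpu.update(saved)                # 6. restore status and return
    log.append("restore")
    return log

cpu = {"pc": 0x0020, "flags": 0b0101, "threshold": 0}
before = dict(cpu)
steps = handle_interrupt(cpu, 3, {3: 0x0400})
```

After the call, `cpu` holds the same values as `before`, which is the transparency property the sequence is meant to guarantee.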

\section*{INTERRUPT STRUCTURES}

There are several interrupt structures that can be implemented. As usual there is a trade-off between hardware and software (or firmware). Listed below are some of the more common structures used. The particular structures vary in the way that the processor determines which device made the interrupt request.

\section*{Single Request, Multiple Poll}

In this structure there is one request line which is shared among all interrupting devices. When the processor recognizes an interrupt request it polls all the devices to find the interrupting device (see Figure 1). Priority is introduced via the order in which the devices are polled. This scheme also allows dynamic reallocation of priority.

\section*{Single Request, Daisy Chain Acknowledge}

In this structure there is one request line which is shared. When the processor receives an interrupt it sends out a signal acknowledging the interrupt. The acknowledge signal is passed from I/O device to I/O device until the interrupting device receives the signal. At this point the interrupting device identifies itself by putting an I.D. number on the data bus (see Figure 2). This structure requires less software, but has a static priority associated with each interrupting device. There is also a time delay associated with the daisy chain acknowledge structure because the INTA signal has to pass through several gate delays in each device.
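The way the acknowledge travels down the chain can be modelled in a few lines of Python. This is a behavioral sketch under the assumption that position 0 is the device nearest the processor; the function name is hypothetical.

```python
def daisy_chain_acknowledge(requesting):
    """Return the chain position of the device that accepts the acknowledge.

    requesting -- list of booleans, index 0 being the device closest to the
                  processor. Returns None if no device is requesting.
    """
    for position, wants_service in enumerate(requesting):
        if wants_service:
            # This device keeps the acknowledge and puts its I.D. on the bus.
            return position
        # A device that is not requesting passes the acknowledge along.
    return None
```

Because the first requesting device always wins, priority is fixed by wiring order, and the number of devices passed through before a match corresponds to the accumulated gate delay mentioned above.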


Figure 1. Single Request, Multiple Poll.


Figure 2. Single Request, Daisy Chain Acknowledge.

\section*{Multiple Request}

This structure features one line per priority level (see Figure 3). The multiple line structure gives the fastest response time since the interrupting device can be identified immediately. It also results in simpler interfaces in the peripheral units, generally just a single interrupt request flip-flop. This structure allows for the possibility of having a mask bit associated with each priority level (device). The trade-off of this circuit is a wider bus and a limit of one peripheral per priority level.

\section*{Multiple Request, Daisy Chain Acknowledge}

This structure combines the Single Request/Daisy Chain Acknowledge with the Multiple Request structure (see Figure 4). For each interrupt request line there is an interrupt acknowledge line which is connected to a string of devices in a daisy chain fashion. When the appropriate device receives the interrupt acknowledge, it puts an I.D. number on the data bus.
The advantage of this structure is that many devices (more than the available interrupt levels) may be handled by breaking them up into short daisy chains. This gives a shorter access time than a pure daisy chain with less hardware than an interrupt request line per device. The disadvantage is that each device must be intelligent enough to pass on the acknowledge signal, which requires more hardware in each device.

\section*{PRIORITY SCHEMES}

When handling asynchronous requests one must assume that sometimes two or more requests can happen simultaneously. In order to handle this situation, there must be some sort of priority scheme implemented to pick which request is serviced first.
The two most common priority schemes are the static and the rotating structures. In the static structure, all the interrupt levels are ordered from the lowest priority to the highest priority. This can be fixed in software or hardware and is usually permanent.


Figure 3. Multiple Request.


Figure 4. Multiple Requests, Daisy Chain Acknowledge.

In the rotating structure the possible interrupt requests are arranged in a circle. There is a pointer which points to the lowest priority interrupt. The priority of each interrupt increases as one travels around the circle, with the highest priority interrupt being adjacent to the lowest priority interrupt. The lowest priority interrupt pointer is changed to point at the interrupt that was just serviced. This structure is advantageous when all interrupts have similar priority and service bandwidth requirements.
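A rotating scheme of this kind can be sketched in Python as below. The direction of travel around the circle is an assumption made for the example (priority is taken to increase with index, wrapping around), and `rotating_pick` is a hypothetical name.

```python
def rotating_pick(requests, lowest):
    """Choose the request to service under a rotating priority scheme.

    requests -- list of booleans, one per interrupt level on the circle
    lowest   -- index of the current lowest priority interrupt
    Returns (serviced_index, new_lowest); serviced_index is None when
    nothing is requesting, and the pointer is then left unchanged.
    """
    n = len(requests)
    # Start at the level adjacent to the pointer (the highest priority) and
    # work back around the circle; the pointer itself is examined last.
    for step in range(1, n + 1):
        candidate = (lowest - step) % n
        if requests[candidate]:
            # The level just serviced becomes the new lowest priority.
            return candidate, candidate
    return None, lowest
```

With requests pending on levels 0 and 2 and the pointer at 0, level 2 is serviced first; on the next pass level 2 is the lowest priority and level 0 is chosen, so no single source can monopolize the processor.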

\section*{NESTING}

Nesting allows only higher priority interrupts to interrupt a processing interrupt service routine. Nesting requires fencing off equal and lower level interrupts. Fencing requires that the interrupt structure hold the value of the highest priority interrupt being serviced. This can be implemented with a Status Register that holds the value as a binary encoded number or in other systems as an In-Service Register with a different bit associated with each interrupt.
Whether nesting is performed in microcode or not, all computers must have machine instructions to enable and disable interrupts and set and clear mask bits. With these instructions, interrupt handlers can be written to accomplish nesting of interrupts although less efficiently than when done with microcode and hardware. In low-end computers, the interrupt structure only prioritizes interrupts leaving nesting to the software interrupt handlers.


Figure 5.

\section*{A UNIVERSAL HARDWARE INTERRUPT STRUCTURE}

While designing a hardware interrupt structure, the designer should consider the specific functions that are to be achieved. This provides for system optimization in not only hardware but also software. In the following paragraphs is a step by step development of a general purpose interrupt structure as related to the design concepts involved.

\section*{Multiple Interrupt Request Handling}

Since interrupt requests are generated from a number of sources, the interrupt structure's ability to handle interrupt requests from several sources is important.
As implemented in Figure 5, the register configuration allows the hardware to handle interrupt requests from several sources. The first column of registers catches the asynchronous interrupt request. The second column of registers synchronizes the requests with respect to the system. After the interrupt is serviced, one of the CLR lines can be used to selectively clear the interrupt request.

\section*{Interrupt Request Prioritization}

Since the processor can service only one interrupt request at a time, the interrupt structure should have the ability to prioritize the requests and determine which has the highest priority. As shown in Figure 6, a priority encoder can be put on the output of the interrupt storage registers. The priority encoder will identify the highest interrupt request as a binary encoded number.

\section*{Dynamic Interrupt Request Masking}

The ability to selectively inhibit or "mask" individual interrupt requests under program control is desirable. For example at times it may be important to inhibit all interrupts except Power Failure. As shown in Figure 7 this is realized by ANDing the output of a mask register with the output of the interrupt storage registers. Therefore, the mask register can be used to select which interrupt requests will pass through to the rest of the hardware.
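Masking followed by priority encoding can be illustrated with a short Python model. It assumes eight request lines held as the bits of an integer, that bit 7 is the highest priority, and that a 1 in the mask lets a request through, as the ANDing in Figure 7 suggests (the Am2914 described later uses the opposite sense, where a 1 in the Mask Register inhibits). The function name is hypothetical.

```python
def pending_vector(requests: int, mask: int):
    """Return the highest priority unmasked request as a binary number.

    requests -- 8-bit value, bit n set when interrupt n is requesting
    mask     -- 8-bit value, bit n set when interrupt n may pass (assumed)
    Returns None when no unmasked request is present.
    """
    enabled = requests & mask & 0xFF      # AND of storage register and mask
    if enabled == 0:
        return None
    return enabled.bit_length() - 1       # priority encoder: highest set bit
```

For example, with requests on levels 1, 2 and 5 the encoder reports 5; if the mask passes only levels 1 and 2, the same requests encode to 2.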

\section*{Interrupt Request Clearing}

Flexibility in the method of clearing the interrupt allows different modes of interrupt system operation. Of particular value are the abilities to clear the interrupt currently being serviced or clear all interrupts.


Figure 6.


Figure 7.

This is implemented in Figure 8 by use of the Vector Hold register on the output of the Priority Encoder. This register holds the latest interrupt request that was recognized. Before another interrupt request is recognized, the output of the Vector Hold register can be fed through some clear control logic to selectively clear the old interrupt.

\section*{Interrupt Request Priority Threshold}

The ability to establish a priority threshold is valuable. In this type of operation, only those interrupt requests which have higher priority than a specified threshold priority are accepted. The threshold priority can be defined by microprogram or can be automatically established by hardware at the interrupt currently being serviced plus one. This automatic threshold prevents multiple interrupts from the same source.

This feature is implemented in Figure 8 using an incrementer and status register which is compared with the current request. Each time an interrupt is recognized, the status register is updated with one plus the current level.

\section*{Interrupt Service Routine "Nesting"}

This feature allows an interrupt service routine for a given priority request to be interrupted in turn by a higher priority interrupt request. This can be achieved by saving the status register before each interrupt is serviced and restoring it afterwards.
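Nesting by saving and restoring the status register can be sketched as follows. The class and method names are hypothetical, and the save area is modelled as a simple Python list standing in for whatever stack the machine provides.

```python
class NestingModel:
    """Threshold (status) register plus a save stack for nested interrupts."""

    def __init__(self):
        self.status = 0      # lowest level that will currently be accepted
        self.saved = []      # status values saved on entry to each routine

    def enter(self, level: int) -> bool:
        """Try to start servicing `level`; return False if it is fenced off."""
        if level < self.status:
            return False
        self.saved.append(self.status)   # save status before servicing
        self.status = level + 1          # accept only higher levels from now
        return True

    def leave(self) -> None:
        """Return from the current service routine, restoring the threshold."""
        self.status = self.saved.pop()
```

Servicing level 2 fences off levels 0 through 2; a level 5 request can still nest inside it, and when that routine returns the threshold drops back to 3, then to 0 once the level 2 routine itself returns.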

\section*{Microprogrammability and Hardware Modularity}

These last two design concepts bring us to the Vectored Priority Interrupt controller, the Am2914. The Am2914 is a modular interrupt system block which is beneficial in two ways. First, hardware modularity provides expansion capability. Additional modules may be added as the need to service additional requests arises. Secondly, hardware modularity provides a structural regularity which simplifies the system structure and also reduces the number of hardware part numbers.


Figure 8.


Figure 9. Am2914 Block Diagram.

The Am2914 is microprogrammable, which permits the construction of a general purpose or "universal" interrupt structure which can be microprogrammed to meet a specific application's requirement. The universality of the structure allows standardization of the hardware and amortization of the hardware development costs across a much broader user base. The end result is a flexible, low cost interrupt structure as shown in Figure 9.

\section*{PROGRAMMING THE Am2914}

The Am2914 is controlled by a four-bit microinstruction field \(\mathrm{I}_{0}-\mathrm{I}_{3}\). The microinstruction is executed if \(\overline{\mathrm{IE}}\) (Instruction Enable) is LOW and is ignored if \(\overline{\mathrm{IE}}\) is HIGH, allowing the four I bits to be shared with other functions. Sixteen different microinstructions can be executed. Figure 11 shows the microinstructions and the microinstruction codes.

In this microinstruction set, the Master Clear microinstruction is selected as binary zero so that during a power-up sequence, the microinstruction register in the microprogram control unit of the central processor can be cleared to all zeros. Thus, on the next clock cycle, the Am2914 will execute the Master Clear function.


Figure 10. Am2914 Logic Symbol.
\begin{tabular}{|l|c|}
\hline MICROINSTRUCTION DESCRIPTION & MICROINSTRUCTION CODE \(\mathrm{I}_{3} \mathrm{I}_{2} \mathrm{I}_{1} \mathrm{I}_{0}\) \\
\hline MASTER CLEAR & 0000 \\
CLEAR ALL INTERRUPTS & 0001 \\
CLEAR INTERRUPTS FROM M-BUS & 0010 \\
CLEAR INTERRUPTS FROM MASK REGISTER & 0011 \\
CLEAR INTERRUPT, LAST VECTOR READ & 0100 \\
READ VECTOR & 0101 \\
READ STATUS REGISTER & 0110 \\
READ MASK REGISTER & 0111 \\
SET MASK REGISTER & 1000 \\
LOAD STATUS REGISTER & 1001 \\
BIT CLEAR MASK REGISTER & 1010 \\
BIT SET MASK REGISTER & 1011 \\
CLEAR MASK REGISTER & 1100 \\
DISABLE INTERRUPT REQUEST & 1101 \\
LOAD MASK REGISTER & 1110 \\
ENABLE INTERRUPT REQUEST & 1111 \\
\hline
\end{tabular}

Figure 11. Am2914 Microinstruction Set.

The Master Clear microinstruction clears the Interrupt Latches and Register as well as the Mask Register and Status Register. The LGE flip-flop of the least significant group is set LOW because the Group Advance Receive input is tied LOW. All other Group Advance Receive inputs are tied to Group Advance Send outputs and these are forced HIGH during this instruction. This clear instruction also sets the Interrupt Request Enable flip-flop so that a fully interrupt driven system can be easily initiated from any interrupt.
The Clear All Interrupts microinstruction clears the Interrupt Latches and Register.
The Clear Interrupts from M-Bus microinstruction clears those Interrupt Latches and Register bits which have corresponding M-Bus bits set equal to one.
The Clear Interrupts from Mask Register microinstruction clears those Interrupt Latches and Register bits which have corresponding Mask Register bits set equal to one. The M-Bus is used by the Am2914 during the execution of this microinstruction and must be floating.
The Clear Interrupt, Last Vector Read microinstruction clears the Interrupt Latch and Register bit associated with the last vector read.

The Read Vector microinstruction is used to read the vector value of the highest priority request causing the interrupt. The vector outputs are three-state drivers that are enabled during this instruction. This microinstruction also automatically loads the value "vector plus one" into the Status Register. In addition, this instruction sets the Vector Clear Enable flip-flop and loads the current vector value into the Vector Hold Register so that this value can be used by the Clear Interrupt, Last Vector Read microinstruction. This allows the user to read the vector associated with the interrupt, and at some later time clear the Interrupt Latch and Register bit associated with the vector read.

During the Read Status Register microinstruction, the Status Register outputs are enabled onto the Status Bus \(\left(\mathrm{S}_{0}-\mathrm{S}_{2}\right)\). The Status Bus is a three-bit, bi-directional, three-state bus.
The Read Mask Register microinstruction enables the Mask Register outputs onto the bi-directional, three-state M-Bus.
The Set Mask Register microinstruction sets all the bits in the Mask Register to one. This results in all interrupts being inhibited.
The Load Status Register microinstruction loads S-Bus data into the Status Register and also loads the LGE flip-flop from the Group Enable input.
The Bit Clear Mask Register microinstruction may be used to selectively clear individual Mask Register bits. This microinstruction clears those Mask Register bits which have corresponding M-Bus bits equal to one. Mask Register bits with corresponding M-Bus bits equal to zero are not affected.
The Bit Set Mask Register microinstruction sets those Mask Register bits which have corresponding M-Bus bits equal to one. Other Mask Register bits are not affected.
The entire Mask Register is cleared by the Clear Mask Register microinstruction. This enables all interrupts subject to the Interrupt Enable flip-flop and the Status Register.
All Interrupt Requests may be disabled by execution of the Disable Interrupt Request microinstruction. This microinstruction resets an Interrupt Request Enable flip-flop on the chip.
The Load Mask Register microinstruction loads data from the three-state, bi-directional M-Bus into the Mask Register.
The Enable Interrupt Request microinstruction sets the Interrupt Enable flip-flop. Thus, Interrupt Requests are enabled subject to the contents of the Mask and Status Registers.
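The effect of the Mask Register microinstructions described above can be checked with a small behavioral model in Python. This is a sketch of the register contents only, not of the chip's timing or bus drivers; the class and method names are hypothetical, and a mask bit of one is taken to inhibit its interrupt, as stated for Set Mask Register.

```python
class MaskRegister:
    """Eight-bit mask register; a 1 bit inhibits the corresponding level."""

    def __init__(self):
        self.value = 0x00

    def set_all(self):              # Set Mask Register: inhibit everything
        self.value = 0xFF

    def clear_all(self):            # Clear Mask Register: enable everything
        self.value = 0x00

    def load(self, m_bus: int):     # Load Mask Register from the M-Bus
        self.value = m_bus & 0xFF

    def bit_set(self, m_bus: int):  # Bit Set: set bits where M-Bus is one
        self.value |= m_bus & 0xFF

    def bit_clear(self, m_bus: int):  # Bit Clear: clear bits where M-Bus is one
        self.value &= ~m_bus & 0xFF

    def read(self) -> int:          # Read Mask Register onto the M-Bus
        return self.value
```

Bit Set and Bit Clear let a service routine change one level's mask without first reading the register, which is why they are provided alongside the whole-register Load.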

\section*{Am2914 BLOCK DIAGRAM DESCRIPTION}

The Am2914 block diagram is shown in Figure 9. The Microinstruction Decode circuitry decodes the Interrupt Microinstructions and generates required control signals for the chip.
The Interrupt Register holds the Interrupt Inputs and is an eight-bit, edge-triggered register which is set on the rising edge of the CP Clock signal if the Interrupt Input is LOW.
The Interrupt latches are set/reset latches. When the Latch Bypass signal is LOW, the latches are enabled and act as negative pulse catchers on the inputs to the Interrupt Register. When the Latch Bypass signal is HIGH, the Interrupt latches are transparent.
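The difference between the two Latch Bypass settings can be illustrated with a simplified Python model. It assumes the input is sampled as a list of logic levels over time (0 meaning LOW, the requesting level) and ignores clearing of the latch; the function name is hypothetical.

```python
def register_sees_request(samples, clock_index, latch_bypass_high):
    """Decide whether the Interrupt Register is set at a rising clock edge.

    samples           -- input level at each time step (0 = LOW = requesting)
    clock_index       -- time step at which the CP rising edge occurs
    latch_bypass_high -- True when the latches are transparent
    """
    if latch_bypass_high:
        # Transparent latch: only the level present at the clock edge counts.
        return samples[clock_index] == 0
    # Pulse catcher: any LOW seen up to the clock edge has been latched.
    return 0 in samples[:clock_index + 1]
```

A short negative pulse that ends before the clock edge is caught only when Latch Bypass is LOW; with the latches transparent, the request must still be held LOW at the edge.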
The Mask Register holds the eight mask bits associated with the eight interrupt levels. The register may be loaded from or read to the M-Bus. Also, the entire register or individual mask bits may be set or cleared.
The Interrupt Detect circuitry detects the presence of any unmasked Interrupt Input. The eight-Input Priority Encoder determines the highest priority, non-masked Interrupt Input and forms a binary coded interrupt vector. Following a Vector Read, the three-bit Vector Hold Register holds the binary coded interrupt vector. This stored vector can be used later for clearing interrupts.
The three-bit Status Register holds the status bits and may be loaded from or read to the S-Bus. During a Vector Read, the Incrementer increments the interrupt vector by one, and the result is clocked into the Status Register. Thus, the Status Register points to a level one greater than the vector just read.

The three-bit Comparator compares the Interrupt Vector with the contents of the Status Register and indicates if the Interrupt Vector is greater than or equal to the contents of the Status Register.
The Lowest Group Enabled Flip-Flop is used when a number of Am2914's are cascaded. In a cascaded system, only one Lowest Group Enabled Flip-Flop is LOW at a time. It indicates the eight-interrupt group which contains the lowest priority interrupt level that will be accepted, and it is used to form the higher order status bits.

The Interrupt Request and Group Enable logic contain various gating to generate the Interrupt Request, Parallel Disable, Ripple Disable, and Group Advance Send signals.
The Status Overflow signal is used to disable all interrupts. It indicates the highest priority interrupt vector has been read and the Status Register has overflowed.
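The interplay of the Incrementer, the three-bit Status Register, the Comparator and Status Overflow can be sketched for a single chip as below. The Python function names are hypothetical, and the model assumes Status Overflow is wired to disable all interrupts, as the text describes.

```python
def read_vector(vector: int):
    """Model the Incrementer during a Vector Read in one three-bit group.

    Returns (new_status, status_overflow): the status is vector plus one,
    truncated to three bits, and overflow is set when vector seven is read.
    """
    total = vector + 1
    return total & 0b111, total > 0b111

def accepted(vector: int, status: int, overflow: bool) -> bool:
    """Comparator: accept a vector greater than or equal to the status,
    unless Status Overflow has disabled all interrupts."""
    return (not overflow) and vector >= status
```

Reading vector 4 leaves the status at 5, so levels 5 to 7 are still accepted. Reading vector 7 wraps the three-bit status to 0, which on its own would accept every level; the overflow flag is what keeps interrupts disabled until new status is loaded.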

The Clear Control logic generates the eight individual clear signals for the bits in the Interrupt Latches and Register. The Vector Clear Enable Flip-Flop indicates if the last vector read was from this chip. When it is set it enables the Clear Control Logic.

The CP clock signal is used to clock the Interrupt Register, Mask Register, Status Register, Vector Hold Register, and the Lowest Group Enabled, Vector Clear Enable and Status Overflow Flip-Flops, all on the clock LOW-to-HIGH transition.

\section*{CASCADING THE Am2914}

A number of input/output signals are provided for cascading the Am2914 Vectored Priority Interrupt Encoder. A definition of these I/O signals and their required connections follows:

Group Signal ( \(\overline{\mathrm{GS}}\) ) - This signal is the output of the Lowest Group Enabled flip-flop and during a Read Status microinstruction is used to generate the high order bits of the Status word.
Group Enable ( \(\overline{\mathrm{GE}}\) ) - This signal is one of the inputs to the Lowest Group Enable flip-flop and is used to load the flip-flop during the Load Status microinstruction.
Group Advance Send ( \(\overline{\mathrm{GAS}}\) ) - During a Read Vector microinstruction, this output signal is LOW when the highest priority vector (vector seven) of the group is being read. In a cascaded system Group Advance Send must be tied to the Group Advance Receive input of the next higher group in order to transfer status information.

Group Advance Receive ( \(\overline{\mathrm{GAR}}\) ) - During a Master Clear or Read Vector microinstruction, this input signal is used with other internal signals to load the Lowest Group Enabled flip-flop. The Group Advance Receive input of the lowest priority group must be tied to ground.
Status Overflow ( \(\overline{\mathrm{SV}}\) ) - This output signal becomes LOW after the highest priority vector (vector seven) of the group has been read and indicates the Status Register has overflowed. It stays LOW until a Master Clear or Load Status microinstruction is executed. The Status Overflow output of the highest priority group should be connected to the Interrupt Disable input of the same group and serves to disable all interrupts until new status is loaded or the system is master cleared. The Status Overflow outputs of lower priority groups should be left open (see Figure 14).
Interrupt Disable ( \(\overline{\mathrm{ID}}\) ) - When LOW, this input signal inhibits the Interrupt Request output from the chip and also generates a Ripple Disable output.

Ripple Disable ( \(\overline{\mathrm{RD}}\) ) - This output signal is used only in the Ripple Cascade Mode (see below). The Ripple Disable output is LOW when the Interrupt Disable input is LOW, the Lowest Group Enabled flip-flop is LOW, or an Interrupt Request is generated in the group. In the ripple cascade mode, the Ripple Disable output is tied to the Interrupt Disable input of the next lower priority group (see Figure 13).
Parallel Disable ( \(\overline{\mathrm{PD}}\) ) - This output is used only in the parallel cascade mode (see below). It is LOW when the Lowest Group Enabled flip-flop is LOW or an Interrupt Request is generated in the group. It is not affected by the Interrupt Disable input.
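The Ripple Disable and Parallel Disable conditions given above can be written as Boolean expressions. In this Python sketch every argument is True when the corresponding active-LOW signal is LOW (asserted); the function names are hypothetical, and the highest priority group is assumed to see no Interrupt Disable.

```python
def ripple_disable(interrupt_disable, lowest_group_enabled, interrupt_request):
    """Ripple Disable is asserted if ID is asserted, the LGE flip-flop is
    LOW, or an Interrupt Request is generated in the group."""
    return interrupt_disable or lowest_group_enabled or interrupt_request

def parallel_disable(lowest_group_enabled, interrupt_request):
    """Parallel Disable ignores the Interrupt Disable input."""
    return lowest_group_enabled or interrupt_request

def ripple_chain(groups):
    """Propagate Interrupt Disable through a ripple cascade.

    groups -- list of (lge, request) pairs ordered from the highest priority
              group to the lowest. Returns the Interrupt Disable value seen
              at the input of each group.
    """
    seen = []
    id_in = False                     # assumed: top group is not disabled
    for lge, request in groups:
        seen.append(id_in)
        # Ripple Disable of this group feeds Interrupt Disable of the next.
        id_in = ripple_disable(id_in, lge, request)
    return seen
```

In the ripple chain a request in one group disables every lower priority group, but the disable must pass through each intermediate chip in turn, which is the delay the parallel mode is designed to avoid.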

\section*{CASCADING CONFIGURATIONS}

A single Am2914 chip may be used to prioritize and encode up to eight interrupt inputs. Figure 12 shows how the above cascade lines should be connected in such a single chip system.


Figure 12. Cascade Lines Connection for Single Chip System.

The Group Advance Receive and Group Enable inputs should be connected to ground so that the Lowest Group Enabled flip-flop is forced LOW during a Master Clear or Load Status microinstruction. Status Overflow should be connected to Interrupt Disable in order to disable interrupts when vector seven is read. The Group Advance Send, Ripple Disable, Group Signal and Parallel Disable pins should be left open.

The Am2914 may be cascaded in either a Ripple Cascade Mode or a Parallel Cascade Mode. In the Ripple Cascade Mode, the Interrupt Disable signal, which disables lower priority interrupts, is allowed to ripple through lower priority groups. Figures 13, 16, and 17 show the cascade connections required for a ripple cascade 32 input interrupt system.
In the parallel cascade mode, a parallel lookahead scheme is employed using the high-speed Am2902 Lookahead Carry Generator. Figures 14, 15, and 17 show the cascade connections required for a parallel cascade 32-input interrupt system. For this application, the Am2902 is used as a lookahead interrupt disable generator. A Parallel Disable output from any group results in the disabling of all lower priority groups in parallel. Figure 15 shows the Am2902 logic diagram and equations.


Figure 13. Interrupt Disable Connections for Ripple Cascade Mode.

In Figures 16 and 17 the Am2913 Priority Interrupt Expander is shown forming the high order bits of the vector and status, respectively. The Am2913 is an eight-line to three-line priority encoder with three-state outputs which are enabled by the five output control signals G1, G2, \(\overline{\mathrm{G} 3}, \overline{\mathrm{G} 4}\), and \(\overline{\mathrm{G} 5}\). In Figure 16, the Am2913 is connected so that its outputs are enabled during a Read Vector instruction, and in Figure 17 the Am2913 is connected to microinstruction bits so that its outputs are enabled during a Read Status instruction. The Am2913 logic diagram and truth table are shown in Figure 18.
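How the high order bits combine with a chip's three-bit vector can be shown with simple arithmetic. This Python sketch assumes a 32-input system of four groups in which the group number supplies the two high order bits; the function names are hypothetical and the code says nothing about the actual Am2913 pin connections.

```python
def full_vector(group: int, vector: int) -> int:
    """Combine a group number (0-3) and a 3-bit vector into a 5-bit vector."""
    return ((group & 0b11) << 3) | (vector & 0b111)

def split_status(status: int):
    """Split a 5-bit status word into (group, 3-bit status) for loading."""
    return (status >> 3) & 0b11, status & 0b111
```

For instance, vector 5 in group 2 becomes interrupt level 21, and splitting level 21 recovers group 2 and a low order value of 5.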

The Am25LS138 three-line to eight-line Decoder also is shown in Figure 17. It is used to decode the three high order status bits during a Load Status instruction. The Am25LS138 logic diagram and truth table are shown in Figure 19.

\section*{Am2914 IN THE Am2900 SYSTEM}

The block diagram of Figure 20 shows a typical 16-bit minicomputer architecture. The Am2914 is the heart of the Interrupt Control Unit, shown at the bottom of the block diagram. It receives its microinstructions from the Computer Control Unit. The Mask, Status and Interrupt Vector information is passed on the data bus. The interrupt request line from the Am2914 is an input to the Next Microprogram Address Control unit, where it can be tested to determine whether an interrupt request has been made.

Figures 21 and 22 show the detailed hardware design of two example interrupt control units (ICU's) for an Am2900 Computer System. Figure 21 shows an eight interrupt level ICU, and Figure 22 shows an ICU which has sixteen levels. In both designs, the Am2914 Instruction inputs and Instruction Enable input are driven by the \(\mathrm{I}_{0-3}\) field and \(\overline{\mathrm{IE}}\) bit, respectively, of the Microinstruction Register. Note that the Am2914 Instruction inputs are enabled only when the \(\overline{\mathrm{IE}}\) bit is LOW. Therefore, the \(\mathrm{I}_{0-3}\) field of the Microinstruction Register may be shared with another functional unit of the computer such as the ALU.


Figure 14. Interrupt Disable Connections for Parallel Cascade Mode.


Figure 15. Am2902 Carry Look-Ahead Generator Logic Diagram and Equations.


Figure 16. Vector Connections for both the Parallel and Ripple Cascade Modes.

The Latch Bypass input is shown connected to ground so that a LOW-going pulse will be detected at any of the Interrupt Inputs. The designer has the option of connecting the Latch Bypass input through a pull-up resistor to +5 volts. This makes the inputs LOW-level sensitive; they are clocked in by each system clock. In that case the processor must acknowledge the interrupt so that the interrupting device knows when to release the interrupt request line.


Figure 17. Group Signal, Group Enable, Group Advance Send, Group Advance Receive and Status Connections for Both the Parallel and Ripple Cascade Modes.


Figure 18. Am2913 Priority Interrupt Expander Logic Diagram and Truth Table.


Figure 19. Am25LS138 3 to 8 Line Decoder Logic Diagram and Truth Table.


Figure 20. A Generalized Computer Architecture.


Figure 21. 8 Level Interrupt Control Unit for Am2900 System.


Figure 22. 16 Level Interrupt Control Unit for Am2900 System.

In Figures 21 and 22, the Status and Mask inputs/outputs are connected to the data bus in a bi-directional configuration so that the Status and Mask Registers may be loaded from or read onto the data bus with the appropriate Am2914 instructions. This gives the designer two advantageous possibilities.
The first is the ability to store the Status and Mask information on a stack in memory, which is very useful when handling nested interrupts. The second is the ability to construct machine instructions that modify these two registers. This is very important to the system programmer who writes the software that manages the interrupts.
For the eight level ICU of Figure 21, the Status Overflow output is connected to the Interrupt Disable input, and the Group Advance Receive and Group Enable inputs are connected to ground, as previously described.
For the 16 interrupt level ICU of Figure 22, the Parallel Disable output of the higher priority group serves as the high order vector bit. An Am2913 Priority Interrupt Expander is gated by the Am2914 instruction lines so that its output is enabled only during a Read Status instruction, and is used to encode the high order bit of the status. An inverter suffices to decode the high order bit of the status during a Load Status instruction. As described previously for a ripple cascade system, the Group Advance Send output is connected to the Group Advance Receive input of the next higher priority group; the Ripple Disable output is connected to the Interrupt Disable input of the next lower priority group; the Status Overflow output of the highest priority group is connected to the Interrupt Disable input of the same group; and the Group Advance Receive input of the lowest priority group is connected to ground.

In both designs, two Am29751 32-word by 8-bit PROM's with three-state outputs are used to map the Am2914 Vector outputs into a 16-bit address vector. The PROM outputs are connected to the data bus. When a Read Vector Instruction (Am2914) is executed, the address vector is available to be used either as the address of the next instruction or a location to find the address of the next instruction to execute.
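
The mapping step can be sketched in Python as two 32-entry byte tables read in parallel, one supplying the upper byte and one the lower byte of the address vector. This is only an illustration: the service routine addresses, the table names and the `read_vector` helper are hypothetical, and a real design would program the PROM's with its own addresses.

```python
# Two 32-word by 8-bit PROMs addressed in parallel by the interrupt vector.
# The service routine addresses below are made up for illustration.
SERVICE_ADDRESSES = {0: 0x0100, 1: 0x0120, 7: 0x01E0}

prom_high = [0x00] * 32  # upper 8 bits of the 16-bit address vector
prom_low = [0x00] * 32   # lower 8 bits of the 16-bit address vector
for vector, address in SERVICE_ADDRESSES.items():
    prom_high[vector] = (address >> 8) & 0xFF
    prom_low[vector] = address & 0xFF


def read_vector(vector):
    """Return the 16-bit address the PROM pair drives onto the data bus."""
    index = vector & 0x1F  # 32 words need a 5-bit PROM address
    return (prom_high[index] << 8) | prom_low[index]


print(hex(read_vector(7)))
```

Whether the result is used as the address of the next instruction or as a pointer to that address is decided by the microcode, not by the table.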

Figure 23 shows a design where the address vector from the mapping PROM can be clocked into a register in the Am2903's. The registers in the Am2903's would be split between general purpose, scratch, stack pointers and Program Counter registers.

The address vector also may be gated directly to the "D" inputs of the Am2911 Microprogram Sequencer as shown in Figure 24, and used as the starting PROM address of a microprogram interrupt service routine. This method would be most useful in a controller application. It trades a larger microprogram, which must hold the code to service each individual interrupt, for faster service.

\section*{FIRMWARE EXAMPLE FOR Am2914 INTERRUPT SYSTEM}

The software for handling interrupt requests is on two levels. The first level to come into play is the microprogram level. This is the level at which the request is recognized and the program counter is manipulated to start execution of a machine level interrupt service routine, which is the second level. When the machine level interrupt service routine is finished, some form of a Return Interrupt instruction is executed. The microcode for the return instruction manipulates the program counter so that execution of the machine program that was running before the request is resumed, as shown in Figure 25.

This example is concerned with the microprogram level. This microcode goes along with the hardware shown in Figure 23. In this example the code is shown in the form of Flow Charts because the actual microprogram format will vary from machine to machine.

The important features to notice that have a direct relevance to the firmware are the Latch Bypass and where the Mask, Status and Vector busses go. For this example, the Latch Bypass is LOW making the Interrupt Latches latch up on a negative going pulse. The Mask and Status busses go to the data bus allowing the Status and Mask data to be transferred to and from memory. The Vector bus passes through a mapping PROM to the data bus where it can be read into the Program Counter contained in the Am2903's. The PROM contains addresses of service routines which correspond to the different interrupt levels.

Another fact important to understanding the firmware is that the interrupt mechanism is limited to handling interrupts at the machine level.
As shown in Figure 26a, the first thing that happens in the fetch routine (written in microcode) is a conditional subroutine call that will be taken if an interrupt request is present. This happens before the current machine instruction is fetched and the program counter is incremented.
In the Interrupt routine (shown in Figure 26b) a microprogram subroutine is first called to push the program counter onto the system stack. This is done so that the program counter can be restored in order to resume execution of the machine program after the interrupt service routine is done. The next thing saved on the system stack is the contents of the Am2914 Status Register. This is done because the status register, which contains the priority level that would have been serviced prior to the interrupt, must be restored after the interrupt is serviced. This maintains a nested interrupt structure (fence).

After saving the program counter and status register, the vector is read out of the Am2914 through the mapping PROM to obtain the address of the machine interrupt service routine. The address is then read into the program counter which resides in the Am2903's. When the Vector is read, the interrupt request priority plus one is automatically put into the status register by the Am2914 so that all interrupt requests of lower priority than the one being serviced are ignored. This is often referred to as moving the fence up. Since the vector has been read and the new address is in the program counter, the interrupt request can be cleared from the interrupt register via the Clear Interrupt/Last Vector Read instruction. At this point a jump is made to the Fetch routine which will now fetch the first instruction of the machine Interrupt Service routine.
The last instruction that the machine level interrupt service routine executes is an Interrupt Return. This in turn calls the Return Interrupt microprogram (Figure 26c). The status is first popped off the system stack and loaded back into the status register. This restores the Interrupt Fence. The program counter is then popped off the system stack and loaded into the program counter register. This restores the program counter to point to the instruction that was going to be executed when the interrupt request occurred.
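
The entry and return sequences described above can be summarized in a short Python sketch. It is a behavioral model only, not microcode: the `Machine` class, its method names and the service addresses are hypothetical, and it assumes that level 7 is the highest priority and that a request is accepted when its level is at or above the value in the status register.

```python
class Machine:
    """Toy model of the interrupt entry and return microprograms."""

    def __init__(self, mapping_prom):
        self.pc = 0                       # program counter (in the Am2903's)
        self.status = 0                   # Am2914 Status Register: the fence
        self.pending = set()              # latched interrupt levels, 7 = highest
        self.stack = []                   # system stack in memory
        self.mapping_prom = mapping_prom  # level -> service routine address

    def request(self):
        """Highest pending level at or above the fence, or None."""
        eligible = [level for level in self.pending if level >= self.status]
        return max(eligible) if eligible else None

    def interrupt_entry(self):
        level = self.request()
        self.stack.append(self.pc)          # push the program counter
        self.stack.append(self.status)      # push the Status Register
        self.pc = self.mapping_prom[level]  # Read Vector through the PROM
        self.status = level + 1             # the fence moves up automatically
        self.pending.discard(level)         # Clear Interrupt, Last Vector Read

    def interrupt_return(self):
        self.status = self.stack.pop()      # restore the fence
        self.pc = self.stack.pop()          # restore the program counter


m = Machine({3: 0x0300, 5: 0x0500})  # hypothetical service addresses
m.pc = 0x0040
m.pending.add(3)
m.interrupt_entry()                  # pc is now 0x0300, fence is 4
m.pending.update({2, 5})
nested = m.request()                 # level 5 passes the fence, level 2 waits
```

Because the status is pushed after the program counter, it is popped first on return, which matches the order given for the Return Interrupt microprogram.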

\section*{TIME DELAY WHEN USING THE Am2914}

An aspect that should be covered when using any part is how it will fit into the system timing, because the cycle time of the system will be as long as the longest delay path in the machine. Shown in Figure 27 is the longest delay path through the Am2914 for the previous 16-bit computer example. The calculations were made using both typical and worst case values at \(25^{\circ} \mathrm{C}\) and 5.0 V.

The longest delay path for the system where the vector from the mapping PROM feeds into the "D" inputs of the Am2910 is


Figure 23. Example of a 16-Bit Computer \#1.



Figure 25. Machine Level Instruction Flow During Interrupt Request.


Figure 26a. Flow Chart for a Simplified Microprogram Fetch Routine.


Figure 26b. Call Interrupt Service Routine Microprogram Flow Chart.


Figure 26c. Return Interrupt Microprogram Flow Chart.


Figure 27a. AC Calculations.


Figure 28a.


[Figure panels: delay path, cycle \(n\); delay path, cycle \(n+1\).]

Figure 28.
\begin{tabular}{|l|l|r|r|}
\hline \multicolumn{1}{|c|}{ Device No. } & Device Path & Typ. & Max. \\
\hline 29775 & CP to D & 15 & 20 \\
2914 & I to V & 40 & 55 \\
2918 & \(\mathrm{t}_{\mathrm{s}}\) (Data) & 5 & 5 \\
\hline Cycle n Total-ns & & 60 & 80 \\
\hline 2918 & CP to Q & 8.5 & 13 \\
27S19 & A to O & 25 & 40 \\
2910 & D to Y & 14 & 22 \\
29775 & \(\mathrm{t}_{\mathrm{s}}\) (A) & 40 & 50 \\
\hline Cycle \(\mathrm{n}+1\) Total-ns & & 87.5 & 125 \\
\hline
\end{tabular}

Figure 28f.
\begin{tabular}{|l|c|c|c|}
\hline Device No. & Device Path & Typ. & Max. \\
\hline 2914 & CP to IRQ & 65 & 82 \\
2922 & \(\mathrm{D}_{\mathrm{n}}\) to Y & 13 & 19 \\
2910 & CC to Y & 27 & 44 \\
29775 & \(\mathrm{t}_{\mathbf{s}}(\mathrm{A})\) & 40 & 50 \\
\hline Total-ns & & 145 & 195 \\
\hline
\end{tabular}

Figure 28g.
\begin{tabular}{|l|c|c|c|}
\hline \multicolumn{1}{|c|}{ Device No. } & Device Path & Typ. & Max. \\
\hline 2914 & CP to IRQ & 65 & 82 \\
74S74 & \(\mathrm{t}_{\mathrm{s}}\) (Data) & 3 & 3 \\
\hline Cycle n Total-ns & & 68 & 85 \\
\hline 74S74 & CP to Q & 6 & 9 \\
2922 & \(\mathrm{D}_{\mathrm{n}}\) to Y & 13 & 19 \\
2910 & CC to Y & 27 & 44 \\
29775 & \(\mathrm{t}_{\mathrm{s}}(\mathrm{A})\) & 40 & 50 \\
\hline Cycle \(\mathrm{n}+1\) Total-ns & & 86 & 122 \\
\hline
\end{tabular}

Figure 28h.
shown in Figure 28. This path is much longer because of the two PROM's that have to be accessed. Therefore, there may be a trade-off of slightly longer system cycle time for faster service of interrupts via service routines in microcode.
For some systems the delay time shown in Figure 28b may be too long. Therefore, the designer can split the delay time into parts by putting a register between the Am2914 and the mapping PROM as shown in Figure 28c. When done in two system clock cycles, the delay time will be as shown in Figure 28f.
Figure 28d shows the delay path from the Interrupt Request Register through the Condition Code MUX to the Am2910. The time calculations are shown in Figure 28g. Again, for some systems, this path may be too long. Therefore, as shown above, this path may be broken in two, which is shown in Figure 28e. This will result in two system clock cycles. The delay involved in each cycle is shown in Figure 28h.
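
The effect of splitting a path can be checked by summing the per-device delays listed in Figures 28f, 28g and 28h. The Python sketch below copies the (typical, maximum) entries in nanoseconds from those tables; the dictionary keys and helper names are only illustrative. Summing the typical entries for cycle \(n+1\) of Figure 28f gives 87.5 ns.

```python
# (typical ns, maximum ns) for each device in a path, taken from the tables.
PATHS = {
    "28f cycle n": [(15, 20), (40, 55), (5, 5)],
    "28f cycle n+1": [(8.5, 13), (25, 40), (14, 22), (40, 50)],
    "28g": [(65, 82), (13, 19), (27, 44), (40, 50)],
    "28h cycle n": [(65, 82), (3, 3)],
    "28h cycle n+1": [(6, 9), (13, 19), (27, 44), (40, 50)],
}


def total(path):
    """Sum the typical and maximum delays along one path."""
    return sum(t for t, _ in path), sum(m for _, m in path)


totals = {name: total(path) for name, path in PATHS.items()}

# When a path is split by a register, the slower half sets the cycle time.
split_irq_max = max(totals["28h cycle n"][1], totals["28h cycle n+1"][1])
unsplit_irq_max = totals["28g"][1]

for name, (typ, worst) in totals.items():
    print(f"{name}: typ {typ} ns, max {worst} ns")
```

For the interrupt request path, the register reduces the worst case contribution to the cycle time from 195 ns to 122 ns, at the cost of one additional clock cycle of latency.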

\section*{ANOTHER EXAMPLE OF Am2900 SYSTEM USING THE Am2914}

As shown in Figure 29, this example differs in the way that the interrupt request is recognized by the microprogrammed machine. In this example the interrupt request line from the Am2914 enables or disables the \(\overline{M A P}\) signal going to the mapping PROM. When an interrupt request is present and a Jump Map instruction is executed, the outputs of the mapping PROM remain tri-stated, and the bus connected to the "D" inputs of the Am2910 is HIGH because of the pull-up resistors. Therefore, the microprogram will start executing at the highest location in microprogram memory when an interrupt request is present. At this location a Jump instruction to the microprogram interrupt service routine could be placed. The microcode is written so that the only time a Jump Map instruction is executed is at the end of the Fetch microprogram routine as shown in Figure 30a.
In the previous example the interrupt request was recognized before the program counter was incremented. In this example the program counter is incremented first, after which the Jump Map instruction is executed. When the Jump Map is executed, either the machine instruction is executed or an interrupt request is serviced. Therefore, when the Return Interrupt machine instruction is executed, the program counter must be backed up via microcode, as shown in Figure 30b, in order to refetch the machine instruction that was lost. This also dictates that the program counter have a path to an incrementer/decrementer or ALU, which in this example is provided by putting the program counter in the Am2903's.
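
A minimal Python sketch of this recognition scheme follows. It assumes a 12-bit microprogram address (the Am2910 addresses 4096 words) and, for the return, machine instructions that occupy one word each; the function names and the opcode and address values are hypothetical.

```python
ADDRESS_BITS = 12                      # assumed width of the "D" input bus
PULLED_UP = (1 << ADDRESS_BITS) - 1    # every D input HIGH: 0xFFF


def jump_map_target(opcode, mapping_prom, interrupt_request):
    """Address seen on the D inputs when a Jump Map is executed."""
    if interrupt_request:
        # Mapping PROM outputs stay tri-stated, so the pull-ups win and
        # the microprogram starts at the highest location.
        return PULLED_UP
    return mapping_prom[opcode]


def return_interrupt(stack):
    """Pop the saved program counter and back it up by one instruction.

    The instruction whose Jump Map was pre-empted was never executed,
    so it has to be fetched again (one-word instructions assumed).
    """
    return stack.pop() - 1


mapping_prom = {0x12: 0x040}           # hypothetical opcode -> microroutine
normal = jump_map_target(0x12, mapping_prom, interrupt_request=False)
interrupted = jump_map_target(0x12, mapping_prom, interrupt_request=True)
```

The location `PULLED_UP` would hold the Jump instruction to the microprogram interrupt service routine.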

\section*{MICROPROGRAM LEVEL INTERRUPT EXAMPLE}

Some high-speed control applications require extremely fast interrupt response. While it may ordinarily be desirable to complete an entire processing sequence (such as executing a microprogram for a macroinstruction) prior to testing for the interrupt and allowing it to occur, it is not always possible to achieve the required interrupt response time this way. If this is the case, microinstruction level interrupt handling must be employed. The technique described below has a maximum latency of three microcycles which can be \(450-600 \mathrm{~ns}\) total. Implementation is straightforward using the Am2910 Microsequencer, a 40-pin LSI device that can control 4096 words of microprogram at a 150 ns cycle time, and a few extra MSI and SSI packages. In this application, the Am2910 is configured in its standard architecture. The additional logic does not influence the normal system cycle time.

If microlevel interrupt handling is to be employed, logic must be provided to generate a substitute microprogram address corresponding to the location of the interrupt service routine. In the event of a microlevel interrupt, the sequencer address outputs are tri-stated and the substitute address is placed on the microprogram address bus, causing the next microinstruction fetch to be determined by the interrupt control vector generator. While this is happening, steps must be taken with the Am2910 to insure that the interrupted routine can be properly restored. To understand this procedure, it will be necessary to examine the Am2910 in more detail.

Referring to Figure 31, the microprogram address bus is driven by the Y outputs of the Am2910 through a tri-state buffer that can be disabled by means of the \(\overline{\mathrm{OE}}\) input. The address is selected in a multiplexer from a direct input, from a register/counter, from a push/pop stack, or from a microprogram counter register. The microprogram counter register is commonly used as the address source when executing the next microinstruction in sequence. Whenever an address appears at the multiplexer outputs, it is incremented and presented to the microprogram counter's inputs. At the rising edge of the clock, this new address, which is the current address plus 1, is loaded into the microprogram counter and a microprogram access begins at this address.


Figure 29. Example of a 16-Bit Computer \#2.


Figure 30a. Return Interrupt Microprogram for Second Example.


Figure 30b. Fetch Microprogram for the Second Example.


Figure 31. Am 2910 Block Diagram.

Note that at this time, whatever was fetched at the previous address was loaded into the microword register for execution. Thus, the microprogram sequencer is always looking for the address of the next microinstruction to be executed (while a previously fetched microinstruction is residing in the microword register). Subroutines and microprogram loops may be accomplished by using the stack and the register/counter. Regardless of what is selected as the source of the next address, the selected address will be incremented and presented to the microprogram counter. So to accomplish a microprogram branch, one would simply select the D inputs for a branch address for one cycle; the next address source could then be switched back to the microprogram counter on the next cycle, which would then contain the branch address plus 1.

There is a carry input to the incrementer which is normally tied HIGH. In the case of a microlevel interrupt, the microprogram sequencer will not determine the address of the next microinstruction to be executed. Instead the sequencer output will be tri-stated and a substitute address will be placed on the bus. The sequencer continues to operate in a normal fashion with its multiplexer output being incremented and presented to the microprogram counter register. It must now be noted that the instruction located at the address then coming out of the multiplexer outputs will not be executed; rather, the next microinstruction to be executed will be determined by the interrupt vector generator. It would therefore be wrong to increment this microprogram address. Instead, it must be saved intact in order to push it onto the stack for access during interrupt return. This is easily accomplished in the Am2910 by grounding the carry input to the incrementer simultaneously with three-stating the sequencer output. Then the multiplexer output will be stored in the microprogram counter register and on the next microcycle the Am2910 must be told to push in order to preserve this address on the stack.
This carry-in input is all important and exists on all Advanced Micro Devices' microprogram sequencers. Unless the carry-in is grounded, whatever address was in the multiplexer output when the sequencer output was tri-stated is incremented and an instruction is missed in the interrupted routine. This, of course, would likely be disastrous. The key to this microinterrupt technique is that the address of the unexecuted instruction (when the Am2910 was tri-stated and a substitute address supplied) is preserved by inhibiting the increment via the carry input, so the address is passed on intact to the microprogram counter. If the microinterrupt is to be more than one cycle long, the microprogram counter must be pushed so as to save the return address. Otherwise, a "continue" may be used to return from the interrupt on the very next cycle. In this event the microinterrupt effectively inserts one instruction in the stream.
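
The role of the carry input can be illustrated with a small Python model of the next-address path only (multiplexer, incrementer, microprogram counter and stack). This is a simplified sketch, not a model of the full Am2910 instruction set; the `Sequencer` class, its arguments and the addresses used are hypothetical.

```python
ADDRESS_MASK = 0xFFF  # the Am2910 addresses 4096 microwords


class Sequencer:
    """Multiplexer, incrementer, microprogram counter (uPC) and stack."""

    def __init__(self, start):
        self.upc = start
        self.stack = []

    def clock(self, source="upc", d=0, carry_in=1, push=False):
        """Run one microcycle and return the multiplexer output."""
        if source == "d":
            y = d
        elif source == "stack":
            y = self.stack.pop()
        else:
            y = self.upc
        if push:
            self.stack.append(self.upc)           # save the return address
        self.upc = (y + carry_in) & ADDRESS_MASK  # incrementer feeds the uPC
        return y


def return_address(carry_in_during_interrupt):
    """Address fetched on return from a microlevel interrupt."""
    seq = Sequencer(start=0x010)
    seq.clock()                                    # 0x010 is fetched normally
    seq.clock(carry_in=carry_in_during_interrupt)  # Y tri-stated: 0x011 is NOT fetched
    seq.clock(source="d", d=0x200, push=True)      # jump to subroutine from the vector table
    seq.clock()                                    # service routine body at 0x201
    return seq.clock(source="stack")               # return from the interrupt


preserved = return_address(carry_in_during_interrupt=0)
skipped = return_address(carry_in_during_interrupt=1)
```

With the carry input grounded during the interrupt cycle the routine resumes at 0x011, the instruction that was never executed. With the carry input left HIGH it resumes at 0x012 and the instruction at 0x011 is lost.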
Figure 32 is the block diagram of a hardware design that implements the above concept. The SYNC/CONTROL and INTERRUPT CONTROL/VECTOR GENERATOR logic are shown in detail in Figure 33. Part of the Am2918 and both 'LS74 Flip-Flops are used to synchronize the recognition of the asynchronous interrupt request as shown in Figure 34. The interrupt request arrives at the interrupt input. On the next clock cycle it is clocked into the Am2918. In the following clock cycle a pulse that is one system clock cycle long is put out by the flip-flop pair FF1 and FF2. The pulse is used to disable the carry input of the Am2910, tri-state the output of the Am2910, and enable the jump vector onto the input of the PROM. The vector indexes into a table in microprogram memory that contains "JUMP SUBROUTINE" instructions to different interrupt service routines.


Figure 32. Computer Control Unit Set-up for High-Speed Micro-Level Interrupt Handling. Latency is a Maximum of Two Microcycles (i.e., about 300 to 500 ns).


Figure 33. Example of Sync Control Logic and Vector Generator.


Figure 34. Timing of Vector Generator and Sync Control Logic.


Figure 35. Interrupt Sequence Timing.


Figure 36. Return-From-Interrupt Sequence Timing.

Figure 35 shows how the interrupt sequence timing fits into the normal flow of microprogram address in the Am2910. Note how the stack is used. This demonstrates the need for always reserving room on the stack to allow for interrupts. This applies to any room that the interrupt service routine may require as well as the return address. This limitation may require that only one interrupt request be serviced at a time.
Figure 36 shows how the return from the interrupt service routine fits into the microprogram flow. Notice that a Return instruction is used to accomplish this.

\section*{SUMMARY}

In this chapter, Interrupts were discussed beginning with a definition of the Interrupt Mechanism and proceeding to a classification of different interrupts and how they are handled. A discussion of the concepts that go into designing the "Universal Interrupt" hardware was given which culminated with the Am2914. The chapter ends with several Interrupt Mechanism applications using the Am2914 and Am2910.

In this chapter it was shown how interrupts can be handled using parts from the Am2900 family. Because of their hardware modularity and universal architecture, they may be used in a variety of applications. Since the Am2900 Family parts are microprogrammable, they allow the user's system to grow with time as system requirements change. Together these attributes make the Am2900 Family flexible and cost-effective.


Chapter VII
Direct Memory Access

\section*{Introduction}

The transfer of data between the microcomputer and the peripheral devices is generally referred to as Input/Output (I/O). What is desired is a high speed technique for transferring data between the peripherals and the memory. Generally speaking, there are at least three types of I/O: Programmed I/O, Memory Mapped I/O and Direct Memory Access I/O. All of these schemes are common in currently available minicomputers. A basic understanding of these I/O techniques is helpful in fully comprehending DMA. The first two of these types of I/O can be interrupt driven. That is, programmed I/O or memory mapped I/O can be initiated by an interrupt from the peripheral device.

\section*{Programmed I/O}

In this type of I/O, all operations are controlled by the CPU program. In other words, the peripheral device performs the functions of inputting or outputting data as it is controlled by the CPU. Normally, the machine will include a set of I/O instructions which are used to transfer data to or from the peripheral devices via an Input/Output port. All data for the peripheral devices passes through these I/O ports to the CPU and the resources of the CPU must be utilized in order to effect an I/O transfer. Figure 1 shows the Block Diagram of a programmed I/O system used in a typical microcomputer. Figure 2 shows an example of that portion of the program used to output data to the peripheral device.


Figure 1. Programmed I/O System.
\begin{tabular}{l|l}
\hline CPU Program & Comments \\
\hline
- & - \\
- & - \\
Load R, M & Load CPU Register R with the contents of Memory Address M \\
Out D, R & Transfer the contents of CPU Register R to I/O Device D via the I/O port \\
- & - \\
\hline
\end{tabular}

Figure 2. Example Output Program - Programmed I/O.

Programmed I/O is simple to implement and does not require the utilization of any memory addresses for its realization. In addition, special instructions are available to the programmer to execute the peripheral data transfers. Programmed I/O is also low cost relative to other types of I/O; however, it has the following disadvantages. Since I/O device operation is asynchronous with respect to CPU operation, the CPU has no way of knowing when a peripheral device is ready to transfer data and must periodically poll the device to determine its readiness. This results in an inefficient I/O transfer. Also, since the CPU must be used to effect the I/O transfer, the CPU resources are tied up during the time of transfer and the time of polling and cannot be used for other tasks. For these reasons, Programmed I/O is generally limited to use with low speed devices.
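
The cost of polling can be made concrete with a small Python simulation. The `Device` class below is a hypothetical peripheral that needs a fixed number of polls before it will accept the next word; the comments relate each step to the Load and Out instructions of the output program example.

```python
class Device:
    """Hypothetical peripheral that is ready once every `period` polls."""

    def __init__(self, period):
        self.period = period
        self.countdown = period
        self.received = []

    def ready(self):
        self.countdown -= 1
        return self.countdown == 0

    def write(self, word):
        self.received.append(word)
        self.countdown = self.period  # busy again until the next word


def output_block(memory, device):
    """Send every word in `memory` to the device; return the poll count."""
    polls = 0
    for word in memory:               # Load R, M
        while True:                   # the CPU can do nothing else here
            polls += 1
            if device.ready():
                break
        device.write(word)            # Out D, R
    return polls


printer = Device(period=5)
polls = output_block([0x41, 0x42, 0x43], printer)
print(polls)
```

Every poll that finds the device not ready is a CPU operation that transfers no data, which is why the scheme suits only slow devices.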

Perhaps one of the best known programmed I/O microcomputers in the industry today is the Am9080A. This device features two instructions, one for inputting data and one for outputting data, that address any one of 256 Input/Output ports.

\section*{Memory Mapped I/O}

Memory Mapped I/O is a technique whereby the transfer of data to and from peripheral devices is accomplished by using some of the normally available memory space. In this technique, memory addresses are decoded within the peripheral devices and are thus used to determine when a specific device is being addressed. Usually, each type of function within the peripheral device is assigned a memory address and can then be accessed by the CPU. For example, the peripheral device may contain a command register, a status register, a data in register and a data out register. Thus, four memory addresses might be utilized in performing I/O to this peripheral. Figure 1 is also the block diagram for a Memory Mapped I/O scheme.

The chief advantage of Memory Mapped I/O is that all of the memory reference instructions are usually available to perform the I/O function. Consequently, no special I/O instructions are required in the machine. The key disadvantage of this technique is that a block of the memory addressing range must be set aside for assignment to the peripheral devices. Thus, the overall memory addressing range of the machine is reduced by the size of this block. Again, the resources of the CPU are tied up while the I/O is being performed. A well known machine using only Memory Mapped I/O is the PDP 11. In it the upper 4 k of memory space is usually used for the I/O devices.
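
The address decoding described above can be sketched in Python. The base address, the register order and the `Bus` class are hypothetical and chosen only for illustration; the point is that ordinary load and store operations reach the device registers, and that those addresses are no longer available as memory.

```python
IO_BASE = 0xF000  # hypothetical start of the block set aside for I/O
REGISTERS = ("command", "status", "data_in", "data_out")


class Bus:
    """Memory bus with one peripheral decoded at four addresses."""

    def __init__(self):
        self.ram = {}
        self.device = dict.fromkeys(REGISTERS, 0)

    def _decode(self, address):
        """Return the device register name at this address, or None."""
        offset = address - IO_BASE
        if 0 <= offset < len(REGISTERS):
            return REGISTERS[offset]
        return None

    def store(self, address, value):
        register = self._decode(address)
        if register is None:
            self.ram[address] = value
        else:
            self.device[register] = value

    def load(self, address):
        register = self._decode(address)
        if register is None:
            return self.ram.get(address, 0)
        return self.device[register]


bus = Bus()
bus.store(0x1000, 7)            # ordinary memory reference
bus.store(IO_BASE + 3, 0x41)    # same instruction, but writes data_out
```

No I/O instruction appears anywhere: the same `store` serves both cases, and only the address decides whether memory or the peripheral responds.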

\section*{Interrupt Driven I/O}

Interrupts are a means by which a peripheral device can stop the normal flow of CPU instruction execution and force the CPU to temporarily suspend its current program. The CPU then "jumps" to a different program which executes an I/O transfer. Typically, this eliminates the need for polling the peripheral devices to determine if an I/O transfer is ready. Thus, the interrupt driven scheme provides a more efficient I/O transfer technique. However, there is an overhead burden associated with interrupts in that the CPU must store away and later restore all of the parameters required to resume the interrupted program. This overhead degrades the CPU performance. Depending on the overall interrupt structure, the CPU still may have to do some polling of devices which may be tied to the same interrupt level. It should be pointed out that both Programmed I/O and Memory Mapped I/O can take advantage of the interrupt technique. That is, an interrupt can be used to initiate the peripheral data transfer in either type of system. The CPU still must control the transfer of the data between the memory and the peripheral device and the CPU resources are unavailable for executing other instructions during this time.

\section*{What is DMA?}

DMA is a technique for data transfer which provides a direct path between the I/O device and the memory without CPU intervention. With this path, a peripheral device has "Direct Memory Access" and can transfer data directly to or from the memory. The purpose of the DMA is to relieve the CPU of the task of controlling the I/O transfer, thereby freeing it to perform other tasks during this time, and to provide a means by which data can be transferred between an I/O device and memory at very high speed. Figure 3 shows the Block Diagram of a system where several I/O devices can perform DMA transfers into memory. Note that the CPU and peripheral devices share a common bus to the memory and that the CPU and peripheral devices cannot access memory during the same cycle. DMA can also be designed to perform memory-to-memory transfers or I/O-to-I/O transfers.


Figure 3. DMA I/O System.

Several DMA transfer methods exist, such as the CPU halt method, the memory timeslice method, and the "cycle steal" method. In the CPU halt method, the CPU is halted and switched off the bus while a DMA transfer occurs. This is the most straightforward method. However, it takes a relatively long time to switch the CPU on and off the bus, and the CPU cannot do anything during the transfer.
The memory timeslice method works by splitting each memory cycle into two timeslots; one is reserved for the CPU and the other for DMA. This method provides the highest CPU execution rate as well as the highest DMA transfer rate because both the CPU and DMA are guaranteed access to memory during every memory cycle. The disadvantage of this method is that high speed, costly memories must be used.
The "cycle steal" method is a cost/performance compromise between the low cost of the CPU halt method and the high performance of the memory timeslice method. Cycle stealing refers to a DMA device "stealing" a CPU memory cycle in order to execute a DMA transfer. CPU program execution continues during the DMA transfer (the CPU is not halted), resulting in an overlap of CPU program execution with DMA transfer. If the CPU and a DMA device require a memory cycle at the same time, priority is granted to the DMA device and the CPU waits until the DMA cycle is completed. DMA causes CPU performance degradation only in those applications where the CPU uses the entire memory bandwidth. In many applications the CPU is slow relative to memory cycle time and "cycle stealing" provides satisfactory performance at relatively low cost.
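
The arbitration rule of the "cycle steal" method can be sketched in Python as follows. The model is deliberately simplified: requests are given as one flag per memory cycle, and a CPU request that loses arbitration is simply held until the next free cycle. The function name and the request patterns are hypothetical.

```python
def cycle_steal(cpu_requests, dma_requests):
    """Grant each memory cycle to the CPU or to DMA; DMA has priority.

    Returns the list of grants and the number of cycles the CPU waited.
    """
    grants = []
    cpu_waits = 0
    cpu_pending = False
    for cpu, dma in zip(cpu_requests, dma_requests):
        cpu_pending = cpu_pending or cpu
        if dma:
            grants.append("DMA")      # the DMA device steals this cycle
            if cpu_pending:
                cpu_waits += 1        # the CPU waits; it is not halted
        elif cpu_pending:
            grants.append("CPU")
            cpu_pending = False
        else:
            grants.append("idle")
    return grants, cpu_waits


cpu = [True, False, True, False, True, False]
dma = [False, False, True, False, False, True]
grants, cpu_waits = cycle_steal(cpu, dma)
print(grants, cpu_waits)
```

In this pattern the CPU uses only half of the memory cycles, so two DMA transfers cost it a single wait cycle; a CPU that requested memory in every cycle would wait once for every stolen cycle.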

\section*{How is DMA Implemented?}

In order to relieve the CPU of the I/O transfer control task, circuitry external to the CPU must be added. This circuitry is called the DMA Controller and performs the following functions.

Address Line Control - In a DMA system, the memory address lines are driven by either the CPU or a DMA device, depending on which is using the memory during a given cycle. The DMA controller must switch the appropriate address onto the memory address lines.

Data Transfer Control - The DMA Controller must provide the control signals required to transfer data directly between memory and an I/O device. As with the address lines, these control signals must be switched onto and off of the memory control lines appropriately.

Address Maintenance - Just as the CPU has the program counter and one or more other registers for memory address pointers, the DMA controller must also maintain an address pointer that indicates where the next word of data will be read or written in memory. This pointer must be incremented or decremented after each word transfer.
Word Count Maintenance - At the initialization of a DMA transfer, the CPU specifies to the DMA Controller the total number of words to be transferred. During the transfer, the DMA controller must maintain a count of the number of words that have been transferred and terminate the transfer when the specified number of words has been reached.

Mode Control - Certain aspects of a DMA transfer, such as direction of data flow, method of termination, etc., may vary from one DMA transfer to the next. For this reason, a number of DMA modes may be required. Mode control logic, contained in the DMA controller, is set by the CPU at the initialization of a DMA transfer.

A DMA Controller can be placed in each I/O device (Distributed DMA) or DMA control circuitry for a number of I/O devices can be placed in a separate unit (Centralized DMA). The former provides the advantage of incremental cost; DMA control circuitry is added only as I/O devices are added. The latter provides the advantages of consolidation.

At DMA initialization, the CPU normally specifies the mode, the starting memory address and the number of words to be transferred (word count) to the DMA controller. In some applications, it is desirable to repeat a DMA transfer over and over again without disturbing the CPU. This capability is called Repetitive DMA, and can be implemented by adding two registers to the DMA controller. One register saves the starting address and the other the starting word count. This allows the DMA Controller to automatically reinitialize itself after the transfer of the data has been completed, thereby eliminating the need for CPU intervention.
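The address maintenance, word count maintenance and Repetitive DMA functions can be sketched in a few lines of Python. This is an illustrative model only; the class `DmaChannel` and its method names are hypothetical and do not correspond to any particular controller.

```python
class DmaChannel:
    """Hypothetical model of DMA address and word-count bookkeeping."""

    def __init__(self, start_address, word_count, step=1):
        # The saved copies play the role of the two extra registers
        # that make Repetitive DMA possible.
        self.saved_address = start_address
        self.saved_count = word_count
        self.step = step                  # +1 to increment, -1 to decrement
        self.address = start_address
        self.transferred = 0

    def transfer_word(self):
        """Return the address used for this word, then advance the pointer."""
        used = self.address
        self.address += self.step
        self.transferred += 1
        return used

    def done(self):
        return self.transferred >= self.saved_count

    def reinitialize(self):
        """Restore the starting address and count without CPU involvement."""
        self.address = self.saved_address
        self.transferred = 0


def run_block(channel):
    """Transfer words until the specified count is reached."""
    addresses = []
    while not channel.done():
        addresses.append(channel.transfer_word())
    return addresses


channel = DmaChannel(start_address=0x100, word_count=4)
first_pass = run_block(channel)
channel.reinitialize()
second_pass = run_block(channel)
print([hex(a) for a in first_pass], first_pass == second_pass)
```

The second pass reuses the saved starting values, so the same block of addresses is generated again without the CPU reloading anything.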

\section*{The Am2940 DMA ADDRESS GENERATOR}

The design of the Address Line Control, Data Transfer Control and Mode Control circuitry of a DMA Controller is dependent upon system architecture and timing; therefore, it varies considerably from system to system. However, the address maintenance and word count maintenance circuitry is independent of these variables, and is common to almost all DMA Controllers. The Am2940 DMA Address Generator is designed for use in DMA Controllers and provides the Address and Word Count maintenance circuitry that is common to most. It combines the advantages of high speed bipolar LSI with the flexibility and general purpose usefulness of microprogrammed control.

\section*{Am2940 GENERAL DESCRIPTION}

The Am2940, a 28-pin member of Advanced Micro Devices' Am2900 family of Low-Power Schottky bipolar LSI chips, is a high-speed, cascadable, eight-bit wide Direct Memory Access Address Generator slice. Any number of Am2940s can be cascaded to form larger addresses.

The primary function of the device is to generate sequential memory addresses for use in the sequential transfer of data to or from a memory. It also maintains a data word count and generates a DONE signal when a programmable terminal count has been reached. The device is designed for use in peripheral controllers with DMA capability or in any other system which transfers data to or from sequential locations of a memory.

The Am2940 can be programmed to increment or decrement the memory address in any of four control modes, and executes eight different instructions. The initial address and word count are saved internally by the Am2940 so that they can be restored later in order to repeat the data transfer operation.

\section*{Am2940 ARCHITECTURE}

As shown in the Block Diagram of Figure 4, the Am2940 consists of the following:
- A three-bit Control Register.
- An eight-bit Address Counter with input multiplexer.
- An eight-bit Address Register.
- An eight-bit Word Counter with input multiplexer.
- An eight-bit Word Count Register.
- Transfer complete circuitry.
- An eight-bit wide data multiplexer with three-state output buffers.
- Three-state address output buffers with external output enable control.
- An instruction decoder.

\section*{Control Register}

Under instruction control, the Control Register can be loaded or read from the bidirectional DATA lines \(D_{0}-D_{7}\). Control Register bits 0 and 1 determine the Am2940 Control Mode, and bit 2 determines whether the Address Counter increments or decrements. Figure 5 defines the Control Register format.

\section*{Address Counter}

The Address Counter, which provides the current memory address, is an eight-bit, binary, up/down counter with full look-ahead carry generation. The Address Carry Input ( \(\overline{\mathrm{ACI}}\) ) and Address Carry Output ( \(\overline{\mathrm{ACO}}\) ) allow cascading to accommodate larger addresses. Under instruction control, the Address Counter can be enabled, disabled, and loaded from the DATA inputs, \(D_{0}-D_{7}\), or the Address Register. When enabled and the \(\overline{\mathrm{ACI}}\) input is LOW, the Address Counter increments/decrements on the LOW to HIGH transition of the CLOCK input, CP. The Address Counter output can be enabled onto the three-state ADDRESS outputs \(\mathrm{A}_{0}-\mathrm{A}_{7}\) under control of the Output Enable input, \(\overline{\mathrm{OE}}_{\mathrm{A}}\).
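The effect of cascading slices through the carry pins can be sketched as follows. This is a behavioral simplification, not a description of the internal logic: the active-LOW carry signals are modeled as Python booleans (True meaning active), and the function name `step_cascaded` is hypothetical.

```python
def step_cascaded(slices, count_enable=True, decrement=False):
    """Advance a chain of 8-bit counter slices by one clock.

    `slices` lists the slice values, least significant first. A slice
    counts only when its carry input is active, and it activates its
    carry output when it is counting from its terminal value (0xFF when
    incrementing, 0x00 when decrementing).
    """
    carry = count_enable                  # carry into the first slice
    result = []
    for value in slices:
        if carry:
            terminal = 0x00 if decrement else 0xFF
            carry_out = value == terminal
            value = (value - 1) & 0xFF if decrement else (value + 1) & 0xFF
        else:
            carry_out = False
        result.append(value)
        carry = carry_out
    return result


# Two slices forming a sixteen-bit address 0x12FF
print([hex(v) for v in step_cascaded([0xFF, 0x12])])
```

With two slices holding 0x12FF, one clock produces 0x1300: the low slice rolls over and its carry output lets the high slice count on the same clock edge.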


Figure 4. Am2940 DMA Address Generator.


Figure 5. Control Register Format Definition.

\section*{Address Register}

The eight-bit Address Register saves the initial address so that it can be restored later in order to repeat a transfer operation. When the LOAD ADDRESS instruction is executed, the Address Register and Address Counter are simultaneously loaded from the DATA inputs, \(D_{0}-D_{7}\).

\section*{Word Counter and Word Count Register}

The Word Counter and Word Count Register, which maintain and save a word count, are similar in structure and operation to the Address Counter and Address Register, with the exception that the Word Counter increments in Control Modes 1 and 3, decrements in Control Mode 0, and is disabled in Control Mode 2. The LOAD WORD COUNT instruction simultaneously loads the Word Counter and Word Count Register.

\section*{Transfer Complete Circuitry}

The Transfer Complete Circuitry is a combinational logic network which detects the completion of the data transfer operation in three Control Modes and generates the DONE output signal. The DONE signal is an open-collector output, which can be dot-anded between chips.

\section*{Data Multiplexer}

The Data Multiplexer is an eight-bit wide, three-input multiplexer which allows the Address Counter, Word Counter, and Control Register to be read at the DATA lines, \(D_{0}-D_{7}\). The Data Multiplexer and three-state Data output buffers are instruction controlled.

\section*{Address Output Buffers}

The three-state Address Output Buffers allow the Address Counter output to be enabled onto the ADDRESS lines, \(A_{0}-A_{7}\), under external control. When the Output Enable input, \(\overline{\mathrm{OE}}_{\mathrm{A}}\), is LOW, the Address output buffers are enabled; when \(\overline{\mathrm{OE}}_{\mathrm{A}}\) is HIGH, the ADDRESS lines are in the high-impedance state. The Address and Data Output Buffers can sink 24 mA output current over the commercial operating range.

\section*{Instruction Decoder}

The Instruction Decoder generates required internal control signals as a function of the INSTRUCTION inputs, \(\mathrm{I}_{0}-\mathrm{I}_{2}\) and Control Register bits 0 and 1.

\section*{Clock}

The CLOCK input, CP, is used to clock the Address Register, Address Counter, Word Count Register, Word Counter, and Control Register, all on the LOW to HIGH transition of the CP signal.

\section*{Am2940 CONTROL MODES}

\section*{Control Mode 0 - Word Count Equals Zero Mode}

In this mode, the LOAD WORD COUNT instruction loads the word count into the Word Count Register and Word Counter. When the Word Counter is enabled and the Word Counter Carry-in, \(\overline{\mathrm{WCI}}\), is LOW, the Word Counter decrements on the LOW to HIGH transition of the CLOCK input, CP. Figure 5 specifies when the DONE signal is generated in this mode.

\section*{Control Mode 1 - Word Count Compare Mode}

In this mode the LOAD WORD COUNT instruction loads the word count into the Word Count Register and clears the Word Counter. When the Word Counter is enabled and the Word Counter Carry-in, \(\overline{\mathrm{WCI}}\), is LOW, the Word Counter increments on the LOW to HIGH transition of the CLOCK input, CP. Figure 5 specifies when the DONE signal is generated.

\section*{Control Mode 2 - Address Compare Mode}

In this mode, only an initial and final memory address need be specified. The initial memory address is loaded into the Address Register and Address Counter and the final memory address is loaded into the Word Count Register and Word Counter. The Word Counter is always disabled in this mode and serves as a holding register for the final memory address. When the Address Counter is enabled and the \(\overline{\mathrm{ACI}}\) input is LOW, the Address Counter increments or decrements (depending on Control Register bit 2) on the LOW to HIGH transition of the CLOCK input, CP. The Transfer Complete Circuitry compares the Address Counter with the Word Counter and generates the DONE signal during the last word transfer, i.e., when the Address Counter equals the Word Counter.

\section*{Control Mode 3 - Word Counter Carry Out Mode}

For this mode of operation, the user can load the Word Count Register and Word Counter with the two's complement of the number of data words to be transferred. When the Word Counter is enabled and the \(\overline{\mathrm{WCI}}\) input is LOW, the Word Counter increments on the LOW to HIGH transition of the CLOCK input, CP. A Word Counter Carry Out signal, \(\overline{\mathrm{WCO}}\), indicates the last data word is being transferred. The DONE signal is not required in this mode and, therefore, is always LOW.
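The two's complement loading used in Control Mode 3 can be checked with a little arithmetic. The sketch below assumes that the carry out becomes active while the counter holds all ones with counting enabled, which is the usual behavior of a look-ahead carry counter; the function names are hypothetical.

```python
def mode3_word_count(words, width_bits=16):
    """Two's complement of the word count, as loaded in Control Mode 3."""
    if not 1 <= words <= (1 << width_bits):
        raise ValueError("word count out of range for this counter width")
    return (-words) & ((1 << width_bits) - 1)


def transfers_until_carry(loaded, width_bits=16):
    """Count transfers up to and including the one with the counter at all ones."""
    mask = (1 << width_bits) - 1
    counter = loaded
    transfers = 1
    while counter != mask:
        counter = (counter + 1) & mask    # one increment per word transferred
        transfers += 1
    return transfers


loaded = mode3_word_count(5)
print(hex(loaded), transfers_until_carry(loaded))
```

For five words on a sixteen-bit counter the loaded value is 0xFFFB, and the counter reaches all ones during the fifth transfer, so the carry out marks the last word.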

\section*{Am2940 INSTRUCTIONS}

The Am2940 instruction set consists of eight instructions. Six instructions load and read the Address Counter, Word Counter and Control Register, one instruction enables the Address and Word Counters, and one instruction reinitializes the Address and Word Counters. The functions of the REINITIALIZE COUNTERS, LOAD WORD COUNT, and ENABLE COUNTERS instructions vary with the Control Mode being utilized. Table I defines the Am2940 Instructions as a function of Instruction inputs \(\mathrm{I}_{0}-\mathrm{I}_{2}\) and the four Am2940 Control Modes.

The WRITE CONTROL REGISTER instruction writes DATA inputs \(D_{0}-D_{2}\) into the Control Register; DATA inputs \(D_{3}-D_{7}\) are "don't care" inputs for this instruction. The READ CONTROL REGISTER instruction gates the Control Register outputs to DATA lines \(D_{0}-D_{2}\). DATA lines \(D_{3}-D_{7}\) are in the HIGH state during this instruction.

The Word Counter can be read using the READ WORD COUNTER instruction, which gates the Word Counter outputs to DATA lines \(D_{0}-D_{7}\). The LOAD WORD COUNT instruction is Control Mode dependent. In Control Modes 0, 2, and 3, DATA inputs \(D_{0}-D_{7}\) are written into both the Word Count Register and Word Counter. In Control Mode 1, DATA inputs \(D_{0}-D_{7}\) are written into the Word Count Register and the Word Counter is cleared.

The READ ADDRESS COUNTER instruction gates the Address Counter outputs to DATA lines \(\mathrm{D}_{0}-\mathrm{D}_{7}\), and the LOAD ADDRESS instruction writes DATA inputs \(D_{0}-D_{7}\) into both the Address Register and Address Counter.

In Control Modes 0, 1, and 3, the ENABLE COUNTERS instruction enables both the Address and Word Counters; in Control Mode 2, the Address Counter is enabled and the Word Counter holds its contents. When enabled and the carry input is active, the counters increment or decrement on the LOW to HIGH transition of the CLOCK input, CP. Thus, with this instruction applied, counting can be controlled by the carry inputs.

The REINITIALIZE COUNTERS instruction also is Control Mode dependent. In Control Modes 0, 2, and 3, the contents of the Address Register and Word Count Register are transferred to the respective Address Counter and Word Counter; in Control Mode 1, the content of the Address Register is transferred to the Address Counter and the Word Counter is cleared. The REINITIALIZE COUNTERS instruction allows a data transfer operation to be repeated without reloading the address and word count from the DATA lines.
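The instruction behavior described in this section can be collected into a small behavioral model of one eight-bit slice. This is a sketch under stated assumptions, not a timing-accurate or pin-accurate model: the class name `Am2940Model` is hypothetical, three-state buses and the DONE output are omitted, carry inputs are modeled as booleans, and the polarity of Control Register bit 2 (LOW for increment, HIGH for decrement) is taken from the Control Register format table.

```python
class Am2940Model:
    """Behavioral sketch of one Am2940 slice; octal codes follow Table I."""

    WRCR, RDCR, RDWC, RDAC, REIN, LDAD, LDWC, ENCT = range(8)

    def __init__(self):
        self.cr = 0    # Control Register (3 bits)
        self.ar = 0    # Address Register
        self.ac = 0    # Address Counter
        self.wcr = 0   # Word Count Register
        self.wc = 0    # Word Counter

    @property
    def mode(self):
        return self.cr & 0b011

    def clock(self, instruction, data=0, aci_low=True, wci_low=True):
        """Apply one instruction for one clock; return the value read, if any."""
        data &= 0xFF
        if instruction == self.WRCR:
            self.cr = data & 0b111            # D3-D7 are don't care
        elif instruction == self.RDCR:
            return self.cr | 0b11111000       # D3-D7 read HIGH
        elif instruction == self.RDWC:
            return self.wc
        elif instruction == self.RDAC:
            return self.ac
        elif instruction == self.REIN:
            self.ac = self.ar
            self.wc = 0 if self.mode == 1 else self.wcr
        elif instruction == self.LDAD:
            self.ar = self.ac = data
        elif instruction == self.LDWC:
            self.wcr = data
            self.wc = 0 if self.mode == 1 else data
        elif instruction == self.ENCT:
            if aci_low:                       # address carry input active
                step = -1 if self.cr & 0b100 else 1
                self.ac = (self.ac + step) & 0xFF
            if wci_low and self.mode != 2:    # Word Counter holds in Mode 2
                step = -1 if self.mode == 0 else 1
                self.wc = (self.wc + step) & 0xFF
        else:
            raise ValueError("instruction code must be 0-7")
        return None


dma = Am2940Model()
dma.clock(Am2940Model.WRCR, 0b000)    # Control Mode 0, address increments
dma.clock(Am2940Model.LDAD, 0x40)
dma.clock(Am2940Model.LDWC, 3)
dma.clock(Am2940Model.ENCT)
print(hex(dma.clock(Am2940Model.RDAC)), dma.clock(Am2940Model.RDWC))
dma.clock(Am2940Model.REIN)           # restore the saved address and count
```

Keeping the saved registers separate from the counters is what lets REINITIALIZE COUNTERS repeat a transfer without any new data on the DATA lines.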

\section*{Am2940 Timing}

Various computations must be performed by the designer to determine how fast the Am2940 can be operated reliably in a given design. The exercises of this section demonstrate how these computations are performed.

Worst case A.C. characteristics, over the full temperature and voltage operating range, should be used in these computations. Since, at the time of this writing, the Am2940 is still being characterized, only typical A.C. characteristics are available. These typicals are used here merely to demonstrate how the computations are performed; the designer must use worst-case characteristics. Figure 6 shows the characteristics of a Schottky register and a memory which are assumed for this exercise.

Figures 7A, B, and C show the typical cycle time calculations for the 16-bit Am2940 configuration. The typical delay along the longest path for any of the eight Am2940 instructions determines the typical cycle time. In each case, delays are computed from the LOW to HIGH transition of a clock through an entire microcycle to the next LOW to HIGH transition of a clock. The typical cycle time for a 16-bit Am2940 configuration is 64 ns.

TABLE I. Am2940 INSTRUCTIONS
\begin{tabular}{|c|c|c|c|l|c|c|c|c|c|c|c|c|}
\hline \(I_{2}\) & \(I_{1}\) & \(I_{0}\) & Octal Code & Function & Mnemonic & Control Mode & Word Count Reg. & Word Counter & Address Reg. & Address Counter & Control Register & Data \(D_{0}-D_{7}\) \\
\hline L & L & L & 0 & WRITE CONTROL REGISTER & WRCR & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & \(\mathrm{D}_{0}-\mathrm{D}_{2} \rightarrow \mathrm{CR}\) & INPUT \\
\hline L & L & H & 1 & READ CONTROL REGISTER & RDCR & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & HOLD & \(\mathrm{CR} \rightarrow \mathrm{D}_{0}-\mathrm{D}_{2}\) (Note 1) \\
\hline L & H & L & 2 & READ WORD COUNTER & RDWC & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & HOLD & \(\mathrm{WC} \rightarrow \mathrm{D}\) \\
\hline L & H & H & 3 & READ ADDRESS COUNTER & RDAC & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & HOLD & \(\mathrm{AC} \rightarrow \mathrm{D}\) \\
\hline H & L & L & 4 & REINITIALIZE COUNTERS & REIN & 0, 2, 3 & HOLD & \(\mathrm{WCR} \rightarrow \mathrm{WC}\) & HOLD & \(\mathrm{AR} \rightarrow \mathrm{AC}\) & HOLD & Z \\
\hline & & & & & & 1 & HOLD & ZERO \(\rightarrow \mathrm{WC}\) & HOLD & \(\mathrm{AR} \rightarrow \mathrm{AC}\) & HOLD & Z \\
\hline H & L & H & 5 & LOAD ADDRESS & LDAD & 0, 1, 2, 3 & HOLD & HOLD & \(\mathrm{D} \rightarrow \mathrm{AR}\) & \(\mathrm{D} \rightarrow \mathrm{AC}\) & HOLD & INPUT \\
\hline H & H & L & 6 & LOAD WORD COUNT & LDWC & 0, 2, 3 & \(\mathrm{D} \rightarrow \mathrm{WCR}\) & \(\mathrm{D} \rightarrow \mathrm{WC}\) & HOLD & HOLD & HOLD & INPUT \\
\hline & & & & & & 1 & \(\mathrm{D} \rightarrow \mathrm{WCR}\) & ZERO \(\rightarrow \mathrm{WC}\) & HOLD & HOLD & HOLD & INPUT \\
\hline H & H & H & 7 & ENABLE COUNTERS & ENCT & 0, 1, 3 & HOLD & ENABLE COUNT & HOLD & ENABLE COUNT & HOLD & Z \\
\hline & & & & & & 2 & HOLD & HOLD & HOLD & ENABLE COUNT & HOLD & Z \\
\hline
\end{tabular}
\begin{tabular}{ll}
CR = Control Reg. & WCR = Word Count Reg. \\
AR = Address Reg. & WC = Word Counter \\
AC = Address Counter & D = Data
\end{tabular}

L = LOW
\(\mathrm{H}=\mathrm{HIGH}\)
\(\mathrm{Z}=\) High Impedance

Note 1.
Data Bits \(D_{3}-D_{7}\) are high during this instruction.
\begin{tabular}{|l|c|c|c|}
\hline Parameter (ns) & Min. & Typ. & Max. \\
\hline Schottky Register & & & \\
Clock to Output Delay & & 9 & 15 \\
Input Set-Up Time & 5 & 2 & \\
\hline Memory & & & \\
Address Set-Up Time & 20 & 10 & \\
\hline
\end{tabular}

Figure 6. Assumed AC Characteristics.

Figure 8 shows the address output enable time computations. Since the Am2940 has an asynchronous address output enable control, the address output enable time may not be related to the Am2940 cycle time.

Figure 9 shows the typical cycle time calculation for an 8-bit Am2940 configuration. The path shown is the longest path and determines an 8-bit typical cycle time of 52 ns.

The typical cycle time calculation for a 24-bit Am2940 configuration is shown in Figure 10. The path shown is the longest path and determines a 24-bit typical cycle time of 76 ns.

Figure 11 is a summary of typical Am2940 cycle times for the 8, 16 and 24-bit configurations.

a) CALCULATIONS FOR READ CONTROL REG, READ ADDRESS COUNTER, READ WORD COUNTER INSTRUCTIONS

\begin{tabular}{|l|l|c|c|c|c|}
\hline DEVICE TYPE & DEVICE PATH & PATH 1 & PATH 2 & PATH 3 & PATH 4 \\
\hline Schottky Reg & CLK to Q & 9 & 9 & & \\
\hline 2940 & Inst Set-Up & 33 & & & \\
\hline 2940 & Inst. to Data & & 21 & & \\
\hline Schottky Reg & D Set-Up & & 2 & & \\
\hline 2940 & CLK to DONE & & & 50 & \\
\hline Schottky Reg & D Set-Up & & & 2 & \\
\hline 2940 & CLK to WCO & & & & 35 \\
\hline 2940 & WCI to DONE & & & & 27 \\
\hline Schottky Reg & D Set-Up & & & & 2 \\
\hline TOTAL-ns & & 42 & 32 & 52 & 64 \\
\hline
\end{tabular}



Figure 7. 16-Bit Typical Cycle Time Computations.
b) CALCULATIONS FOR WRITE CONTROL REG, LOAD WORD COUNT, LOAD ADDRESS INSTRUCTIONS

\begin{tabular}{|l|l|c|c|c|}
\hline DEVICE TYPE & DEVICE PATH & PATH 1 & PATH 2 & PATH 3 \\
\hline Schottky Reg & CLK to Q & 9 & 9 & \\
\hline 2940 & Inst Set-Up & 33 & & \\
\hline 2940 & Data Set-Up & & 13 & \\
\hline 2940 & CLK to WCO & & & 35 \\
\hline 2940 & WCI to DONE & & & 27 \\
\hline Schottky Reg & D Set-Up & & & 2 \\
\hline TOTAL-ns & & 42 & 22 & 64 \\
\hline
\end{tabular}

c) CALCULATIONS FOR REINITIALIZE COUNTERS, ENABLE COUNTERS INSTRUCTIONS

\begin{tabular}{|l|l|c|c|}
\hline DEVICE TYPE & DEVICE PATH & PATH 1 & PATH 2 \\
\hline Schottky Reg & CLK to Q & 9 & \\
\hline 2940 & Inst Set-Up & 33 & \\
\hline 2940 & CLK to WCO & & 35 \\
\hline 2940 & WCI to DONE & & 27 \\
\hline Schottky Reg & D Set-Up & & 2 \\
\hline TOTAL-ns & & 42 & 64 \\
\hline
\end{tabular}

Figure 7. 16-Bit Typical Cycle Time Computations. (Cont.)


Figure 8. Speed Computations.

\begin{tabular}{|l|l|c|}
\hline DEVICE TYPE & DEVICE PATH & PATH 1 \\
\hline 2940 & CLK to DONE & 50 \\
\hline Schottky Reg. & D Set-Up & 2 \\
\hline TOTAL-ns & & 52 \\
\hline
\end{tabular}

Figure 9. 8-Bit Typical Cycle Time Computation.

\begin{tabular}{|l|l|c|}
\hline DEVICE TYPE & DEVICE PATH & PATH 1 \\
\hline 2940 & CLK to WCO & 35 \\
\hline 2940 & WCI to WCO & 12 \\
\hline 2940 & WCI to DONE & 27 \\
\hline Schottky Reg & D Set-Up & 2 \\
\hline TOTAL-ns & & 76 \\
\hline
\end{tabular}

Figure 10. 24-Bit Typical Cycle Time Computation.


Figure 11. Summary of Am2940 Cycle Times.
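The cycle time arithmetic of Figures 7, 9 and 10 amounts to summing the delays along each path and taking the longest. The sketch below repeats that computation with the typical values tabulated above; the dictionary layout and the function name `cycle_time` are illustrative choices, and worst-case values should replace the typicals in a real design.

```python
# Typical delays in ns, copied from the path tables of Figures 7, 9 and 10.
PATHS_16_BIT = {
    "CLK to Q + Inst Set-Up": [9, 33],
    "CLK to Q + Inst. to Data + D Set-Up": [9, 21, 2],
    "CLK to Q + Data Set-Up": [9, 13],
    "CLK to DONE + D Set-Up": [50, 2],
    "CLK to WCO + WCI to DONE + D Set-Up": [35, 27, 2],
}
PATHS_8_BIT = {
    "CLK to DONE + D Set-Up": [50, 2],
}
PATHS_24_BIT = {
    "CLK to WCO + WCI to WCO + WCI to DONE + D Set-Up": [35, 12, 27, 2],
}


def cycle_time(paths):
    """Return the delay of the longest path, which sets the cycle time."""
    return max(sum(delays) for delays in paths.values())


for name, paths in (("8-bit", PATHS_8_BIT),
                    ("16-bit", PATHS_16_BIT),
                    ("24-bit", PATHS_24_BIT)):
    print(name, cycle_time(paths), "ns")
```

The results reproduce the 52 ns, 64 ns and 76 ns typical cycle times quoted in the text.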

\section*{AN EXAMPLE DESIGN}

The Am2940 is designed for use in high speed peripheral Controllers using DMA and provides the address and word count maintenance circuitry that is common to most. As indicated previously, DMA Control can be placed in each I/O Controller (Distributed DMA) or DMA Control for a number of I/O devices can be centralized in a separate unit.

Figure 12 shows a block diagram of a microprogrammed I/O Controller which is designed for use in a Distributed DMA system. The Am2910 Microprogram Sequencer, Microprogram Memory and the Microinstruction Register form the microprogram control portion of this I/O Controller. The Am2940 maintains the memory address and word count required for DMA operation. An internal three-state bus provides the communication path between the Microinstruction Register, the Am2917 Data Transceivers, the Am2940, the Am2901A Microprocessor, and the Device Interface Circuitry. The Address Line Control, Data Transfer Control and Mode Control functions of this DMA Controller are incorporated into the I/O Controller Microprogram and the Asynchronous Interface Control Circuitry. The I/O Controller Microprogram also controls the Am2940.

The Am2940 interconnections are shown in detail in Figure 13. Two Am2940s are cascaded to generate a sixteen-bit address. The Am2940 ADDRESS and DATA output current sink capability is 24 mA over the commercial operating range. This allows the Am2940s to drive the System Address Bus and Internal Three-State Bus directly, thereby eliminating the need for separate bus drivers. Three bits in the Microinstruction Register provide the Am2940 Instruction Inputs, \(\mathrm{I}_{0}-\mathrm{I}_{2}\). The microprogram clock is used to clock the Am2940s and, when the ENABLE COUNTERS instruction is applied, address and word counting is controlled by the CNT bit of the Microinstruction Register.

Asynchronous interface control circuitry generates System Bus control signals and enables the Am2940 Address onto the System Address Bus at the appropriate time. The open-collector DONE outputs are dot-anded and used as a test input to the Am2910 Microprogram Sequencer.

The I/O controller read operation is flowcharted in Figure 14. The CPU initializes the I/O controller by sending a read command, the starting memory address, the word count and any other parameters required to perform the operation. The I/O Controller then obtains a word of data from the I/O device and requests use of the system bus for a DMA transfer. When the bus is granted, the I/O Controller requests a memory data transfer. Upon receipt of the memory acknowledge signal, which indicates the memory transfer is complete, the I/O Controller tests the word count. If the word count is not equal to zero, the word counter is decremented, the address counter is incremented and another data word is transferred. When the word count reaches zero, the I/O Controller terminates the data transfer and informs the CPU that the transfer has been completed.

Figure 12. DMA Peripheral Controller Block Diagram.

Figure 13. Am2940 Interconnections.

Figure 14. Read Control Flowchart.
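The read operation can be sketched as a loop in Python. This is an interpretation rather than a transcription of the flowchart: the bus request, bus grant and memory acknowledge handshakes are collapsed into a single memory write, and the end-of-transfer test is assumed to use the Control Mode 0 DONE condition with counting enabled (Word Counter equal to one during the last word). The function name `dma_read` is hypothetical.

```python
def dma_read(device_words, memory, start_address, word_count):
    """Move `word_count` words from an I/O device into `memory` (a dict).

    Returns the address of the last word written.
    """
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    address = start_address               # role of the Address Counter
    counter = word_count                  # role of the Word Counter
    source = iter(device_words)
    while True:
        word = next(source)               # obtain a word from the I/O device
        memory[address] = word            # bus request/grant/acknowledge collapsed
        if counter == 1:                  # assumed DONE test: last word transferred
            break
        counter -= 1                      # decrement the word counter
        address += 1                      # increment the address counter
    return address


memory = {}
last = dma_read([10, 20, 30, 40], memory, start_address=0x200, word_count=3)
print(hex(last), memory)
```

Only the first three device words are moved, and the controller would then report completion to the CPU.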

\section*{THE Am2942 PROGRAMMABLE TIMER/COUNTER, DMA ADDRESS GENERATOR}

\section*{GENERAL DESCRIPTION}

The Am2942, a 22-pin version of the Am2940, can be used as a high-speed DMA address Generator or Programmable Timer/Counter. It provides multiplexed Address and Data lines, for use with a common bus, and additional Instruction Input and Instruction Enable pins. The Am2942 executes 16 instructions; eight are the same as the Am2940 instructions, and eight instructions facilitate the use of the Am2942 as a Programmable Timer/Counter. The Instruction Enable input allows the sharing of the Am2942 instruction field with other devices.

When used as a Timer/Counter, the Am2942 provides two independent, programmable, eight-bit, up-down counters in a 22-pin package. The two on-chip counters can be cascaded to form a single-chip, 16-bit counter. Also, any number of chips can be cascaded; for example, three cascaded Am2942s form a 48-bit timer/counter.

Reinitialization instructions provide the capability to reinitialize the counters from on-chip registers. Am2942 Programmable Control Modes, identical to those of the Am2940, offer four different types of programmable control.

\section*{Am2942 ARCHITECTURE}

As shown in the Block Diagram of Figure 15, the Am2942 consists of the following:
- A three-bit Control Register.
- An eight-bit Address Counter with input multiplexer.
- An eight-bit Address Register.
- An eight-bit Word Counter with input multiplexer.
- An eight-bit Word Count Register.
- Transfer complete circuitry.
- An eight-bit wide data multiplexer with three-state output buffers.
- An instruction decoder.


Figure 15. Am2942 Block Diagram.

\section*{Control Register}

Under instruction control, the Control Register can be loaded or read from the bidirectional DATA lines, \(D_{0}-D_{7}\). Control Register bits 0 and 1 determine the Am2942 Control Mode, and bit 2 determines whether the Address Counter increments or decrements. Figure 16 defines the Control Register format.

\section*{Address Counter}

The Address Counter, which provides the current memory address, is an eight-bit, binary, up/down counter with full look-ahead carry generation. The Address Carry Input ( \(\overline{\mathrm{ACI}}\) ) and Address Carry Output ( \(\overline{\mathrm{ACO}}\) ) allow cascading to accommodate larger addresses. Under instruction control, the Address Counter can be enabled, disabled, and loaded from the DATA inputs, \(D_{0}-D_{7}\), or the Address Register. When enabled and the \(\overline{\mathrm{ACI}}\) input is LOW, the Address Counter increments/decrements on the LOW to HIGH transition of the CLOCK input, CP.

\section*{Address Register}

The eight-bit Address Register saves the initial address so that it can be restored later in order to repeat a transfer operation. When the LOAD ADDRESS instruction is executed, the Address Register and Address Counter are simultaneously loaded from the DATA inputs, \(D_{0}-D_{7}\).

Control Register bits: \(\mathrm{CR}_{2}\), \(\mathrm{CR}_{1}\), \(\mathrm{CR}_{0}\)

\begin{tabular}{|c|c|c|l|c|l|l|}
\hline \(\mathrm{CR}_{1}\) & \(\mathrm{CR}_{0}\) & Control Mode Number & Control Mode Type & Word Counter & DONE Output Signal, \(\overline{\mathrm{WCI}}=\) LOW & DONE Output Signal, \(\overline{\mathrm{WCI}}=\) HIGH \\
\hline L & L & 0 & Word Count Equals Zero & Decrement & HIGH when Word Counter \(=1\) & HIGH when Word Counter \(=0\) \\
\hline L & H & 1 & Word Count Compare & Increment & HIGH when Word Counter \(+1=\) Word Count Register & HIGH when Word Counter \(=\) Word Count Register \\
\hline H & L & 2 & Address Compare & Decrement & \multicolumn{2}{|l|}{HIGH when Word Counter \(=\) Address Counter} \\
\hline H & H & 3 & Word Counter Carry Out & Increment & \multicolumn{2}{|c|}{Always LOW} \\
\hline
\end{tabular}

\begin{tabular}{|c|l|}
\hline \(\mathrm{CR}_{2}\) & Address Counter \\
\hline L & Increment \\
\hline H & Decrement \\
\hline
\end{tabular}

H = HIGH, L = LOW

Figure 16. Control Register Format Definition.
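The Control Register format of Figure 16 can be decoded mechanically. The following Python sketch does so; the function name `decode_control_register` and the returned tuple layout are illustrative choices, not part of the device definition.

```python
MODE_TYPES = {
    0: "Word Count Equals Zero",
    1: "Word Count Compare",
    2: "Address Compare",
    3: "Word Counter Carry Out",
}


def decode_control_register(value):
    """Split a Control Register value into its mode and address direction.

    Only bits 0-2 are significant; higher bits of `value` are ignored.
    Bits 1 and 0 select the Control Mode; bit 2 LOW means the Address
    Counter increments and bit 2 HIGH means it decrements.
    """
    cr = value & 0b111
    mode = cr & 0b011
    direction = "decrement" if cr & 0b100 else "increment"
    return mode, MODE_TYPES[mode], direction


print(decode_control_register(0b101))
```

Masking with `0b111` mirrors the fact that only three data bits are written into the Control Register.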

\section*{Word Counter And Word Count Register}

The Word Counter and Word Count Register, which maintain and save a word count, are similar in structure and operation to the Address Counter and Address Register, with the exception that the Word Counter increments in Control Modes 1 and 3 and decrements in Control Modes 0 and 2. The LOAD WORD COUNT instruction simultaneously loads the Word Counter and Word Count Register.

\section*{Transfer Complete Circuitry}

The Transfer Complete Circuitry is a combinational logic network which detects the completion of the data transfer operation in three Control Modes and generates the DONE output signal. The DONE signal is an open-collector output, which can be dot-anded between chips.

\section*{Data Multiplexer}

The Data Multiplexer is an eight-bit wide, three-input multiplexer which allows the Address Counter, Word Counter and Control Register to be read at DATA lines \(D_{0}-D_{7}\). The Data Multiplexer output, \(Y_{0}-Y_{7}\), is enabled onto DATA lines \(D_{0}-D_{7}\) if, and only if, the Output Enable input, \(\overline{\mathrm{OE}}_{\mathrm{D}}\), is LOW. (Refer to Figure 17.)

\begin{tabular}{|c|c|}
\hline\(\overline{\mathrm{OE}}_{\mathrm{D}}\) & \multicolumn{1}{|c|}{\(\mathrm{D}_{0}-\mathrm{D}_{7}\)} \\
\hline L & DATA MULTIPLEXER OUTPUT, \(\mathrm{Y}_{0}-\mathrm{Y}_{7}\) \\
H & HIGH Z \\
\hline
\end{tabular}

Figure 17. Data Bus Output Enable Function.

\section*{Instruction Decoder}

The Instruction Decoder generates required internal control signals as a function of the INSTRUCTION inputs, \(\mathrm{I}_{0}-\mathrm{I}_{3}\), Control Register bits 0 and 1, and the INSTRUCTION ENABLE input, \(\overline{\mathrm{I}}_{\mathrm{E}}\).

\section*{Clock}

The clock input, CP, is used to clock the Address Register, Address Counter, Word Count Register, Word Counter, and Control Register, all on the LOW to HIGH transition of the CP signal.

\section*{Am2942 CONTROL MODES}

\section*{Control Mode 0 - Word Count Equals Zero Mode}

In this mode, the LOAD WORD COUNT instruction loads the word count into the Word Count Register and Word Counter. When the Word Counter is enabled and the Word Counter Carry-in, \(\overline{\mathrm{WCI}}\), is LOW, the Word Counter decrements on the LOW to HIGH transition of the CLOCK input, CP. Figure 16 specifies when the DONE signal is generated in this mode.

\section*{Control Mode 1 - Word Count Compare Mode}

In this mode the LOAD WORD COUNT instruction loads the word count into the Word Count Register and clears the Word Counter. When the Word Counter is enabled and the Word Counter Carry-in, \(\overline{\mathrm{WCI}}\), is LOW, the Word Counter increments on the LOW to HIGH transition of the clock input, CP. Figure 16 specifies when the DONE signal is generated.

\section*{Control Mode 2 - Address Compare Mode}

In this mode, only an initial and final memory address need to be specified. The initial memory address is loaded into the Address Register and Address Counter and the final memory address is loaded into the Word Count Register and Word Counter. The Word Counter serves as a holding register for the final memory address. When the Address Counter is enabled and the \(\overline{\mathrm{ACI}}\) input is LOW, the Address Counter increments or decrements (depending on Control Register bit 2) on the LOW to HIGH transition of the CLOCK input, CP. The Transfer Complete Circuitry compares the Address Counter with the Word Counter and generates the DONE signal during the last word transfer, i.e., when the Address Counter equals the Word Counter.

\section*{Control Mode 3 - Word Counter Carry Out Mode}

For this mode of operation, the user can load the Word Count Register and Word Counter with the two's complement of the number of data words to be transferred. When the Word Counter is enabled and the \(\overline{\mathrm{WCI}}\) input is LOW, the Word Counter increments on the LOW to HIGH transition of the CLOCK input, CP. A Word Counter Carry Out signal, \(\overline{\mathrm{WCO}}\), indicates the last data word is being transferred. The DONE signal is not required in this mode and, therefore, is always LOW.
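The DONE conditions of the four Control Modes, as tabulated in Figure 16, can be written as a single combinational function. The sketch below is a direct transcription of that table; the function name `done_output` and its argument names are hypothetical, the carry input is modeled as a boolean, and counter wraparound is not modeled.

```python
def done_output(mode, word_counter, word_count_register,
                address_counter, wci_low):
    """Return True when DONE is HIGH, following the Figure 16 table.

    `wci_low` is True when the Word Counter carry input is LOW (active).
    """
    if mode == 0:       # Word Count Equals Zero
        return word_counter == (1 if wci_low else 0)
    if mode == 1:       # Word Count Compare
        compare = word_counter + 1 if wci_low else word_counter
        return compare == word_count_register
    if mode == 2:       # Address Compare
        return word_counter == address_counter
    if mode == 3:       # Word Counter Carry Out: DONE is always LOW
        return False
    raise ValueError("mode must be 0-3")


print(done_output(0, 1, 0, 0, wci_low=True))
print(done_output(1, 4, 5, 0, wci_low=True))
```

In Modes 0 and 1 the condition is satisfied one count early when the carry input is active, so DONE is HIGH during the last word transfer rather than after it.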

\section*{Am2942 INSTRUCTIONS}

The Am2942 instruction set consists of sixteen instructions. Eight are DMA instructions and are the same as the Am2940 instructions. The remaining eight instructions are designed to facilitate the use of the Am2942 as a Programmable Timer/Counter. Figures 18 and 19 define the Am2942 Instructions.

Instructions 0-7 are DMA instructions. The WRITE CONTROL REGISTER instruction writes DATA inputs \(D_{0}-D_{2}\) into the Control Register; DATA inputs \(D_{3}-D_{7}\) are "don't care" inputs for this instruction. The READ CONTROL REGISTER instruction gates the Control Register to Data Multiplexer outputs \(\mathrm{Y}_{0}-\mathrm{Y}_{2}\). Outputs \(Y_{3}-Y_{7}\) are HIGH during this instruction.

The Word Counter can be read using the READ WORD COUNTER instruction, which gates the Word Counter to Data Multiplexer outputs, \(\mathrm{Y}_{0}-\mathrm{Y}_{7}\). The LOAD WORD COUNT instruction is Control Mode dependent. In Control Modes 0, 2 and 3, DATA inputs \(D_{0}-D_{7}\) are written into both the Word Count Register and Word Counter. In Control Mode 1, DATA inputs \(D_{0}-D_{7}\) are written into the Word Count Register and the Word Counter is cleared.

The READ ADDRESS COUNTER instruction gates the Address Counter to Data Multiplexer outputs, \(Y_{0}-Y_{7}\), and the LOAD ADDRESS instruction writes DATA inputs \(D_{0}-D_{7}\) into both the Address Register and Address Counter.

In Control Modes 0, 1, and 3, the ENABLE COUNTERS instruction enables both the Address and Word Counters; in Control Mode 2, the Address Counter is enabled and the Word Counter holds its contents. When enabled and the carry input is active, the counters increment or decrement on the LOW to HIGH transition of the CLOCK input, CP. Thus, with this instruction applied, counting can be controlled by the carry inputs.

The REINITIALIZE COUNTERS instruction also is Control Mode dependent. In Control Modes 0, 2, and 3, the contents of the Address Register and Word Count Register are transferred to the respective Address Counter and Word Counter; in Control Mode 1, the content of the Address Register is transferred to the Address Counter and the Word Counter is cleared. The REINITIALIZE COUNTERS instruction allows a data transfer operation to be repeated without reloading the address and word count from the DATA lines.

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \(\overline{I_{E}}\) & \(\mathrm{I}_{3}\) & \(\mathrm{I}_{2}\) & \(\mathrm{I}_{1}\) & \(\mathrm{I}_{0}\) & HEX CODE & INSTRUCTION & \\
\hline 0 & 0 & 0 & 0 & 0 & 0 & WRITE CONTROL REGISTER & \multirow{9}{*}{} \\
\hline 0 & 0 & 0 & 0 & 1 & 1 & READ CONTROL REGISTER & \\
\hline 0 & 0 & 0 & 1 & 0 & 2 & READ WORD COUNTER & \\
\hline 0 & 0 & 0 & 1 & 1 & 3 & READ ADDRESS COUNTER & \\
\hline 0 & 0 & 1 & 0 & 0 & 4 & REINITIALIZE COUNTERS & \\
\hline 0 & 0 & 1 & 0 & 1 & 5 & LOAD ADDRESS & \\
\hline 0 & 0 & 1 & 1 & 0 & 6 & LOAD WORD COUNT & \\
\hline 0 & 0 & 1 & 1 & 1 & 7 & ENABLE COUNTERS & \\
\hline 1 & 0 & x & x & x & 0-7 & INSTRUCTION DISABLE & \\
\hline 0 & 1 & 0 & 0 & 0 & 8 & WRITE CONTROL REGISTER, T/C & \multirow{9}{*}{} \\
\hline 0 & 1 & 0 & 0 & 1 & 9 & REINITIALIZE ADDRESS COUNTER & \\
\hline 0 & 1 & 0 & 1 & 0 & A & READ WORD COUNTER, T/C & \\
\hline 0 & 1 & 0 & 1 & 1 & B & READ ADDRESS COUNTER, T/C & \\
\hline 0 & 1 & 1 & 0 & 0 & C & REINITIALIZE ADDRESS \& WORD COUNTERS & \\
\hline 0 & 1 & 1 & 0 & 1 & D & LOAD ADDRESS, T/C & \\
\hline 0 & 1 & 1 & 1 & 0 & E & LOAD WORD COUNT, T/C & \\
\hline 0 & 1 & 1 & 1 & 1 & F & REINITIALIZE WORD COUNTER & \\
\hline 1 & 1 & X & X & X & 8-F & INSTRUCTION DISABLE, T/C & \\
\hline
\end{tabular}

0 = LOW \(\quad 1=\) HIGH \(\quad X=\) DON'T CARE

Notes: 1. When \(I_{3}\) is tied LOW, the Am2942 acts as a DMA circuit. When \(I_{3}\) is tied HIGH, the Am2942 acts as a Timer/Counter circuit.
2. Am2942 instructions 0 through 7 are the same as Am2940 instructions.

Figure 18. Am2942 Instructions
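The instruction encoding of Figure 18 can be expressed as a small lookup. The Python sketch below is illustrative only; the function name `decode` and its argument names are hypothetical, and the instruction names are copied from the figure.

```python
DMA_INSTRUCTIONS = [
    "WRITE CONTROL REGISTER",
    "READ CONTROL REGISTER",
    "READ WORD COUNTER",
    "READ ADDRESS COUNTER",
    "REINITIALIZE COUNTERS",
    "LOAD ADDRESS",
    "LOAD WORD COUNT",
    "ENABLE COUNTERS",
]

TC_INSTRUCTIONS = [
    "WRITE CONTROL REGISTER, T/C",
    "REINITIALIZE ADDRESS COUNTER",
    "READ WORD COUNTER, T/C",
    "READ ADDRESS COUNTER, T/C",
    "REINITIALIZE ADDRESS & WORD COUNTERS",
    "LOAD ADDRESS, T/C",
    "LOAD WORD COUNT, T/C",
    "REINITIALIZE WORD COUNTER",
]


def decode(ie_n: int, i3: int, i2: int, i1: int, i0: int) -> str:
    """Map the instruction-enable and instruction inputs to a name.

    ie_n is the active-LOW instruction enable: 1 (HIGH) disables I0-I2.
    """
    if ie_n:
        return "INSTRUCTION DISABLE, T/C" if i3 else "INSTRUCTION DISABLE"
    code = (i2 << 2) | (i1 << 1) | i0
    return (TC_INSTRUCTIONS if i3 else DMA_INSTRUCTIONS)[code]


if __name__ == "__main__":
    print(decode(0, 0, 1, 1, 1))
    print(decode(0, 1, 0, 0, 1))
```

Note that \(I_{3}\) alone selects between the DMA group and the Timer/Counter group, which is why it can simply be tied LOW or HIGH in a fixed-function design.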

When \(\bar{I}_{E}\) is HIGH, Instruction inputs, \(I_{0}-I_{2}\), are disabled. If \(I_{3}\) is LOW, the function performed is identical to that of the ENABLE COUNTERS instruction. Thus, counting can be controlled by the carry inputs with the ENABLE COUNTERS instruction applied or with Instruction Inputs \(\mathrm{I}_{0}-\mathrm{I}_{2}\) disabled.
Instructions 8-F facilitate the use of the Am2942 as a Programmable Timer/Counter. They differ from instructions 0-7 in that they provide independent control of the Address Counter, Word Counter and Control Register.
The WRITE CONTROL REGISTER, T/C instruction writes DATA input \(D_{0}-D_{2}\) into the Control Register. DATA inputs \(D_{3}-D_{7}\) are "don't care" inputs for this instruction. The Address and Word Counters are enabled, and the Control Register contents appear at the Data Multiplexer output.
The REINITIALIZE ADDRESS COUNTER instruction allows the independent reinitialization of the Address Counter. The Word Counter is enabled and the contents of the Address Counter appear at the Data Multiplexer output.
The Word Counter can be read, using the READ WORD COUNTER, T/C instruction. Both counters are enabled when this instruction is executed.
When the READ ADDRESS COUNTER, T/C instruction is executed, both counters are enabled and the address counter contents appear at the Data Multiplexer output.
The REINITIALIZE ADDRESS and WORD COUNTERS instruction provides the capability to reinitialize both counters at the same time. The Address Counter contents appear at the Data Multiplexer output.

DATA inputs \(D_{0}-D_{7}\) are loaded into both the Address Register and Counter when the LOAD ADDRESS, T/C instruction is executed. The Word Counter is enabled and its contents appear at the Data Multiplexer output.
The LOAD WORD COUNT, T/C instruction is identical to the LOAD WORD COUNT instruction with the exception that Address Counter is enabled.
The Word Counter can be independently reinitialized using the REINITIALIZE WORD COUNTER instruction. The Address Counter is enabled and the Word Counter contents appear at the Data Multiplexer output.
When the \(\bar{I}_{E}\) input is HIGH , Instruction inputs, \(\mathrm{I}_{0}-\mathrm{I}_{2}\), are disabled. The function performed when \(\mathrm{I}_{3}\) is HIGH is identical to that performed when \(\mathrm{I}_{3}\) is LOW, with the exception that the Word Counter contents appear at the Data Multiplexer output.

\section*{EXAMPLE DESIGNS}

Figure 20 shows an Am2942 used as two independent, programmable eight-bit timer/counters. In this example, an Am2910 Microprogram Sequencer provides an address to Am29775 \(512 \times 8\) Registered PROMs. The on-chip PROM output register is used as the Microinstruction Register.
The Am2942 Instruction input, \(\mathrm{I}_{3}\), is tied HIGH to select the eight Timer/Counter instructions. The \(\bar{I}_{E}, I_{0}-I_{2}\), and \(\overline{\mathrm{OE}}_{\mathrm{D}}\) inputs are provided by the microinstruction, and the \(D_{0}-D_{7}\) data lines are connected to a common Data Bus. GATE WC and GATE AC are separate enable controls for the respective Word Counter and Address Counter. The DONE, \(\overline{\mathrm{ACO}}\) and \(\overline{\mathrm{WCO}}\) output signals indicate that a pre-programmed time or count has been reached.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline \(\overline{I_{E}}\) & \[
\begin{gathered}
I_{3} I_{2} I_{1} I_{0} \\
(\mathrm{Hex})
\end{gathered}
\] & Function & Mnemonic & Control Mode & Word Reg. & Word Counter & \begin{tabular}{l}
Adr. \\
Reg.
\end{tabular} & Adr. Counter & Control Reg. & Data Multiplexer Output \\
\hline L & 0 & WRITE CONTROL REGISTER & WRCR & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & \(\mathrm{D}_{0-2} \rightarrow \mathrm{CR}\) & \[
\begin{gathered}
\text { FORCED } \\
\text { HIGH }
\end{gathered}
\] \\
\hline L & 1 & READ CONTROL REGISTER & RDCR & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & HOLD & CONTROL REG \\
\hline L & 2 & READ WORD COUNTER & RDWC & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & HOLD & WORD COUNTER \\
\hline L & 3 & READ ADDRESS COUNTER & RDAC & 0, 1, 2, 3 & HOLD & HOLD & HOLD & HOLD & HOLD & ADR COUNTER \\
\hline \multirow[b]{2}{*}{L} & \multirow{2}{*}{4} & \multirow[t]{2}{*}{REINITIALIZE COUNTERS} & \multirow[b]{2}{*}{REIN} & 0, 2, 3 & HOLD & WR \(\rightarrow\) WC & HOLD & \(A R \rightarrow A C\) & HOLD & ADR CNTR \\
\hline & & & & 1 & HOLD & ZERO \(\rightarrow\) WC & HOLD & \(A R \rightarrow A C\) & HOLD & ADR CNTR \\
\hline L & 5 & LOAD ADDRESS & LDAD & 0, 1, 2, 3 & HOLD & HOLD & \(D \rightarrow A R\) & \(D \rightarrow A C\) & HOLD & WORD COUNTER \\
\hline \multirow[t]{2}{*}{L} & \multirow[t]{2}{*}{6} & \multirow[t]{2}{*}{LOAD WORD COUNT} & \multirow[b]{2}{*}{LDWC} & 0, 2, 3 & \(D \rightarrow W R\) & \(\mathrm{D} \rightarrow \mathrm{WC}\) & HOLD & HOLD & HOLD & FORCED HIGH \\
\hline & & & & 1 & \(D \rightarrow W R\) & ZERO \(\rightarrow\) WC & HOLD & HOLD & HOLD & FORCED HIGH \\
\hline \multirow[t]{2}{*}{L} & \multirow[t]{2}{*}{7} & \multirow[t]{2}{*}{ENABLE COUNTERS} & \multirow[t]{2}{*}{ENCT} & 0, 1, 3 & HOLD & ENABLE & HOLD & ENABLE & HOLD & ADR CNTR \\
\hline & & & & 2 & HOLD & HOLD & HOLD & ENABLE & HOLD & ADR CNTR \\
\hline \multirow[t]{2}{*}{H} & \multirow[t]{2}{*}{0-7} & \multirow[t]{2}{*}{INSTRUCTION DISABLE} & \multirow[t]{2}{*}{-} & 0, 1, 3 & HOLD & ENABLE & HOLD & ENABLE & HOLD & ADR CNTR \\
\hline & & & & 2 & HOLD & HOLD & HOLD & ENABLE & HOLD & ADR CNTR \\
\hline L & 8 & WRITE CONTROL REGISTER, T/C & WCRT & 0, 1, 2, 3 & HOLD & ENABLE & HOLD & ENABLE & \(\mathrm{D}_{0-2} \rightarrow \mathrm{CR}\) & CONTROL REG \\
\hline L & 9 & REINITIALIZE ADR COUNTER & REAC & 0, 1, 2, 3 & HOLD & ENABLE & HOLD & \(A R \rightarrow A C\) & HOLD & ADR COUNTER \\
\hline L & A & READ WORD COUNTER, T/C & RWCT & 0, 1, 2, 3 & HOLD & ENABLE & HOLD & ENABLE & HOLD & WORD COUNTER \\
\hline L & B & READ ADDRESS COUNTER, T/C & RACT & 0, 1, 2, 3 & HOLD & ENABLE & HOLD & ENABLE & HOLD & ADR COUNTER \\
\hline \multirow[t]{2}{*}{L} & \multirow[t]{2}{*}{C} & \multirow[t]{2}{*}{REINITIALIZE ADDRESS AND WORD COUNTERS} & \multirow[t]{2}{*}{RAWC} & 0, 2, 3 & HOLD & WR \(\rightarrow\) WC & HOLD & \(A R \rightarrow A C\) & HOLD & ADR CNTR \\
\hline & & & & 1 & HOLD & ZERO \(\rightarrow\) WC & HOLD & \(A R \rightarrow A C\) & HOLD & ADR CNTR \\
\hline L & D & LOAD ADDRESS, T/C & LDAT & 0, 1, 2, 3 & HOLD & ENABLE & \(D \rightarrow A R\) & \(D \rightarrow A C\) & HOLD & WORD COUNTER \\
\hline \multirow[t]{2}{*}{L} & \multirow[t]{2}{*}{E} & \multirow[t]{2}{*}{LOAD WORD COUNT, T/C} & \multirow[t]{2}{*}{LWCT} & 0, 2, 3 & \(D \rightarrow W R\) & \(\mathrm{D} \rightarrow \mathrm{WC}\) & HOLD & ENABLE & HOLD & FORCED HIGH \\
\hline & & & & 1 & \(\mathrm{D} \rightarrow \mathrm{WR}\) & ZERO \(\rightarrow\) WC & HOLD & ENABLE & HOLD & FORCED HIGH \\
\hline \multirow[b]{2}{*}{L} & \multirow[b]{2}{*}{F} & \multirow[t]{2}{*}{REINITIALIZE WORD COUNTER} & \multirow[b]{2}{*}{REWC} & 0, 2, 3 & HOLD & WR \(\rightarrow\) WC & HOLD & ENABLE & HOLD & WD CNTR \\
\hline & & & & 1 & HOLD & ZERO \(\rightarrow\) WC & HOLD & ENABLE & HOLD & WD CNTR \\
\hline \multirow[b]{2}{*}{H} & \multirow[b]{2}{*}{8-F} & \multirow[t]{2}{*}{INSTRUCTION DISABLE, T/C} & \multirow[t]{2}{*}{-} & 0, 1, 3 & HOLD & ENABLE & HOLD & ENABLE & HOLD & WD CNTR \\
\hline & & & & 2 & HOLD & HOLD & HOLD & ENABLE & HOLD & WD CNTR \\
\hline
\end{tabular}

WR = WORD REGISTER \(\quad A C=\) ADDRESS COUNTER
WC = WORD COUNTER \(\quad\) CR = CONTROL REGISTER
\(A R=\) ADDRESS REGISTER \(\quad D=\) DATA
Figure 19. Am2942 Function Table.
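The Control Mode dependence of the LOAD WORD COUNT and REINITIALIZE COUNTERS rows in the function table can be sketched in Python as follows. This is a behavioral illustration only, not a model of the device timing; the class `Am2942State` and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Am2942State:
    control_mode: int = 0   # 0 to 3, held in the Control Register
    word_reg: int = 0       # WR
    word_ctr: int = 0       # WC
    adr_reg: int = 0        # AR
    adr_ctr: int = 0        # AC


def load_address(s: Am2942State, d: int) -> None:
    """LOAD ADDRESS: D goes to both the Address Register and Counter."""
    s.adr_reg = s.adr_ctr = d & 0xFF


def load_word_count(s: Am2942State, d: int) -> None:
    """LOAD WORD COUNT: in Control Mode 1 the Word Counter is cleared."""
    s.word_reg = d & 0xFF
    s.word_ctr = 0 if s.control_mode == 1 else d & 0xFF


def reinitialize_counters(s: Am2942State) -> None:
    """REINITIALIZE COUNTERS: restore counters from their registers."""
    s.adr_ctr = s.adr_reg
    s.word_ctr = 0 if s.control_mode == 1 else s.word_reg


if __name__ == "__main__":
    s = Am2942State(control_mode=1)
    load_address(s, 0x40)
    load_word_count(s, 0x10)
    print(s)
```

In Control Mode 1 the Word Counter starts from zero and the Word Count Register keeps the loaded value, whereas in the other modes the counter itself starts from the loaded value.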


Figure 20. Two 8-Bit Programmable Counters/Timers in a 22-Pin Package.

Figure 21 shows an Am2942 used as a single 16-bit, programmable timer/counter. In this example, the Word Counter carry-out, \(\overline{\mathrm{WCO}}\), is connected to the Address Counter carry-in, \(\overline{\mathrm{ACI}}\), to form a single 16-bit counter which is enabled by the GATE signal.
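The carry cascade can be sketched in a few lines of Python. The sketch assumes that GATE drives the Word Counter carry input and that the Word Counter carry out is active when that counter holds all ones with its carry input active; the function name `clock_edge` is hypothetical and signal polarities are abstracted to booleans.

```python
def clock_edge(word_ctr: int, adr_ctr: int, gate: bool) -> tuple[int, int]:
    """Advance the cascaded counters by one LOW-to-HIGH clock transition.

    gate models the low-order carry input (True means counting is enabled).
    """
    wci = gate
    wco = wci and word_ctr == 0xFF   # carry out of the low-order byte
    aci = wco                        # WCO wired to the Address Counter carry in
    if wci:
        word_ctr = (word_ctr + 1) & 0xFF
    if aci:
        adr_ctr = (adr_ctr + 1) & 0xFF
    return word_ctr, adr_ctr


if __name__ == "__main__":
    wc, ac = 0, 0
    for _ in range(300):
        wc, ac = clock_edge(wc, ac, gate=True)
    print(hex((ac << 8) | wc))
```

With the Word Counter as the low-order byte and the Address Counter as the high-order byte, the pair behaves as one 16-bit counter that holds its value whenever the gate is inactive.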

Figure 22 shows two Am2942s cascaded to form a 32-bit programmable timer/counter. The two Word Counters form the low order 16 bits, and the two Address Counters form the high order bits. This allows the timer/counter to be loaded and read 16 bits at a time.


Figure 21. 16-Bit Programmable Counter/Timer Using a Single Am2942.


Figure 22. 32-Bit Programmable Counter/Timer Using Two Am2942s.

In Figure 23, two Am2942s are shown cascaded to form dual 16-bit counters/timers. GATE WC and GATE AC are separate enable controls for the respective Word Counter and Address Counter. Using the 16-bit Data Bus, each 16-bit counter can be loaded or read in parallel.

Figure 24 shows two Am2942s used as DMA Address Generators on a common DATA/ADDRESS bus. The common bus allows the use of the Am2942 multiplexed data and address pins, \(\mathrm{D}_{0}-\mathrm{D}_{7}\). The Am2942 is in a 22-pin package whereas the Am2940, which has separate address and data pins, requires a 28-pin package.


Figure 23. Dual 16-Bit Programmable Counters/Timers.


Figure 24. Am2942s Used as DMA Address Generator on Common Bus.

In this example the Am2942 Address Counter, Word Counter and Control Register are loaded and read directly from the CPU via the DATA/ADDRESS bus. Since the bus carries addresses as well as data, the \(D_{0}-D_{7}\) pins can be used also to enable the address onto the bus.
Four bits in the Microinstruction Register provide the Am2942 Instruction Inputs, \(I_{0}-I_{2}\), and the Instruction Enable input, \(\bar{I}_{E}\). The \(I_{3}\) input is tied LOW, selecting the eight DMA instructions. The microprogram clock is used to clock the Am2942s, and when the ENABLE COUNTERS instruction is applied or the instruction is disabled (\(\bar{I}_{E}=\) HIGH), address and word counting is controlled by the CNT bit of the Microinstruction Register.
Interface control circuitry generates bus control signals and enables the Am2942 address onto the bus at the appropriate time. The open-collector DONE outputs are dot-anded and used as a test input to the microprogram sequencer.


Chapter VIII
HEX-29

\section*{INTRODUCTION}

Modern digital systems are becoming faster and increasingly complex. As a result, more is being demanded of digital design engineers. Fortunately, there is a design technique that can greatly simplify the design process. It can also lead to cleaner, more efficient, more reliable finished devices. This technique is called MICROPROGRAMMING. Do not be confused by this word; it has nothing whatever to do with machine level language or programming a microprocessor. Microprogramming is inherently more powerful than programming in a processor's instruction set for many reasons, not the least of which is the access to the entire functional resources of the hardware on a machine cycle by machine cycle basis. An excellent treatment of microprogramming and microprogrammed machines is available from AMD in previous application notes. Perhaps the easiest to comprehend introduction to this subject is in AMD's Microprogramming Handbook. This is highly recommended reading for any newcomer to this area of digital design.
Though microprogramming has been an inherently more powerful design technique since its invention in 1955, it has been little used until recently (1976), and with some justification. The reason is quite simple. The very large majority of IC's available until the 1976-1978 time frame were specifically designed to be used with 'random logic' design techniques. Since these random logic IC's were poorly suited to the highly structured nature of well designed microprogrammed systems, the potential advantages of microprogrammed systems could not be realized easily.

Fortunately for all of us, in the mid 1970's AMD made a significant decision to develop a very extensive family of Schottky technology IC's specifically optimized for use in microprogrammed systems. These circuits belong to the Am2900 family as well as the Am25S, Am26S, Am27S, and Am25LS families. The acceptance has been so great that many of the other large IC manufacturers are now second sourcing many of these parts and introducing others. So, in just three to four years, microprogrammed machine design has come of age. Now, for most any job of medium to very high complexity, a microprogrammed system is the only way to go if a microprocessor isn't fast or versatile enough.
The purpose of this application note is to illustrate the use of microprogramming and 'bit-slice' technology in a high performance 16-bit time-sharing CPU. This application note is unique in that the CPU being described is the heart of a new commercially available minicomputer system. Thus, it is possible to examine the nature of the CPU as it relates to a complete basic minicomputer system. For this reason, a very short section follows that describes the basic system elements and the system goals toward which the CPU was designed.
The product described herein is called the "HEX-29" CPU. Requests for information on the AMD devices embodied in this application note should be directed to AMD via your local AMD representatives. Inquiries about the HEX-29 CPU and minicomputer system for OEM and/or end user applications should be directed to:

HEX-29
Digital Microsystems, Inc.
4448 Piedmont
Oakland, CA 94661
(415) 658-8532

\section*{SYSTEM DESIGN GOALS}

In any significant project it is mandatory that reasonable, coherent system design goals be spelled out before serious work is begun. This can be a surprisingly short list of general specifications, but a well thought out system philosophy can make all the difference. Most important, everyone involved should have a copy so everyone will be pulling in the same direction.
The following list represents the system design goals for the HEX-29 CPU and system.
1. Compact, reliable, easy to use.
2. Multi-user, multi-task, timesharing.
3. Fast, code-efficient high level language processing.
4. Low cost for complete system.
5. Intelligent microprogrammed channel controllers for high speed I/O

Indeed, this seems like a short list, but it is the list from which the more detailed specifications were developed. For example, in order to be compact, switching power supply technology is employed. Reliability evolves from many factors including burn in and testing cycles. Probably the single largest cause of 'flakiness' in digital systems is insufficient cooling. An oversize fan moves about five times the volume of air past the IC's as is normally recommended. This large, slower speed fan has the additional advantage that the lower frequency 'white noise' it generates is far less annoying than the 'whine' from smaller high speed fans.
So, it is easy to see that many of the more specific details of system design will fall readily out of these overall design goals. The features of the final HEX-29 system are shown below. It should be instructive to trace each of these features to one (or more) of the design goals listed above. Reviewing this list will also prepare the context for the more detailed sections that follow.

\section*{HEX-29 FEATURES}

VERY FAST
-160 ns basic machine cycle
-Only two machine cycles for many instructions
-Microprogrammed clock for increased through-put

\section*{COMPLETE SET OF DATA TYPES}
-Bit operations
-Nibble operations
-Byte operations
-Word operations
-Double word operations
-Quad word operations
- Variable field operations

\section*{EXTENSIVE REGISTER SET}
- 16 general purpose/defined purpose registers
- 16 memory management registers
-Extended function condition code register
-4 interrupt control/status registers

\section*{MICROPROGRAMMED}
-Expandable instruction set (on board)
-Writable/fixed control store capability
- Integral fixed/floating point processor
-Highly structured, comprehendible, modular design

SOPHISTICATED MEMORY MANAGEMENT
-Multi-user and multi-task timesharing structure
-Complete intertask protection and security
- Megabyte addressing space (expandable)
-Software protectable pages for shared re-entrant coding
-Dual mode operating capability

\section*{MULTIPLE STACK PROCESSOR}
-Sophisticated program linkage through defined control stack pointer
-Multiple, general register, data stack processing

\section*{SOPHISTICATED INTERRUPT STRUCTURE}
-8 level maskable vectored prioritized hardware interrupts
-Second level prioritized expansion on each hardware level
-256 levels of program controlled software interrupts
- Invalid memory access trap is a vectored interrupt
-Non-existent instruction trap is a vectored interrupt
-Breakpoint instruction is a special vectored interrupt
-Automatic mode switching on all interrupts

HIGH THRU-PUT DMA/REFRESH CONTROL
-8 level prioritized DMA requests and acknowledges
-Up to four Mega-byte/second DMA transfer rate without slowing program execution
-Up to 12 Mega-byte/second DMA transfer rate
- Integral transparent dynamic memory refresh control

EXTENSIVE HIGH LEVEL INSTRUCTION SET
- Multitude of data types handled
-Enormous variety of addressing modes
-General register and defined register classes of instructions
-Many very fast numeric and string macroinstructions
-Integral 16 and 32-bit integer and 64-bit floating point ADD, SUB, MUL, DIV, CMP, NEG, etc
-Advanced character, byte and word string processing
- Microcoded high level language primitives

VERY HIGH QUALITY PHYSICAL DESIGN
-Four layer P.C. cards throughout system (internal GND and \(V_{C C}\) )
- All bus signals interleaved with direct return ground path
-All bus signals active low; three-state to inactive level

\section*{INTELLIGENT CHANNEL CONTROLLERS}
- Microprogrammed floppy disk and hard disk controllers
-Services multiple users I/O simultaneously and transparent to CPU program execution
-Reduces executive program complexity and speeds execution

\section*{SOFTWARE SUPPORT}
-Multi-user/multi-task time sharing operating system includes sophisticated file management features
-Sophisticated resident macro-assembler
-Customized micro-assembler
- Superfast, super extended BASIC interpreter
- True PASCAL compiler (not interpreter)
- Advanced editor and word processor package
-More software coming

It should be clear from this list that the HEX-29 minicomputer is a powerful/sophisticated design. This is DIRECTLY attributable to the availability of the excellent Schottky technology I.C.'s from AMD for use in microprogrammed digital systems.

In a well designed microprogrammed system there should be VERY few random logic gate packages required. In the HEX-29 CPU, there are only a few gates used as such. If anywhere near \(20 \%\) of a microprogrammed system is composed of gate packages, it is probable that the design can be further simplified to replace the random logic with microcode and/or structured logic techniques. It is important to note that the more functions that are implemented with structured logic and controlled by microcode bits, the more versatile and general is the whole design.

\section*{MICROPROGRAMMED MACHINES}

It is highly recommended that AMD's MICROPROGRAMMING HANDBOOK be studied before this application note if a detailed understanding of the HEX-29 CPU is desired. The idea is, of course, that the basic principles of microprogrammed machines be familiar before this specific example is examined. The Am2900 Learning and Evaluation Kit is also recommended as a practical introduction tool. For those only interested in the capabilities of a well designed microprogrammed CPU, that reading is not entirely necessary, and Section V of this Application note will be superfluous. Section IV is a more general discussion for these readers, but is also necessary for those going on to Section V.

A short discussion of microprogrammed systems appears here only as a short refresher for those who have studied the MICROPROGRAMMING HANDBOOK by John Mick and Jim Brick of AMD.

Any microprogrammed machine can be divided into the following two discrete parts:
1. Control store and microprogram control
2. Data routing and function logic

These two sections of a microprogrammed machine are really quite nearly independent. In effect, the control store and microprogram control section is the 'boss and brains' of the operation. It issues all of the orders and makes all the decisions. The data routing and function logic devices are merely puppets that carry out the commands selected by the microprogram control logic from the control store. Note that 'microword memory' and 'microcode' are used interchangeably with 'control store' and are synonymous.

\section*{Control Store and Microprogram Control}

The control store is simply a number of PROM's. The number of locations in this memory is chosen to be large enough to hold the desired number of microprogram routines. The width of the word is chosen to have sufficient bits to control all of the possible functions in the data routing and function logic. Admittedly, RAM or EPROM could be used as the memory devices, but it is best to think of it as an array of read only memory devices. So, schematically an example of a control store array looks like Figure 1.

Figure 1.

In practice, there is a register between the microword data bits and the actual data routing and function control devices. This register assures that all bits change simultaneously at the beginning of each new microinstruction cycle and allows the execution of one microinstruction to overlap the fetching of the next. The addition of this 'pipeline register' is shown in the Figure 2 expansion of our block schematic.

The remaining part of this section is the microprogram control unit, more commonly called the microprogram sequencer. The microprogram sequencer is nothing more than a presettable binary counter with a few extra functions. Figure 3 shows this device in place.

We show the sequencer as a 12-bit binary counter with a few other inputs. The outputs \((\mathrm{Y})\) drive the address lines of the control store PROMs. So, each time the system clock rises, the counters increment and sequential addresses are accessed from the PROM. Note that the current output of the control store is captured in the pipeline register on this same LOW-to-HIGH transition. Thus, the sequencer is always fetching the NEXT control store word which will control the fetching of the next, and so on and so on.

Note that there are several bits from the pipeline register that are routed back to the sequencer. In our example, 12 bits are used as a microword branch address and another bit is used as a preset enable (load) line. Normally, each cycle of the system clock increments the sequencer outputs and the next microword is fetched from the control store. However, somewhere down the line we are going to want to branch to a microcode sequence that is not 'in line' with the code that is currently executing. It is very easy to see how this is done.

The microaddress of the routine to which we want to branch is embedded in the current microword, 12 bits in our example. The microword bit that is connected to the load input of the sequencer is coded to be low on this cycle. So, the sequencer, which is really just a 12-bit counter with a unique load control in our example, will cause the branch address we selected to pass through to the output of the counters and fetch the microword from the microaddress to which we branched. The routine will now continue to execute sequentially addressed microwords until we execute another branch code.
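The interplay of sequencer, control store and pipeline register described above can be sketched in Python. This is a simplified behavioral model, not the circuit of any figure: the names `Microword` and `run` are hypothetical, and the branch address is assumed to pass straight through to the Y outputs in the same cycle in which the load bit is LOW.

```python
from typing import NamedTuple


class Microword(NamedTuple):
    name: str
    load_n: int = 1   # active-LOW load bit: 0 means branch
    branch: int = 0   # 12-bit branch address field


def run(store: list[Microword], cycles: int) -> list[int]:
    """Return the control store addresses fetched on successive clocks."""
    addr_mask = 0xFFF                # 12-bit microaddress
    counter = 0                      # next sequential address
    pipeline = Microword("reset")    # assumed power-up state: continue
    fetched = []
    for _ in range(cycles):
        # Y outputs: branch address passes through when load is LOW.
        y = pipeline.branch if pipeline.load_n == 0 else counter
        y &= addr_mask
        # LOW-to-HIGH clock edge: pipeline register captures the microword,
        # and the counter moves on to the following address.
        pipeline = store[y]
        counter = (y + 1) & addr_mask
        fetched.append(y)
    return fetched


if __name__ == "__main__":
    store = [
        Microword("A"),
        Microword("B", load_n=0, branch=5),
        Microword("C"),
        Microword("D"),
        Microword("E"),
        Microword("F"),
        Microword("G", load_n=0, branch=0),
    ]
    print(run(store, 6))
```

In the example store, microword B branches to address 5 and microword G branches back to address 0, so the in-line words C, D and E are never fetched.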

The only other really necessary function we need from our sequencer is the ability to do conditional branches. In other words, we want to be able to branch to some microcode routine, but only if a certain condition exists. As usual, this capability is easily added; only one multiplexer is needed. Figure 4 shows the new configuration.


Figure 2.

Now two additional microword bits control the conditions under which a microbranch will take place. If input 0 is selected, a branch will always take place since the logic LOW level on input 0 will appear at the load input of the sequencer. Conversely, if input 3 is selected, a HIGH logic level is always routed through the multiplexer to the load input and a load is not performed. Thus the next sequential microinstruction is fetched. So far we can do branch and continue functions with the multiplexer.
If we select inputs 1 or 2 on the condition select multiplexer, we may get one of two conditions. If the selected input is HIGH, it will be routed to the load input of the sequencer and no load will take place. But if the selected condition is at a LOW logic level, the load input of the sequencer is pulled LOW, a load is performed, and a branch has been accomplished. Since a branch only occurs when the condition bit is LOW, this function is called a 'branch on condition \(=0\)'. Clearly a 'branch on condition \(=1\)' can be implemented simply by inverting the condition bit before it enters the multiplexer.
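The behavior of the four-input condition select multiplexer can be summarized in a short Python sketch. The function and argument names are hypothetical; input 0 is wired LOW, input 3 is wired HIGH, and inputs 1 and 2 carry condition bits, as in the text.

```python
def load_input(select: int, cond1: int, cond2: int) -> int:
    """Level presented to the active-LOW load input of the sequencer."""
    inputs = (0, cond1, cond2, 1)   # input 0 = LOW, input 3 = HIGH
    return inputs[select]


def branch_taken(select: int, cond1: int, cond2: int) -> bool:
    """A branch occurs when the selected input is LOW."""
    return load_input(select, cond1, cond2) == 0


if __name__ == "__main__":
    for select in range(4):
        print(select, branch_taken(select, cond1=1, cond2=0))
```

Select code 0 therefore gives an unconditional branch, code 3 gives 'continue', and codes 1 and 2 give 'branch on condition \(=0\)' for their respective condition bits.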

So as far as controlling the flow of microprograms goes, it is clear that we can make it look very much like assembly language programming of a microcomputer. We can execute sequential microinstructions (in line code), branch conditionally, or branch unconditionally. If we use real live sequencers like the Am2909, Am2910 or Am2911 instead of binary counters we get several other very important functions including micro-subroutining and looping.
When we substitute Am2909's, Am2910's or Am2911's as our sequencers, the final element of our complete microprogram memory and control section is in place. Figure 5 shows this configuration.
The next address PROM of Figure 5 converts the microcode branch function bits into one of two sets of bits that control the function performed by the Am2911's. Which of the two is chosen depends upon the logic level of the particular condition bit that is selected.
This is the basic structure of any microprogram control unit regardless of what the rest of the system looks like. The width of the microword data word, the microaddress field, the condition select field, etc., will change as needed, but the structure remains the same. Note that some of the microword data bits are used to control the microprogram sequencing logic. The bits left over are used to control the data routing and function logic in the device, i.e., everything else!

\section*{Data Routing and Function Logic}

The data routing and function logic section of a microprogrammed machine closely reflects the job the device is to perform. In this respect there is some similarity with random logic designs. The key difference is the glue that binds all of the small functional units that make a device work. In a random logic design it is a more or less random array of gates and flip-flops that interconnect and control these functional units.
The chief advantage of a microprogrammed machine is that this random logic is largely replaced by the coherent sequences of control bits that form the microprogram. Problems such as race conditions, undesirable interactions between functional elements and marginal timing nearly disappear in a microprogrammed design. Often there are one or two internal data buses on which all transfers of internal data between functional units take place.

Figure 3.

Figure 4.

Figure 5.

Think of several possible sources of information that may be needed in a particular design. If they are all three-statable devices, microword bits could be tied to the output enable of each and the desired device enabled onto the internal bus on a microcycle by microcycle basis. Likewise one or more devices may capture this data. Microword bits attached to the clock pulse (CP) inputs of registers and the like can achieve this function.
Further, microword bits select other functions to be performed, for example an ALU or shift function. Much of Section V of this application note will demonstrate the use of these data routing and function logic control bits.
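The bus transfer mechanism described above can be sketched in Python. The sketch is a loose illustration under stated assumptions: the names `microcycle`, `oe` and `load` are hypothetical, each source and destination is given its own microword bit, and the model insists that exactly one source drive the bus in any microcycle.

```python
def microcycle(sources: dict[str, int],
               registers: dict[str, int],
               oe: dict[str, bool],
               load: dict[str, bool]) -> int:
    """Perform one bus transfer under control of microword bits.

    oe   : output-enable bit per source (True = drive the internal bus)
    load : clock-enable bit per destination register (True = capture)
    """
    enabled = [name for name in sources if oe.get(name)]
    if len(enabled) != 1:
        raise ValueError("exactly one source must drive the bus")
    bus = sources[enabled[0]]
    for name in registers:
        if load.get(name):
            registers[name] = bus   # captured on the clock edge
    return bus


if __name__ == "__main__":
    sources = {"alu": 0x1234, "memory_data": 0xABCD}
    registers = {"address": 0, "temp": 0}
    microcycle(sources, registers,
               oe={"memory_data": True},
               load={"address": True, "temp": True})
    print(registers)
```

Because the selection is made afresh in every microword, any source can be routed to any combination of destinations on a microcycle by microcycle basis without additional gating logic.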

\section*{GENERAL SPECIFICATIONS}

The following section of this application note explores the design of the HEX-29 CPU on an intermediate level. It will be similar in detail to the detailed hardware and software specifications given for most microprocessors by the manufacturers. In other words, all the information needed to use the HEX-29 CPU, including bus timing and instruction set, is examined. This will serve to demonstrate what can be achieved in a medium level microprogrammed machine. It will also serve as a necessary transition for those planning to study the more detailed internal structure of the CPU in the next section of this application note.
It is very important, when designing a microprogrammed machine, that the target device be specified in detail approaching that given in this section. Only then can an intelligent attempt at hardware design begin. It is especially important to define a clean, simple, reliable interface between the microprogrammed device and other system elements. Considerable attention should also be paid to defining data types, instruction formats, interrupt requirements, etc.

\section*{Internal CPU Registers}

The HEX-29 CPU has 36 internal registers. Of these, 16 are memory management (map) registers, 16 are general purpose registers, three are associated with the interrupt structure, and one is the condition code register.
Table 1 shows the functions associated with the 16 general purpose registers of the HEX-29. It is most significant that all 16 general purpose registers have alternate functions. This should not imply that they are not true general purpose registers however. Any register can be used as an accumulator, stack pointer, index register, memory pointer, data counter, etc., in most instructions. To increase coding efficiency and execution speed, however, some instructions use the defined register assignments in Table 1.

TABLE 1.


For example, the instruction set of the HEX-29 CPU can load immediate, push, pop, and move indexed and direct any of the multiple register combinations (FP1, FP0, DW1, DW0) in one instruction. One mode of indexed addressing and many byte processing instructions benefit greatly from the alternate use of some registers.

\section*{Condition Code Register}

The condition code register contains all zeros in its upper byte. The bit assignments in the low byte are shown in Table 2.

TABLE 2. CONDITION CODE REGISTER BITS.
\begin{tabular}{|c|c|l|}
\hline \multicolumn{2}{|c|}{Position} & Name \\
\hline Bit 7 & U2 & User Flag \#2 \\
Bit 6 & U1 & User Flag \#1 \\
Bit 5 & U0 & User Flag \#0 \\
Bit 4 & H & Half Sign Flag (Bit 7, MSb of low byte) \\
Bit 3 & Z & Zero Flag \\
Bit 2 & N & Negative Flag (MSb of result) \\
Bit 1 & V & 2's Complement overflow flag \\
Bit 0 & C & Carry Flag (arithmetic and shift carry) \\
\hline
\end{tabular}

The user flags (U2, U1, U0) are an extra feature of the HEX-29 CPU. They are altered only by five special flag modification instructions (SETF, CLRF, COMF, POPF, LDF). These op codes set, clear, complement, pop, or load the flags respectively. Since the user flags are immune to change except by these special purpose flag altering instructions, they are excellent for passing status information between routines.
The half sign flag (H) is set if the result of an operation contains a 1 in the most significant bit of the low byte; otherwise it is cleared. This flag is useful in many byte processing and loop counting routines.
If the result of an operation is zero, the zero flag (Z) is set; otherwise it is cleared. This is the most useful of all the flags and is used on comparisons, arithmetic and logical operations, loop counting, etc.
When the most significant bit of the result of an operation is a logic 1, the negative flag (N) is set. Otherwise it is cleared. Note that in two's complement notation, the most significant bit of a number determines the sign of the number. If it is a logic 1, the number is negative; if it is a logic 0, the number is positive.
If an arithmetic operation results in a two's complement overflow, the V flag is set. This flag is also used as a general error flag by the HEX-29 CPU. For example, the V flag is set if a divide by zero instruction is attempted. In floating point notation, if the exponent becomes too large or too small (arithmetic overflow/underflow), the V flag is set to so indicate.
The carry flag (C) is used for two purposes. It is a source and/or destination bit in shift and rotate instructions, and it serves as a carry-out bit when an arithmetic function result is too large to fit in the appropriate destination register. The convention with regard to the carry flag on addition and subtraction is as follows:
\[
\begin{array}{ll}
C \text{ flag } = 1 \text{ if} & \text{1. Binary add results in a carry out.} \\
 & \text{2. Binary subtract results in no borrow.} \\
C \text{ flag } = 0 \text{ if} & \text{1. Binary add results in no carry out.} \\
 & \text{2. Binary subtract results in a borrow.}
\end{array}
\]
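
The flag definitions and the carry convention above can be illustrated with a short Python sketch. It is not taken from the HEX-29 microcode: the helper names `add16` and `sub16` are hypothetical, and subtraction is assumed to be performed as an addition of the one's complement plus one, which is one common way to arrive at the "C = 1 means no borrow" convention.

```python
MASK16 = 0xFFFF

def add16(a, b, carry_in=0):
    """Add two 16-bit words; return (result, flags) using the Table 2 flag names."""
    a &= MASK16
    b &= MASK16
    total = a + b + carry_in
    result = total & MASK16
    flags = {
        "C": int(total > MASK16),                              # carry out of bit 15
        "V": int(bool((a ^ result) & (b ^ result) & 0x8000)),  # two's complement overflow
        "N": (result >> 15) & 1,                               # MSb of the result
        "Z": int(result == 0),                                 # result is zero
        "H": (result >> 7) & 1,                                # MSb of the low byte
    }
    return result, flags

def sub16(a, b):
    """Compute a - b as a + ~b + 1 (assumed), so C = 1 means no borrow occurred."""
    return add16(a, ~b & MASK16, carry_in=1)

if __name__ == "__main__":
    for a, b in [(5, 3), (3, 5)]:
        result, flags = sub16(a, b)
        print(f"{a} - {b} = {result:04X}  C={flags['C']}")
```

With this convention, `sub16(5, 3)` leaves C set (no borrow) while `sub16(3, 5)` leaves it clear (a borrow occurred).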

All of the condition code flags, except the user flags, have some special meanings in some of the complex 'macro' instructions. These are described in the detailed section on the HEX-29 instruction set.

\section*{Interrupt Registers}

There are three special purpose interrupt registers in the HEX-29 CPU. They are:
1. Mask Register
2. Status Register
3. Vector Register

These registers are command driven, that is, the register selected is a function of the interrupt command being executed. More detailed information on the nature of these registers appears later in this application note.

\section*{Memory Management Registers}

A sophisticated memory management structure is embodied in the HEX-29 CPU. Integral to this structure is the set of 16 memory map registers. These 8-bit registers contain transformation values that allow multiple users and tasks to share the processing time of the CPU without interacting with each other. Each task logged onto the HEX-29 is distinguished from all others by its memory map image. When it is chosen to run on the CPU, its memory map image becomes synonymous with the CPU memory map registers. More detailed information on this aspect of the HEX-29 CPU appears later in this application note.

\section*{Instruction Formats}

The instruction formats of the HEX-29 CPU are simple and few in number. For this reason, the HEX-29 instruction set is not difficult to learn and use, even though it is very extensive and quite sophisticated.
Emphasis on the use of 4-bit (hexadecimal) and 8-bit (byte) fields in the instruction formats simplifies the organization of the instruction set. All of the instruction formats used in the HEX-29 are shown in Figure 6.


Rs = Source Register (operand, pointer, index reg, stack pointer, etc.)

Rd = Destination Register (operand, pointer, stack pointer, etc.)

H = 4-bit (hex) quantity

Byte = 8-bit byte (data, index, offset, address, etc.)

Figure 6. Instruction Formats.

Most instructions involve operations on 16-bit words. However, the HEX-29 instruction set also includes op-codes that operate on the following data types:
\begin{tabular}{lll}
1. & 1 Bit & (Bit) \\
2. & 4 Bits & (Hex or Nibble) \\
3. & 8 Bits & (Byte) \\
4. & 16 Bits & (Word) \\
5. & 32 Bits & (Double Word) \\
6. & 64 Bits & (Quad Word, Floating Point) \\
7. & \(N\) Bits & (Variable Format)
\end{tabular}

In addition to working on the fixed length data types, there are many 'macro' instructions that operate on variable length character, byte, and word strings in memory. These strings can be either contiguous in memory or in the form of linked lists. Several of these 'macro' instructions are highly optimized microcodings of the most critical routines used in high level language processing.

The multiplicity of data types processed efficiently by the HEX-29 increases its ability to meet the diverse demands of modern computing.

\section*{Addressing Modes and Assembly Language}

Much of the power and simplicity of the HEX-29 instruction set is derived from the large number of useful addressing modes available for the most used basic functions such as MOV, ADD, SUB, INC, DEC, CMP, etc. Addressing modes specify where operands of an instruction are to be found and where the result is to be stored.

The 16 general purpose 16-bit registers are designated R0, R1, R2, . . . RC, RD, RE, RF. These are the primary names of the 16 registers and refer directly to the corresponding registers. In other words, when 'RD' is written in a HEX-29 Assembly Language (HAL) program, the contents of this register are used as an operand or destination in the instruction.

The use of a register as a pointer to memory is called memory pointer addressing. The names M0, M1, M2, . . . MC, MD, ME, MF apply to the 16 general purpose registers when they are used as memory pointers.

When a register points to a memory location which contains the address of the memory location holding the value of interest, the register is said to be an "indirect pointer". The names I0, I1, I2, . . . IC, ID, IE, IF are used to specify the 16 general purpose registers when they are being used with this type of addressing.

Indexed addressing is possible using the names Z0, Z1, Z2, . . . ZC, ZD, ZE. The use of one of these names means that the data is at the address formed by adding the contents of the register referenced to the contents of the word following the instruction in main memory.
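
The four register name prefixes can be summarized by a small operand-resolution sketch in Python. The function `operand_address`, the list `regs`, and the dictionary `mem` are hypothetical names used only for illustration; memory is modeled as a mapping from 16-bit addresses to 16-bit words, and wrap-around of the indexed sum at 16 bits is an assumption.

```python
def operand_address(prefix, reg, regs, mem, next_word=0):
    """Return the memory address named by an operand, or None for a register operand.

    prefix    -- 'R', 'M', 'I' or 'Z', the first letter of the operand name
    reg       -- register number 0..15
    next_word -- the word following the instruction (used only by 'Z')
    """
    if prefix == "R":   # Rn: the register itself is the operand
        return None
    if prefix == "M":   # Mn: the register holds the address of the operand
        return regs[reg]
    if prefix == "I":   # In: the register points to a word that holds the address
        return mem[regs[reg]]
    if prefix == "Z":   # Zn: register contents plus the word after the instruction
        return (regs[reg] + next_word) & 0xFFFF
    raise ValueError(f"unknown addressing prefix {prefix!r}")

if __name__ == "__main__":
    regs = [0] * 16
    regs[0xD] = 0x0100
    mem = {0x0100: 0x0200, 0x0200: 0x1234}
    for prefix in "RMIZ":
        print(prefix, operand_address(prefix, 0xD, regs, mem, next_word=0x0010))
```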

Most often, when a register is used as a memory pointer (MD for example), or as an indirect pointer (I9 for example), it is extremely desirable that the register auto-increment or perhaps auto-decrement, since programs, lists, and stacks are ordered in a positive direction through memory.

In HEX-29 Assembly Language (HAL) it is quite simple to specify that a memory pointer, indirect pointer, etc. is auto-incremented or auto-decremented by appending a '+' or '-' character to the respective register specification.

For example:

MOV M7+, R6 = The contents of memory pointed to by R7 are moved into R6. R7 is then incremented.

MOV RA, ME- = Decrement RE, then move the contents of RA into the memory location pointed to by RE.

It is significant that auto-incrementing takes place after the operation while auto-decrementing takes place before the operation (auto-post-increment and auto-pre-decrement).
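
The pairing of post-increment with pre-decrement is what lets any register act as a stack pointer: a pre-decrement store is a push and a post-increment load is a pop. The Python sketch below illustrates this under the assumption of word-addressed memory with a step of one per access; the function names are hypothetical.

```python
def store_pre_decrement(regs, mem, ptr, value):
    """Model of 'MOV Rx, Mptr-': decrement the pointer first, then store."""
    regs[ptr] = (regs[ptr] - 1) & 0xFFFF
    mem[regs[ptr]] = value

def load_post_increment(regs, mem, ptr):
    """Model of 'MOV Mptr+, Rx': load first, then increment the pointer."""
    value = mem[regs[ptr]]
    regs[ptr] = (regs[ptr] + 1) & 0xFFFF
    return value

if __name__ == "__main__":
    SP = 0xE                      # RE is the stack pointer in the text's examples
    regs = [0] * 16
    regs[SP] = 0x1000
    mem = {}
    store_pre_decrement(regs, mem, SP, 0xAAAA)   # push
    store_pre_decrement(regs, mem, SP, 0xBBBB)   # push
    print(hex(load_post_increment(regs, mem, SP)))  # pops the last value pushed
    print(hex(load_post_increment(regs, mem, SP)))
    print(hex(regs[SP]))                            # pointer is back where it started
```

Because the pointer always rests on the most recently stored word, a push followed by a pop returns the pointer to its starting value without any extra adjustment.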

Several very fundamental addressing modes arise from auto-incrementing memory and indirect pointers. Consider the following examples:
A. Program Counter (RF) as an auto-incrementing pointer yields 'immediate addressing'.
MOV MF+, RA = Move immediate into RA.
ADD MF+, R6 = Add immediate to R6.
MOV MF+, RF = Jump to address in immediate word.
B. Stack Pointer (RE) as an auto-incrementing pointer yields 'stack addressing'.
MOV ME+, R2 = Pop top of stack into R2.
XOR ME+, R1 = Pop top of stack and XOR into R1.
MOV ME+, RF = Return from subroutine!
C. General registers used as data stack pointers.

ADD MD+, MD = Add top two members of data stack and leave result on top of the stack.
CMP MD+, M8+ = Compare top members of two stacks and remove these values from the stacks.
AND MF+, M6 = AND immediate word with the top member of stack pointed to by R6.
D. Program Counter (RF) as an auto-incrementing indirect pointer yields 'direct addressing'.
MOV IF+, R7 = Move direct into R7.
ADD IF+, RC = Add direct into RC.
MOV IF+, IF+ = Move direct to direct.
It should be clear that these examples represent only a few of the most useful of many possible uses of auto-incrementing and auto-decrementing with memory and indirect pointers. Careful study of the HEX-29 instruction set will reveal many more uses not examined in these examples.

\section*{Classes of Instructions}

The instruction set of the HEX-29 includes many different functions and a multitude of addressing modes. Nonetheless, all instructions fall into one of two classes. The general register class of instructions is extremely flexible because of the enormous number of variations inherent in each op-code. The defined register class of instructions permits extremely fast and memory efficient code for often used functions and register sets. The power of the HEX-29 instruction set is derived from an extensive combination of the most powerful and efficient instructions from each class.

\section*{General Register Instructions}

In a general register instruction, the function and addressing mode are specified in the op-code field (upper byte). The lower byte then holds two 4-bit (hex) values that specify the registers used in the instruction. It should be clear, therefore, that for every general register instruction there are 256 possible specific actions that can be performed.
The full power of these instructions may not be evident without an example. A discussion of just 5 of the 256 possible variations on the MOV M+, R instruction will demonstrate the extreme flexibility of each and every general register instruction. Execution of the MOV M+, R instruction proceeds as follows:
1. Contents of Rs are moved to the address bus.
2. Rs is auto-incremented by one.
3. The data addressed by Rs is loaded into Rd.

In this instruction, Rs is used as an auto-incrementing memory pointer, hence the M+ notation. Rs and Rd are the source and destination registers. Below is the set of 5 examples of how the one op-code can be used to implement a number of important functions.
1. MOV MF+, R3 (W) = Load immediate word (W) into R3.
2. MOV MF+, RF (W) = Jump direct to address (W).
3. MOV ME+, R6 = Pop top of control stack into R6.
4. MOV MD+, R4 = Load next member of list into R4.
5. MOV ME+, RF = Return from subroutine.

Taking a few minutes to review this section and understand how all of these functions are achieved with the single MOV M+, R op-code should reveal the nature of the power and flexibility of the general register class of instructions.
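
The three execution steps of MOV M+, R can be modeled in a few lines of Python to show why one op-code yields immediate load, jump, pop, and return. This is only a sketch: `mov_m_plus_r` is a hypothetical name, memory is assumed to be word addressed, and the program counter (RF) is assumed to already point at the word following the instruction when the instruction executes.

```python
PC, SP = 0xF, 0xE   # RF is the program counter, RE the control stack pointer

def mov_m_plus_r(regs, mem, rs, rd):
    """Model of MOV M+, R for source pointer register rs and destination rd."""
    address = regs[rs]                   # 1. contents of Rs go to the address bus
    regs[rs] = (regs[rs] + 1) & 0xFFFF   # 2. Rs is auto-incremented by one
    regs[rd] = mem[address]              # 3. the addressed data is loaded into Rd

if __name__ == "__main__":
    regs = [0] * 16
    mem = {0x0101: 0x1234,   # immediate word after a 'MOV MF+, R3' instruction
           0x0201: 0x0400,   # jump target after a 'MOV MF+, RF' instruction
           0x0FFE: 0x0042,   # a value on the control stack
           0x0FFF: 0x0300}   # a return address on the control stack

    regs[PC] = 0x0101
    mov_m_plus_r(regs, mem, PC, 3)       # load immediate: R3 gets the word, PC skips it
    regs[PC] = 0x0201
    mov_m_plus_r(regs, mem, PC, PC)      # jump: the load into RF overrides the increment
    regs[SP] = 0x0FFE
    mov_m_plus_r(regs, mem, SP, 6)       # pop into R6
    mov_m_plus_r(regs, mem, SP, PC)      # return from subroutine
    print([hex(regs[r]) for r in (3, 6, PC, SP)])
    print(16 * 16, "register pair combinations per op-code")
```

Loading Rd after Rs has been incremented is what makes the Rs = Rd = RF case behave as a jump rather than a no-op.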

\section*{Defined Register Instructions}

A defined register instruction is an instruction whose function, addressing mode, and register assignments are all defined in the op-code field (upper byte). The low byte is then available for use as an offset for short relative branching instructions, an immediate byte or character, an 8-bit index value, an 8-bit logical mask, etc. With this class of instructions, often only one word is required for the entire instruction. This speeds execution and improves coding efficiency markedly in most applications. It is for this class of instruction that the alternate register function assignments appear in the model of the HEX-29 register set. An example of a one word defined register instruction with the two word instruction it can replace follows:

ADC X, A 26 (Defined Register Instruction)

ADC ZB, RA 0026 (General Register Instruction)

Both of the instructions accomplish the same thing. In both cases RB ( X index register) is used as an index register and RA (Accumulator) is the destination operand. The value in memory at the address pointed to by the sum of the X index register (RB) plus the hex constant 26 is added to the contents of the Accumulator (RA) and the sum left in Accumulator (RA).

The significant difference between the two instructions is that the defined register instruction takes only half as much code (one word vs. two), and executes faster since there are fewer memory accesses and fewer machine cycles. Very often the defined register instruction will be adequate for the job. But when the choices of registers RB and RA are not acceptable or if an 8-bit index offset is not large enough, the general register instruction would be the proper choice. It allows any register pair to be specified as the index register and destination/accumulator and has a 16-bit index offset in the word following the instruction.
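
The equivalence of the two forms can be checked with a short Python sketch. The function `adc_indexed` is a hypothetical model of the shared data operation; the op-code values are not given here and are not modeled, and the 8-bit offset of the defined register form is assumed to be treated as an unsigned value.

```python
X, A = 0xB, 0xA   # RB is the X index register, RA the accumulator

def adc_indexed(regs, mem, index_reg, dest_reg, offset, carry_in):
    """Add mem[index_reg + offset] plus carry to dest_reg; return the carry out."""
    address = (regs[index_reg] + offset) & 0xFFFF
    total = regs[dest_reg] + mem[address] + carry_in
    regs[dest_reg] = total & 0xFFFF
    return int(total > 0xFFFF)

if __name__ == "__main__":
    mem = {0x2026: 0x0010}

    # Defined register form 'ADC X, A 26': registers are implied, offset is the low byte.
    regs_defined = [0] * 16
    regs_defined[X], regs_defined[A] = 0x2000, 0x0005
    adc_indexed(regs_defined, mem, X, A, 0x26, carry_in=1)

    # General register form 'ADC ZB, RA 0026': registers named, offset in the next word.
    regs_general = [0] * 16
    regs_general[X], regs_general[A] = 0x2000, 0x0005
    adc_indexed(regs_general, mem, 0xB, 0xA, 0x0026, carry_in=1)

    print(hex(regs_defined[A]), hex(regs_general[A]))   # same sum from either form
    print("code size: 1 word vs. 2 words")
```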

As mentioned earlier, it is largely the ability to mix defined and general register instructions freely that makes programs written in HEX-29 Assembly Language very code efficient and fast.

\section*{HEX-29 Instruction Set}

The HEX-29 instruction set is quite extensive. It not only offers all of the basic functions in a wide variety of addressing modes, it also includes a multitude of special purpose instructions. These special purpose instructions cover many important aspects of programming including program control, numeric processing, string manipulation and searching, list processing, etc.

Fortunately, all of these types of instructions fall into one of only four different instruction formats. These were shown in Figure 6. Table 3 shows all of the instructions for the HEX-29 machine.

TABLE 3. SUMMARY OF MNEMONICS.

\section*{Arithmetic Operations}
\begin{tabular}{ll} 
ADC & Add words plus carry \\
ADD & Add words w/o carry \\
ADDB & Add byte to word \\
ADDH & Add hex value (nibble) to word \\
DADD & Add double word values (32 bits \(+\) 32 bits \(\rightarrow\) 32 bits) \\
FADD & Add floating point values (64-bit FP \(+\) 64-bit FP \(\rightarrow\) 64-bit FP) \\
SBB & Subtract with borrow \\
SUB & Subtract w/o borrow \\
SUBB & Subtract byte from word \\
SUBH & Subtract hex value from word \\
RSUB & Subtract words in reverse order \\
DSUB & Subtract double word values (32 bits \(-\) 32 bits \(\rightarrow\) 32 bits) \\
FSUB & Subtract floating point values (64-bit FP \(-\) 64-bit FP \(\rightarrow\) 64-bit FP) \\
UMUL & Unsigned word multiply (16 bits \(*\) 16 bits \(\rightarrow\) 32 bits) \\
SMUL & Signed word multiply (16 bits \(*\) 16 bits \(\rightarrow\) 32 bits) \\
DMUL & Double word signed multiply (32 bits \(*\) 32 bits \(\rightarrow\) 64 bits) \\
FMUL & Floating point multiply (64-bit FP \(*\) 64-bit FP \(\rightarrow\) 64-bit FP) \\
UDIV & Unsigned word divide (16 bits \(\div\) 16 bits \(\rightarrow\) 16 bits \(+\) 16-bit remainder) \\
SDIV & Signed word divide (16 bits \(\div\) 16 bits \(\rightarrow\) 16 bits \(+\) 16-bit remainder) \\
DDIV & Double word signed divide (32 bits \(\div\) 32 bits \(\rightarrow\) 32 bits \(+\) 32-bit remainder) \\
FDIV & Floating point divide (64-bit FP \(\div\) 64-bit FP \(\rightarrow\) 64-bit FP) \\
CMP & Compare words \\
CMPB & Compare byte with word \\
CMPBA & Compare byte with byte \\
CMPH & Compare positive hex value (nibble) with a word \\
CMPHN & Compare negative hex value (nibble) with a word \\
CMPHA & Compare hex value (nibble) with another nibble \\
DCMP & Compare signed double word values \\
FCMP & Compare floating point values \\
NEG & Negate word (2's complement) \\
DNEG & Negate signed double word value \\
FNRM & Normalize floating point number \\
DTST & Test signed double word value for zero + sign \\
FTST & Test floating point value for zero + sign \\
INC & Increment word by one \\
DEC & Decrement word by one
\end{tabular}

\section*{Shifts \& Rotates}

\begin{tabular}{ll}
ASR & Arithmetic shift right \\
ASL & Arithmetic shift left \\
CSL & Count and shift left (until MSb \(=1\)) \\
DSL & Double word shift left \\
DSR & Double word shift right \\
LSR & Logical shift right \\
RCL & Rotate closed left \\
ROL & Rotate left (through carry flag) \\
ROR & Rotate right (through carry flag) \\
VSL & Variable shift left (0 to 15 places) \\
VSR & Variable shift right (0 to 15 places)
\end{tabular}

TABLE 3. SUMMARY OF MNEMONICS. (Cont.)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{Logical Operations} & \multicolumn{2}{|l|}{PROGRAM CONTROL} \\
\hline AND & Boolean AND words & EXR & Execute contents of register as an instruction \\
\hline ANDB & Boolean AND byte with word & RTI & Return from interrupt \\
\hline IOR & Boolean inclusive OR words & BPT & Breakpoint trap \\
\hline IORB & Boolean inclusive OR byte with word & JFS & Jump if specified flags are set \\
\hline XOR & Boolean exclusive OR words & JFC & Jump if specified flags are clear \\
\hline XORB & Boolean exclusive OR byte with word & CFS & Call subroutine if specified flags are set \\
\hline COM & Complement word & CFC & Call subroutine if specified flags are clear \\
\hline CLR2 & Clear the specified 2 registers & JIFS & Jump indirect if specified flags are set \\
\hline BTS & Bit set & JIFC & Jump indirect if specified flags are clear \\
\hline BTC & Bit clear & CIFS & Call subroutine indirect if specified flags are set \\
\hline BTI & Bit invert & CIFC & Call subroutine indirect if specified flags are clear \\
\hline BTT & Bit test & RTFS & Return from subroutine if specified flags are set \\
\hline CLRF & Clear specified flags & RTFC & Return from subroutine if specified flags are clear \\
\hline SETF & Set specified flags & JMP & Jump to the address specified \\
\hline \multirow[t]{3}{*}{COMF} & Complement specified flags & CALL & Call subroutine \\
\hline & & CEX & Call executive (software interrupt) \\
\hline & & BGT & Branch if greater than \\
\hline \multicolumn{2}{|l|}{Data Movement} & BGE & Branch if greater than or equal \\
\hline MVN & Move, no flags altered & BLT & Branch if less than \\
\hline MOV & Move, update flags & *TTWB & Transition table word branch \\
\hline MVM & Move multiple words & *TTBB & Transition table byte branch \\
\hline MVB & Move a byte & DBNZ & Decrement and Branch Non-Zero \\
\hline LDB & Load a byte & BZD & Branch on zero or decrement \\
\hline STB & Store a byte & *CBB & Compare and branch if in bounds \\
\hline MVH & Move a positive nibble & BR & Branch \\
\hline MVHN & Move a negative nibble & BSR & Branch to subroutine \\
\hline LDI2 & Load immediate 2 registers & BC & Branch if carry flag set \\
\hline XCH & Exchange contents of two registers & BNC & Branch if carry flag not set \\
\hline DXCH & Exchange contents of DW1 and DW0 & BV & Branch if overflow flag set \\
\hline FXCH & Exchange contents of FP1 and FP0 & BNV & Branch if overflow flag not set \\
\hline XCHM & Exchange top two members of any stack & BN & Branch if negative flag set \\
\hline DUP & Duplicate top member of any stack & BNN & Branch if negative flag not set \\
\hline SWT & 'Switch'. Store register indexed and reload indexed & BZ & Branch if zero flag set \\
\hline JAM & Move any bit field from one word to another & BNZ & Branch if zero flag not set \\
\hline SWP & Swap high and low bytes in a word & BH & Branch if half sign flag set \\
\hline PSH2 & Push any two registers onto control stack & BNH & Branch if half sign flag not set \\
\hline POP2 & Pop top two words on control stack into two registers & & \\
\hline PSHF & Push flags (condition code register) onto control stack & & \\
\hline POPF & Pop top of control stack into condition code register & & \\
\hline PSH8 & Push 8 registers onto control stack & & \\
\hline POP8 & Pop 8 registers from control stack & & \\
\hline PSHD & Push R8, R9, RA, RB, RC, RD onto control stack & & \\
\hline POPD & Pop R8, R9, RA, RB, RC, RD from control stack & & \\
\hline LDINT & Load interrupt register & & \\
\hline RDINT & Read interrupt register & & \\
\hline RMM & Read a memory map location & \multicolumn{2}{|l|}{Miscellaneous Instructions} \\
\hline LMM & Load a memory map location & & \\
\hline FMM & Fill memory map & NOP & No operation for 2 to 256 cycles \\
\hline BMBF & Block move bytes forward in memory & *SCNB & Scan for match with specified byte \\
\hline BMBR & Block move bytes reverse in memory & *SCNW & Scan for match with specified word \\
\hline BMWF & Block move words forward in memory & *SEAF & Basic fixed entry length list search \\
\hline BMWR & Block move words reverse in memory & *SEAL & Basic variable entry length linked list search \\
\hline \multicolumn{4}{|l|}{*These 'macro' instructions are examined in more detail on the following pages.} \\
\hline
\end{tabular}

SUMMARY OF SELECTED 'MACRO' INSTRUCTIONS
\begin{tabular}{|c|c|}
\hline UMUL & \begin{tabular}{l}
Unsigned 16-bit multiply \\
16 bits \(*\) 16 bits \(\rightarrow\) 32-bit answer \\
R3 \(*\) R2 \(\rightarrow\) R3 (MSW of answer), R2 (LSW of answer) \\
If \(V=1\) then R0 is not zero (answer is longer than 16 bits) \\
If \(N=1\) then MSb of R0 \(=1\) (no particular significance) \\
If \(Z=1\) then answer is zero (R1 and R0 are cleared)
\end{tabular} \\
\hline SMUL & \begin{tabular}{l}
Signed 16-bit multiply (Two's complement notation) \\
16 bits \(*\) 16 bits \(\rightarrow\) 32-bit answer \\
R3 \(*\) R2 \(\rightarrow\) R3 (MSW of answer) \\
\(\rightarrow\) R2 (LSW of answer) \\
\(\rightarrow\) R1 (LSW of answer) \\
\(\rightarrow\) R0 (MSW of answer) \\
If \(\mathrm{V}=1\) then answer is longer than 16 bits (overflowed LSW) \\
If \(\mathrm{N}=1\) then answer is negative \\
If \(Z=1\) then answer is zero (R1 and R0 are cleared)
\end{tabular} \\
\hline UDIV & \begin{tabular}{l}
Unsigned 16-bit divide \\
16 bits / 16 bits \(\rightarrow\) 16-bit answer and 16-bit remainder \\
R3 / R2 \(\rightarrow\) R2; R3 holds remainder \\
If \(V=1\) then an attempt to divide by zero was refused \\
If \(N=1\) then MSb of answer \(=1\) \\
If \(Z=1\) then answer is zero (R2 \(=0\); R3 need not be zero)
\end{tabular} \\
\hline SDIV & \begin{tabular}{l}
Signed 16-bit divide (Two's complement notation) \\
16 bits / 16 bits \(\rightarrow\) 16-bit answer and 16-bit remainder \\
R3 / R2 \(\rightarrow\) R2; R3 holds remainder \\
If \(V=1\) then an attempt to divide by zero was refused, or an overflow occurred \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero (R2 \(=0\); R3 need not be zero) \\
R3 has sign of numerator \\
If one divides "8000" by "FFFF" \((-32768 \div -1)\) the answer is "8000" \((+32768)\). \\
However, 8000 is a negative number in two's complement, so an overflow has occurred.
\end{tabular} \\
\hline DADD & \begin{tabular}{l}
Double word signed add \\
32 bits \(+\) 32 bits \(\rightarrow\) 32 bits \\
DW1 \(+\) DW0 \(\rightarrow\) DW0 \quad ie. \quad R3,R2 \(+\) R1,R0 \(\rightarrow\) R1,R0 \\
The C flag is treated the same as in single word addition \\
If \(V=1\) then a two's complement overflow occurred \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero
\end{tabular} \\
\hline DSUB & \begin{tabular}{l}
Double word signed subtract \\
(Two's complement notation) \\
32 bits \(-\) 32 bits \(\rightarrow\) 32 bits \\
DW1 \(-\) DW0 \(\rightarrow\) DW0 \quad ie. \quad R3,R2 \(-\) R1,R0 \(\rightarrow\) R1,R0 \\
The C flag is treated the same as in single word subtract \\
If \(V=1\) then a two's complement overflow occurred \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero
\end{tabular} \\
\hline DMUL & \begin{tabular}{l}
Double word signed multiply \\
32 bits \(*\) 32 bits \(\rightarrow\) 64 bits \\
DW1 \(*\) DW0 \(\rightarrow\) DW0,DW1 \quad ie. \quad R3,R2 \(*\) R1,R0 \(\rightarrow\) R1,R0,R3,R2 \\
NOTE: The order of the answer words is as follows: \\
MSW \(\rightarrow\) R2 \\
MSW \(-\) 1 \(\rightarrow\) R3 \\
MSW \(-\) 2 \(\rightarrow\) R0 \\
MSW \(-\) 3 \(\rightarrow\) R1 \quad (LSW)
\end{tabular} \\
\hline
\end{tabular}
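
The signed divide behavior summarized above, including the 8000 / FFFF overflow case and the rule that the remainder takes the sign of the numerator, can be sketched in Python. The function `sdiv16` is a hypothetical model, not the microcode: truncating division is assumed because the remainder carries the numerator's sign, and a refused divide by zero is modeled by returning no quotient or remainder with only the V flag reported.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's complement value."""
    return word - 0x10000 if word & 0x8000 else word

def sdiv16(r3, r2):
    """Model of R3 / R2: return (quotient word, remainder word, flags)."""
    n, d = to_signed16(r3), to_signed16(r2)
    if d == 0:
        return None, None, {"V": 1}          # divide by zero refused
    q = abs(n) // abs(d)                     # truncate toward zero
    if (n < 0) != (d < 0):
        q = -q
    rem = n - q * d                          # remainder has the sign of the numerator
    q_word = q & 0xFFFF
    flags = {
        "V": int(not -0x8000 <= q <= 0x7FFF),  # quotient does not fit in 16 bits
        "N": (q_word >> 15) & 1,
        "Z": int(q_word == 0),
    }
    return q_word, rem & 0xFFFF, flags

if __name__ == "__main__":
    print(sdiv16(0x8000, 0xFFFF))   # -32768 / -1 overflows
    print(sdiv16(0xFFF9, 0x0002))   # -7 / 2
    print(sdiv16(0x0007, 0x0000))   # refused
```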

The reason for this seemingly odd order concerns the results that are desired in DW0 (R0, R1) at the end of the operation. The desired results of 32-bit math operations are nearly always 32-bit answers. However, a 32-bit * 32-bit multiply can generate up to 64 bits. Therefore, the least significant 32 bits of the answer are stored in DW0, where the answer is expected on all double word (DW) instructions. The most significant 32 bits must be stored in DW1, hence the seemingly reversed order of storage. If the V flag = 0 at the completion of an operation, then only the 32 bits in DW0 are significant and the user program can store this 32-bit double word without fear of losing significant bits. So, in the normal situation where only the least significant 32 bits of the answer are desired and the more significant 32 bits do not contain significant bits, the answer is where the normal convention specifies, in DW0. If the V flag is found set and it is desirable to save the 64-bit result rather than go to an error routine, a simple DXCH will exchange the contents of DW1 and DW0 and leave the 64-bit answer in a logical order with the MSW in R0 and the LSW in R3. It can then be buffered with any of the floating point register 0 buffer instructions.

If \(V=1\) then the answer has greater than 32 bits of significance.

If \(N=1\) then the answer is negative.

If \(Z=1\) then the answer is zero.

\begin{tabular}{|c|c|}
\hline DDIV & \begin{tabular}{l}
Double word signed divide (Two's complement notation) \\
32 bits / 32 bits \(\rightarrow\) 32-bit answer and 32-bit remainder \\
DW1 / DW0 \(\rightarrow\) DW0 \quad Remainder \(\rightarrow\) DW1 \\
If \(V=1\) then attempted divide by zero was refused, or overflow \\
If \(N=1\) then answer is negative \\
If \(Z=1\) then answer is zero (DW0 \(=0\); DW1 not tested)
\end{tabular} \\
\hline DCMP & \begin{tabular}{l}
Double word compare (Two's complement notation) \\
32 bits \(-\) 32 bits \(\rightarrow\) Nowhere (Update V, N, Z flags) \\
DW1 \(-\) DW0 \(\rightarrow\) Nowhere \\
The C flag is treated the same as in a single word compare \\
If \(V=1\) then a two's complement overflow occurred \\
If \(N=1\) then the difference is a negative value \\
If \(Z=1\) then the difference is zero
\end{tabular} \\
\hline DXCH & \begin{tabular}{l}
Double word exchange \\
Operates on any contents of DW1 and DW0 \\
DW1 \(\rightarrow\) TEMP \quad DW0 \(\rightarrow\) DW1 \quad TEMP \(\rightarrow\) DW0 \\
DW1 \(=\) R3 and R2 \\
DW0 \(=\) R1 and R0 \\
No flags are altered
\end{tabular} \\
\hline DNEG & \begin{tabular}{l}
Double word negate (Two's complement notation) \\
00000000 \(-\) 32 bits \(\rightarrow\) 32 bits \\
00000000 \(-\) DW0 \(\rightarrow\) DW0 \\
If \(V=1\) then a two's complement overflow occurred (DW0 \(=\) 80000000) \\
If \(N=1\) then the final value in DW0 is negative \\
If \(Z=1\) then the final value in DW0 is zero
\end{tabular} \\
\hline TST DW0 & \begin{tabular}{l}
Double word test value (Two's complement notation) \\
Set flags based upon the contents of DW0 \\
00000000 \(+\) DW0 \(\rightarrow\) Nowhere (Update V, N, Z) \\
If \(V=1\) then a valid 2's complement value overflows the LSW \\
If \(N=1\) then the value in DW0 is negative \\
If \(Z=1\) then the value in DW0 is zero
\end{tabular} \\
\hline FPADD & \begin{tabular}{l}
Floating point add Double Precision (64 bits) \\
Standard HEX-29 floating point format \\
FP1 \(+\) FP0 \(\rightarrow\) FP0 \\
If \(V=1\) then an overflow in the 2's complement exponent occurred \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero
\end{tabular} \\
\hline FPSUB & \begin{tabular}{l}
Floating point subtract Double Precision (64 bits) \\
Standard HEX-29 floating point format \\
FP1 \(-\) FP0 \(\rightarrow\) FP0 \\
If \(V=1\) then an overflow in the 2's complement exponent occurred \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero
\end{tabular} \\
\hline
\end{tabular}
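
The DMUL result ordering and the effect of a following DXCH can be traced with a short Python sketch. It is a model under stated assumptions rather than the actual microcode: the function names are hypothetical, the lower-numbered register of each double word (R0, R2) is assumed to hold the more significant word as the DMUL ordering implies, DXCH is assumed to swap R3 with R1 and R2 with R0, and V is assumed to mean that the signed product does not fit in 32 bits.

```python
def to_signed32(value):
    """Interpret a 32-bit value as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value

def dmul(regs):
    """Model of DMUL: DW1 * DW0 with the 64-bit answer spread over R0..R3."""
    dw1 = to_signed32((regs[2] << 16) | regs[3])
    dw0 = to_signed32((regs[0] << 16) | regs[1])
    product = dw1 * dw0
    p64 = product & 0xFFFFFFFFFFFFFFFF
    regs[2] = (p64 >> 48) & 0xFFFF   # MSW
    regs[3] = (p64 >> 32) & 0xFFFF   # MSW - 1
    regs[0] = (p64 >> 16) & 0xFFFF   # MSW - 2
    regs[1] = p64 & 0xFFFF           # MSW - 3 (LSW)
    return {"V": int(not -(1 << 31) <= product < (1 << 31))}

def dxch(regs):
    """Model of DXCH: exchange DW1 (R3, R2) with DW0 (R1, R0)."""
    regs[3], regs[1] = regs[1], regs[3]
    regs[2], regs[0] = regs[0], regs[2]

if __name__ == "__main__":
    regs = [0] * 16
    regs[2], regs[3] = 0x1234, 0x5678    # DW1 = 12345678
    regs[0], regs[1] = 0x0001, 0x0001    # DW0 = 00010001
    flags = dmul(regs)
    if flags["V"]:                       # more than 32 significant bits
        dxch(regs)                       # R0..R3 now read MSW..LSW
    print(flags, [f"{regs[r]:04X}" for r in range(4)])
```

When V is clear, the 32-bit answer can be used directly from DW0 and the exchange is unnecessary.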

\begin{tabular}{|c|c|}
\hline FPMUL & \begin{tabular}{l}
Floating point multiply Double Precision (64 bits) \\
Standard HEX-29 floating point format \\
FP1 \(*\) FP0 \(\rightarrow\) FP0 \\
If \(V=1\) then an overflow in the 2's complement exponent occurred \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero
\end{tabular} \\
\hline FPDIV & \begin{tabular}{l}
Floating point divide Double Precision ( 64 bits) \\
Standard HEX-29 floating point format \\
FP1 / FP0 \(\rightarrow\) FP0 \\
If \(\mathrm{V}=1\) then an overflow in the 2's complement exponent occurred or negative zero refused. \\
If \(N=1\) then the answer is negative \\
If \(Z=1\) then the answer is zero
\end{tabular} \\
\hline FPCMP & \begin{tabular}{l}
Floating point compare Double Precision (64 bits) \\
Standard HEX-29 floating point format \\
Compare the magnitudes of FP1 and FP0 \\
If (N XOR V) \(=1\), then FP1 \(<\) FP0 \\
If \(Z=1\) then FP1 \(=\) FP0
\end{tabular} \\
\hline & NOTE: WE HAVE TO FURTHER DEFINE THE WAY THIS WORKS, BUT THIS INSTRUCTION WILL SET THE FLAGS SUCH THAT THE 2's COMPLEMENT BRANCH ON THE EF PAGE WILL WORK!!! \\
\hline FPNRM & \begin{tabular}{l}
Floating point normalize Double Precision ( 64 bits) \\
Standard HEX-29 floating point format \\
The sign of the mantissa must be in the MSb of the exponent word before this instruction is executed. \\
Shift mantissa left and increment exponent until MSb of the MSW of the mantissa is one. (Operates on FP0 only) \\
If \(V=1\) there was a 2's complement overflow of the exponent \\
The C flag is trashed \\
\(N=1\) result is negative \\
\(Z=1\) result is zero
\end{tabular} \\
\hline FPXCH & \begin{tabular}{l}
Floating point exchange Double Precision (64 bits) \\
Operates on any contents of FP1 \& FP0 (R7 thru R0) \\
FP1 \(\rightarrow\) TEMP \quad FP0 \(\rightarrow\) FP1 \quad TEMP \(\rightarrow\) FP0 \\
FP1 \(=\) R7, R6, R5, R4 \\
FP0 \(=\) R3, R2, R1, R0 \\
No flags are altered
\end{tabular} \\
\hline TST FP0 & \begin{tabular}{l}
Floating point test Double Precision (64 bits) \\
Standard HEX-29 floating point format \\
Set the flags based upon the contents of FP0 \\
If \(N=1\) then the value in FP0 is negative \\
If \(Z=1\) then the value in FP0 is zero
\end{tabular} \\
\hline \multirow[t]{6}{*}{SEAL} & \begin{tabular}{l}
BASIC string variable / numeric or string matrix element search \\
The SEAL instruction provides a very flexible way to rapidly and efficiently search linked lists for a particular entry. In each entry in the list, the first two 16-bit words are ordered as follows: The first word of each entry is the link offset to the next entry in the linked list. The second word is the entry name word. Any 16 -bit value can be used in this field.
\end{tabular} \\
\hline & The name of the entry to be searched for must be put in the accumulator (RA) before this instruction is executed. The format of the instruction is as follows: \\
\hline & SEAL F,Md where F is the literal binary value 1111. \\
\hline & The destination field of the instruction (Md) specifies the register that must point to the beginning of the linked list. Starting at this point, this instruction will link its way thru the list looking for a match between the word after the link offset word (the entry name) and the contents of the accumulator (RA). \\
\hline & At the completion of the instruction, the \(Z\) flag indicates the results of the instruction in the following manner: \\
\hline & \begin{tabular}{l}
\(Z=1 \quad\) No match was found in list (End of list reached) \\
\(\mathrm{Z}=0 \quad\) A match was found and Md is pointing to the word after the entry name that matched the accumulator
\end{tabular} \\
\hline
\end{tabular}


Since the link offset word is a two's complement value, it can link to any other location in memory. The link offset is equal to the difference between the address of the next link offset word and the address of the current link offset word, minus one.

Note that this instruction can easily be used to search linked lists with entry names that are much longer than 16 bits. For example, if the entry names to be matched are 2 words long, all that need be done is to compare the word at which the pointer is aimed with the second word of the desired variable name. If it matches, then the pointer now points to the first element in the list after the double word entry name. If it does not match, the search can be continued by backing up the pointer to the link offset of the current entry and re-executing the SEAL instruction.

At the completion of the instruction, the contents of the register specified by the Md field in the instruction will contain the address of the word AFTER the variable name in the list entry that matched the one in the accumulator (RA). At the completion of the instruction the \(Z\) flag will indicate the results of the instruction execution. If the \(Z\) flag is at a zero level, the search was successful and the pointer to the table (Md) contains the appropriate value. On the other hand, if the \(Z\) flag is set to a one level, no match to the variable name in the accumulator was found anywhere in the linked list.
LO VN da da ... da da LO VN da ... da da LO VN ...

LO \(=\) Link Offset word
VN = Variable Name word
\(\mathrm{da}=\) data entries irrelevant to instruction
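
The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the HEX-29 microcode: memory is modeled as a list of 16-bit words, the function name `seal` and its arguments are hypothetical, and, because the text does not say how the end of the list is marked, the sketch assumes that a link offset of zero marks the last entry.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a two's complement value."""
    return word - 0x10000 if word & 0x8000 else word


def seal(memory, md, ra):
    """Search a linked list for the entry whose name word equals ra.

    memory: list of 16-bit words (hypothetical model of main memory)
    md:     address of the first link offset (LO) word
    ra:     name word held in the accumulator
    Returns (z_flag, md). Z = 0 means a match was found and md is the
    address of the word after the matching name word; Z = 1 means no match.
    """
    while True:
        link = to_signed16(memory[md])
        if memory[md + 1] == ra:          # word after the LO word is the name
            return 0, md + 2
        if link == 0:                     # ASSUMPTION: zero offset ends the list
            return 1, md
        # next LO address = current LO address + link offset + 1
        md = (md + link + 1) & 0xFFFF


# Three entries at addresses 0, 4 and 7: LO, VN, then data words.
memory = [3, 0x4141, 10, 11,
          2, 0x4242, 20,
          0, 0x4343, 30]
z, md = seal(memory, 0, 0x4242)
```

With this layout the link offset of the first entry is 4 - 0 - 1 = 3, which follows the "difference minus one" rule given in the text.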
SEAF BASIC fixed link offset variable search
The SEAF instruction provides a very flexible way to rapidly and efficiently search lists for a particular entry. It differs slightly from the SEAL instruction in that the link offset word is not embedded in the list entries. Instead, this instruction assumes that all list entries are of the same length (even though the internal formats may vary). The value of the link offset is the immediate word following the SEAF op code word.
Perhaps the most obvious use of this instruction is for searching a numeric variable list for a specific variable name followed by the value. The list entries can be any length, so single and double word integers and floating point lists can all be handled with equal ease, but not all with the same instruction, since the list entries will not be the same length for all of these.
The link offset word following the instruction is a two's complement number. Therefore, any fixed length can be searched forwards or backwards in memory. The link offset constant equals the number of words in each list entry, or its 2's complement for a backwards search.
Again, the variable name word to be searched for must be put into the accumulator (RA) before the SEAF instruction is executed, and the register specified by the destination field (Md) must point to the first element of the list. The form of this instruction is shown below:
SEAF \(0, \mathrm{Md} \quad\) where \(0=\) binary 0000
Scan for word
The SCNW instruction is of the following form:
SCNW Ms,Md
This instruction scans a table of words (pointed to by Rs) for a match with the contents of the accumulator. Each time a word is fetched from the table, Rd is incremented. If Rd contains zero at the beginning of the instruction, then it will contain the number of words searched in the source table before a match with the accumulator occurred.

Alternatively Rd may contain a pointer to another table. When a match between the accumulator and the source table occurs, Rd will point to a corresponding entry in the 'destination' table.

If the source list pointer and the destination list pointer are the same, then the two tables are interleaved; i.e., the combined list would start:

Source list word \#1
Destination list word \#1
Source list word \#2
Destination list word \#2
Source list word \#3
etc.
etc.

This instruction can be very useful in command processing routines and for searching lists that are not linked within the list itself (see SEAL and SEAF).

The last entry in the source list must be a zero. If no match is found before this zero word, then the \(Z\) flag is set. If the \(Z\) flag is not set, then a match was found and the pointers are valid. This instruction is interruptible on a word by word basis.
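
A minimal Python sketch of the scan may help make the pointer behavior concrete. It is a model under assumptions rather than a description of the actual microcode: the names `scnw`, `memory`, and `regs` are hypothetical, the registers are assumed to be distinct, and Rd is assumed to advance only after a word fails to match, so that on a match it holds the count of words passed over (or points at the corresponding destination entry).

```python
def scnw(memory, regs, rs, rd, accumulator):
    """Scan a zero-terminated word table for the accumulator value.

    memory: list of 16-bit words (hypothetical model of main memory)
    regs:   dict of register values; regs[rs] points at the source table
    Returns the Z flag: 1 if the terminating zero was reached, 0 on a match.
    """
    while True:
        word = memory[regs[rs]]
        if word == 0:                 # last entry in the source list is zero
            return 1
        if word == accumulator:
            return 0
        # ASSUMPTION: both pointers advance only after a failed comparison.
        regs[rs] = (regs[rs] + 1) & 0xFFFF
        regs[rd] = (regs[rd] + 1) & 0xFFFF


# Source table at address 0, parallel 'destination' table at address 10.
memory = [0x41, 0x42, 0x43, 0x00] + [0] * 6 + [100, 200, 300]

count_regs = {"Rs": 0, "Rd": 0}       # Rd starts at zero: acts as a counter
z_count = scnw(memory, count_regs, "Rs", "Rd", 0x43)

table_regs = {"Rs": 0, "Rd": 10}      # Rd starts as a table pointer
z_table = scnw(memory, table_regs, "Rs", "Rd", 0x42)
```

In the second call Rd ends up pointing at the destination word that corresponds to the matched source word, which is the command-dispatch use the text mentions.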

Scan for byte
The scan for byte instruction (SCNB) works identically to the scan for word instruction except that the source list contains bytes packed into words. Thus the source list is only half as long as the destination list (if there is one).

Note that both lists must start on word boundaries. Only the low byte of the accumulator is used in the compare with the source bytes. The contents of the accumulator are not affected by the instruction. This instruction is interruptible on every other byte that is compared. The Z flag has the same meaning as for the SCNW instruction.

\section*{Instruction Matrix}

A convenient way to present all of the basic op-codes of the HEX-29 CPU is by way of an 'instruction matrix'. The eight-bit op-code in the upper byte is broken into two nibbles. The most significant nibble of the op-code appears on the left side of the matrix shown in Figure 7. The lower nibble appears along the top row. The second matrix, shown in Figure 8, is called the 'extended function' matrix. In the HEX-29 CPU, the low byte of the instruction word is interpreted as an 'extended function' op-code if the upper byte is 'EF' hex.
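
The row and column lookup just described amounts to simple bit slicing. The following Python sketch shows one way to locate an instruction word in the two matrices; the function name `matrix_position` and the returned tuple layout are illustrative choices, not part of the HEX-29 definition.

```python
EXTENDED_FUNCTION = 0xEF  # upper byte that selects the extended function matrix


def matrix_position(instruction_word):
    """Return (figure, row, column) for a 16-bit instruction word.

    The upper byte is the op-code. Its high nibble selects the row and its
    low nibble the column of the basic matrix (Figure 7). If the upper byte
    is EF hex, the low byte is decoded the same way in Figure 8.
    """
    upper = (instruction_word >> 8) & 0xFF
    lower = instruction_word & 0xFF
    if upper == EXTENDED_FUNCTION:
        return ("Figure 8", lower >> 4, lower & 0xF)
    return ("Figure 7", upper >> 4, upper & 0xF)


basic = matrix_position(0x0123)      # row 0, column 1 of Figure 7
extended = matrix_position(0xEF02)   # row 0, column 2 of Figure 8
```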

\section*{Memory Management}

The HEX-29 incorporates a sophisticated memory management structure. Though clean and elegant in implementation, this circuitry greatly extends the capabilities of the processor. It is transparent to users who do not require its many features, yet it is vital to many very important applications, most significantly the support of multi-user, multi-task, time-sharing operations.
To all programs executing on the HEX-29, all memory addresses are 16 bits long. But before these 16 lines reach the system bus, they pass through the memory management section of the HEX-29 CPU. In this circuitry, the most significant four bits (A15-A12) are 'mapped' into eight bits on the bus: a 'write-protect' bit (WP) and seven address lines (A18-A12). The net increase of three address bits expands the total addressable memory space to 512k words or 1 Megabyte. The WP bit is used to write protect the memory in blocks as desired by the executive program.
Since each of the 16 locations in the memory map represents a \(4 k\) word block (or page), up to 64k words can be addressed by a program at any time. Any location in the memory map may contain any 8-bit value, so memory that is contiguous to a program need not be contiguous in physical memory. For clarity, Figure 9 shows schematically how this 'memory mapping' works.
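
As a rough illustration of the translation, the Python sketch below maps a 16-bit program address to a 19-bit physical address. The names are hypothetical, and the position of the WP bit within the 8-bit map entry is not given in the text; the sketch assumes WP is the most significant bit and the seven page bits (A18-A12) occupy the rest.

```python
WP_MASK = 0x80     # ASSUMPTION: write-protect bit is bit 7 of a map entry
PAGE_MASK = 0x7F   # ASSUMPTION: bits 6-0 supply physical lines A18-A12


def translate(logical_address, memory_map):
    """Translate a 16-bit logical address through a 16-entry memory map.

    Returns (physical_address, write_protected).
    """
    entry = memory_map[(logical_address >> 12) & 0xF]   # A15-A12 select the entry
    offset = logical_address & 0x0FFF                   # A11-A0 pass through
    physical = ((entry & PAGE_MASK) << 12) | offset
    return physical, bool(entry & WP_MASK)


memory_map = [0x00] * 16
memory_map[1] = 0x85          # logical page 1 -> physical page 5, write protected
memory_map[2] = 0x03          # logical page 2 -> physical page 3, writable

protected = translate(0x1234, memory_map)
writable = translate(0x2FFF, memory_map)
```

Logical pages 1 and 2 are adjacent to the program, yet they land in physical pages 5 and 3, which shows why contiguous program memory need not be contiguous physically.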
The low 4k words of physical address space are reserved for the nucleus of an operating system, also called an executive or supervisor program. This is called physical page zero. The contents of the memory map can only be altered if the low location of the memory map contains all zeros. Since this is synonymous with the physical page zero address block, only the executive program is able to change the contents of the memory map. And since all I/O devices and channel control blocks are located in physical page zero, all I/O must also be done through the executive program. Likewise, all hardware and software interrupts invoke the supervisor automatically.
Because of this simple but fool-proof security scheme, complete protection of all users' memory space and I/O devices can easily be maintained by the executive program.
Also note that the supervisor program can safely make programs that are re-entrant available to several users simultaneously as long as it write protects the code. Since user programs are often no larger than the host program under which they are running, this technique can result in a saving of \(30 \%\) to \(50 \%\) in system memory usage.

Occasionally, for special purposes, a single user may wish sole access to the entire resources of the system. Examples would include programs too large to run in a single user's 128k bytes of memory, or perhaps a new I/O access method. In any case, it is possible for a single user on the system to gain complete control
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & A & B & C & D & E & F \\
\hline 0 & MVN R, R & MOV R, R & ADD R, R & ADC R, R & SUB R, R & SBC R, R & AND R, R & IOR R, R & XOR R, R & CMP R, R & RSUB R, R & INC R, R & DEC R, R & COM R, R & NEG R, R & SWP R, R \\
\hline 1 & MVN M+R & MOV M+R & ADD M+R & ADC M+R & SUB M+R & SBC M+R & AND M+R & IOR M+R & XOR M+R & CMP M+R & RSUB M+R & INC M, R & DEC M, R & COM M, R & NEG M, R & SWP M, R \\
\hline 2 & MVN I+R & MOV I+R & ADD I+R & ADC I+R & SUB I+R & SBC I+R & AND I+R & IOR I+R & XOR I+R & CMP I+R & RSUB I+R & INC I+R & DEC I+R & COM I+R & NEG I+R & SWP I+R \\
\hline 3 & MVN Z, R & MOV Z, R & ADD Z, R & ADC Z, R & SUB Z, R & SBC Z, R & AND Z, R & IOR Z, R & XOR Z, R & CMP Z, R & RSUB Z, R & INC Z, R & DEC Z, R & COM Z, R & NEG Z, R & SWP Z, R \\
\hline 4 & MVN X, A & MOV X, A & ADD X, A & ADC X, A & SUB X, A & SBC X, A & AND X, A & IOR X, A & XOR X, A & CMP X, A & RSUB X, A & INC X, SC & DEC X, SC & COM X, SC & NEG X, SC & SWP X, SC \\
\hline 5 & MVN M+M & MOV M+M & ADD M+M & ADC M+M & SUB M+M & SBC M+M & AND M+M & IOR M+M & XOR M+M & CMP M+M+ & RSUB M+M & & & & & \\
\hline 6 & LDI2 R, R & CLR2 R, R & PSH2 R, R & POP2 R, R & XCH R, R & ASR R, R & ASL R, R & ROR R, R & ROL R, R & LSR R, R & RCL R, R & CSL R, R & VSR R, R & VSL R, R & DSR R, R & DSL R, R \\
\hline 7 & BTS R, H & BTC R, H & BTI R, H & BTT R, H & MVH R, H & MVHN R, H & ADDH R, H & SUBH R, H & CMPHA R, H & CMPH R, H & CMPHN R, H & FMM R, H & VSR R, H & VSL R, H & EXR R & JAM R, R, W \\
\hline 8 & BTS Z, H & BTC Z, H & BTI Z, H & BTT Z, H & ANDB B, A & IORB B, A & XORB B, A & SWT R, Z & MOV A, Y & MOV Y, A & LDBI R, R & STBI R, R & XCH M / DUP M & COMF B & MVN CC, R & MOV R, CC \\
\hline 9 & MVB B, A & MVB Z, Z & MVB Z, R & MVB R, Z & LDB M, M & STB M, M & ADDB B, A & SUBB B, A & CMPBA B, A & CMPB B, A & CMPB Z, Z & CMPB R, Z & CMPB R, R & CMPB M, M & SETF B & CLRF B \\
\hline A & MOV M, R & MOV I, R & MOV M+M+ & MOV M+I+ & MOV M+Z & MOV M+M- & MOV A, X & MOV Z, I+ & MOV Z, Z & MOV Z, M- & MOV RD, Y & MOV RB, Y & MOV R9, Y & MVM RR, Y & LDINT M+H & LDINT R, H \\
\hline B & MOV R, M & MOV R, I & MOV R, M+ & MOV R, I+ & MOV R, Z & MVN R, M- & MOV R, M- & MOV I+I+ & MOV I+, Z & MOV I+M- & MOV Y, RD & MOV Y, RB & MOV Y, R9 & MVM Y, R, R & RMM R, R & RDINT R, H \\
\hline C & MVM FP0 M+, DW0 & MVM FP1 M+, DW1 & MVM FP0 DW0, M- & MVM FP1 DW1, M- & MVM FP0 Z, DW0 & MVM FP1 Z, DW1 & MVM FP0 DW0, Z & MVM FP1 DW1, Z & MVM X, FP0 & MVM X, FP1 & MVM FP0, X & MVM FP1, X & MVM X, DW0 & MVM X, DW1 & MVM DW0, X & MVM DW1, X \\
\hline D & JFS B & JFC B & CFS B & CFC B & JIFS B & JIFC B & CIFS B & CIFC B & RTFS B & RTFC B & JMP R & CALL R & CALL X & CALL Y & CALL Z & CEX B \\
\hline E & BR +L & BSR +L & BC \(\pm\)B & BNC \(\pm\)B & BV \(\pm\)B & BNV \(\pm\)B & BN \(\pm\)B & BNN \(\pm\)B & BZ \(\pm\)B & BNZ \(\pm\)B & BH \(\pm\)B & BNH \(\pm\)B & DBNZ \(\pm\)B & BZD \(\pm\)B & CBB \(\pm\)B & EF \\
\hline F & BR -L & BSR -L & CALLO B & JMPO B & & & & & BMWF M, M & BMWR M, M & SEAL M / SEAF M & SCNW M, M & SCNB M, M & TTWB M, M & TTBB M, M & NOP B \\
\hline
\end{tabular}

Figure 7. HEX-29 Instructions.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & A & B & C & D & E & F \\
\hline 0 & FADD & FSUB & FMUL & FDIV & FCMP & FXCH & FNRM FP0 & FTST FP0 & & & & & & & & \\
\hline 1 & DADD & DSUB & DMUL & DDIV & DCMP & DXCH & DNEG DW0 & DTST DW0 & & & & & & & & \\
\hline 2 & SMUL & SDIV & UMUL & UDIV & & & & & & & & & & & & \\
\hline 3 & PSHF & PSH8 & PSHD & LMM A & RTI & & & & & & & & & & & \\
\hline 4 & POPF & POP8 & POPD & & BPT & & & & & & & & & & & \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline B & MVM FP0, FP1 & MVM FP1, FP0 & MVM ABS, FP0 & MVM ABS, FP1 & MVM FP0, ABS & MVM FP1, ABS & & & & & & & & & & \\
\hline C & MVM DW0, DW1 & MVM DW1, DW0 & MVM ABS, DW0 & MVM ABS, DW1 & MVM DW0, ABS & MVM DW1, ABS & & & & & & & & & & \\
\hline D & CALL REL & CALL ABS & & & & & & & & & & & & & & \\
\hline E & BGT & BGE & BLT & BLE & & & & & & & & & & & & \\
\hline F & BMBF & BMBR & SRCH & & & & & & & & & & & & & \\
\hline
\end{tabular}

Figure 8. HEX-29 EF Instructions.


Figure 9. Memory Mapping Program Address (Logical Address).
and access to the system by assigning himself as the executive program. This can only be accomplished after a system reset. Hence only those with physical access to the computer (and who have a reset key) can accomplish this operation. This user is then empowered with all of the features and capabilities of the machine with no limitations. Direct access to all of the system I/O devices, the entire interrupt structure, the memory map, etc., is then at the command of the single user in the executive or supervisor mode.

Most often, each user needs only one or two 4k pages of memory in addition to the host program, which is probably shared. Thus it would be very wasteful if each user were to have access to a full 64k words of physical memory space. For this reason, a page of physical memory has a special designation in the system.

The highest possible physical address block when write protected is called the 'invalid access' block. Whenever a user accesses memory that the supervisor has mapped into the invalid access block, the processor 'traps' to a special location in the supervisor program called the 'invalid access trap'. This occurs before the current machine cycle is completed. This is treated identically to an interrupt by the processor except that the current instruction is not completed.
Any number of actions can be taken by the supervisor at this time. This will usually depend upon the resources of the machine and the circumstances under which the problem arose. For example, the executive program could inform the program that its memory space had been exceeded, or perhaps just allocate another block of memory to that user's memory map and continue the execution of the offending program. A more detailed discussion of the sequence of events that takes place upon an invalid access appears in the section on the interrupt structure of the HEX-29 CPU.
The highest physical address page, when not write protected, is called the 'dead page'. No action of any kind takes place in this block and there is no memory there for the program to reference. Any number of pages from any number of users may be assigned to this physical page without fear of interaction. This is the block that will normally be assigned by the executive program to all user areas that are not needed or are not to be used.
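
The two special uses of the highest physical page can be summarized in a short Python sketch. It is illustrative only: the function name and the returned labels are hypothetical, and the sketch assumes the write-protect bit is bit 7 of a map entry with the 7-bit physical page number in the remaining bits, a layout the text does not specify.

```python
WP_MASK = 0x80       # ASSUMPTION: write-protect bit is bit 7 of a map entry
PAGE_MASK = 0x7F     # ASSUMPTION: bits 6-0 hold the physical page number
HIGHEST_PAGE = 0x7F  # highest of the 128 physical 4k-word pages


def classify_map_entry(entry):
    """Describe how an access through this memory map entry is treated."""
    page = entry & PAGE_MASK
    write_protected = bool(entry & WP_MASK)
    if page == HIGHEST_PAGE:
        # Same physical page, two meanings, selected by the WP bit.
        return "invalid access" if write_protected else "dead page"
    return "write protected" if write_protected else "read/write"


unused_area = classify_map_entry(0x7F)    # dead page: no memory, no action
guard_area = classify_map_entry(0xFF)     # traps to the supervisor
shared_code = classify_map_entry(0x85)    # ordinary page, write protected
```

A supervisor following the scheme in the text would fill unneeded map entries with the dead page value and use the invalid access value where it wants to be notified that a program has outgrown its allocation.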

\section*{Interrupt Structure}

The HEX-29 CPU contains a powerful interrupt structure. As with memory management, this aspect of the CPU operation is largely transparent to users of the system. In most applications the HEX OPERATING SYSTEM FOR TIMESHARING (HOST) program services all interrupts. Nonetheless, it is useful to know the basic structure of the interrupt system. The three types of interrupts serviced by the HEX-29 CPU are examined in the following paragraphs.
The hardware interrupts are caused by signals from physical devices outside of the processor. These signals, generated by peripherals, their controllers, or the real time clock, serve to notify the CPU of some condition or requirement of the interrupting device.
The HEX-29 CPU has eight hardware interrupts. They are individually maskable and are prioritized into eight levels. Each priority level has its own vector associated with it. In other words, each interrupt level has a corresponding memory location through which program control is passed upon that level interrupt. These memory locations are within the defined executive page (physical page 0) and thus all interrupts cause the HEX-29 to switch into executive mode automatically. The eight hardware interrupt levels and the associated memory locations are shown below.
\begin{tabular}{|c|c|}
\hline Hardware Interrupt Level & Memory Location of Vector \\
\hline 7 (Highest Priority) & \(0407_{\mathrm{H}}\) \\
6 & \(0406_{\mathrm{H}}\) \\
5 & \(0405_{\mathrm{H}}\) \\
4 & \(0404_{\mathrm{H}}\) \\
3 & \(0403_{\mathrm{H}}\) \\
2 & \(0402_{\mathrm{H}}\) \\
1 & \(0401_{\mathrm{H}}\) \\
0 (Lowest Priority) & \(0400_{\mathrm{H}}\) \\
\hline
\end{tabular}

So, for example, when an interrupt occurs on level 3, the HEX-29 CPU will enter supervisor mode, save the user's PC and SP, and call the appropriate service routine at the address stored in memory location \(0403_{\mathrm{H}}\).
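
The selection of a vector location can be sketched as follows. This Python fragment is an illustration under assumptions, not the CPU's actual logic: the function names are hypothetical, a set bit in the mask is assumed to mean "enabled" (the text says only that the levels are individually maskable), and level 0 is assumed to follow the same address pattern as levels 1 through 7.

```python
VECTOR_BASE = 0x0400   # level n vectors through location 0400 hex + n


def highest_pending_level(pending, enabled):
    """Return the highest-priority pending, unmasked level (7..0), or None.

    pending, enabled: 8-bit values, bit n corresponding to level n.
    ASSUMPTION: a 1 bit in 'enabled' means the level is not masked.
    """
    active = pending & enabled & 0xFF
    if active == 0:
        return None
    return active.bit_length() - 1      # position of the most significant 1 bit


def vector_location(level):
    """Memory location holding the service routine address for a level."""
    if not 0 <= level <= 7:
        raise ValueError("hardware interrupt levels are 0 through 7")
    return VECTOR_BASE + level


# Levels 1 and 3 are requesting service; all levels are enabled.
level = highest_pending_level(0b00001010, 0xFF)
location = vector_location(level)
```

Here level 3 wins over level 1, and the service routine address is fetched from location 0403 hex, matching the example in the text.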
Normally, each hardware interrupt level is reserved for a class of devices such as hard disc controllers, floppy disc controllers, serial channels, etc. If, for example, there are eight serial devices that are interrupting on level 0, the service routine is required to locate the one (or more) devices that are requesting service on that interrupt level and process them accordingly. This could be done by polling all the serial devices whenever the interrupt is received. A more efficient technique, used in the HEX-29 system, is to further prioritize the like devices on a given interrupt level. Then when an interrupt occurs, a vector is read by the executive program that instantly informs it of the highest priority device requesting service on that level. When that device has been serviced, the vector is read again to locate any other devices in need of service, and normal program execution resumes once all devices have been serviced.
A software interrupt is an instruction that, when executed, causes an interrupt to occur. The mnemonic used for this op-code in the HEX-29 CPU is 'CEX', which stands for 'call executive'. This instruction passes an 8-bit vector to the 'HOST' operating system which is used to determine the action requested by the program executing the CEX. Except that this interrupt is caused by a program rather than a physical device, the CEX operates in the same manner as a hardware interrupt. It vectors through memory location 040C. A pseudo software interrupt is the breakpoint 'BPT' instruction which vectors through memory location 040B. The BPT instruction does not pass an 8-bit vector to the executive and is thus useful in program debugging.
The third type of interrupt is called a 'trap'. A trap takes place when certain conditions occur that require the processor's immediate attention. For example, if the program currently running on the CPU tries to execute an op-code for which there is no defined instruction, an 'invalid instruction trap' occurs. This is essentially a service to notify a user that his program was defective and that an attempt was made to execute an op-code which has no meaning. These locations are left blank in the instruction matrix since they can subsequently be defined as new instructions. This 'trap' vectors through memory address 040D and acts identically to all other interrupts. The only other trap in the HEX-29 CPU is the 'invalid memory access' condition. This is discussed in more detail in the previous section on memory management. The 'invalid memory access' trap vectors through memory address 0408.

Table 4 shows the memory locations that are defined in the HEX-29 for interrupt handling.

TABLE 4. INTERRUPT MEMORY LOCATIONS.
\begin{tabular}{|c|l|}
\hline \begin{tabular}{c} 
Memory \\
Location
\end{tabular} & \multicolumn{1}{c|}{ System Defined Uses } \\
\hline 040F & Reserved \\
040E & Reserved \\
040D & Vector for invalid instruction trap \\
040C & Vector for call executive (CEX) instruction \\
040B & Vector for breakpoint (BPT) instruction \\
040A & Temporary storage for user stack pointer \\
0409 & Temporary storage for executive stack pointer \\
0408 & Vector for invalid memory access trap \\
0407 & Vector for hardware interrupt level 7 \\
0406 & Vector for hardware interrupt level 6 \\
0405 & Vector for hardware interrupt level 5 \\
0404 & Vector for hardware interrupt level 4 \\
0403 & Vector for hardware interrupt level 3 \\
0402 & Vector for hardware interrupt level 2 \\
0401 & Vector for hardware interrupt level 1 \\
0400 & Vector for hardware interrupt level 0 \\
\hline
\end{tabular}

Again, note that all interrupts are processed identically so that the one return from interrupt (RTI) instruction properly terminates all interrupt service routines.
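The vector layout of Table 4 can be summarized in a few lines of Python. This is only an illustrative sketch, not HEX-29 microcode: the function names are hypothetical, and the polarity of the mask (a 1 bit meaning "level enabled") is an assumption, since the text does not specify how the mask is encoded.

```python
VECTOR_BASE = 0x0400  # location of the level 0 vector (Table 4)

# Vectors for the software interrupts and traps listed in Table 4.
TRAP_VECTORS = {
    "invalid_memory_access": 0x0408,
    "breakpoint": 0x040B,       # BPT instruction
    "call_executive": 0x040C,   # CEX instruction
    "invalid_instruction": 0x040D,
}


def hardware_vector_location(level):
    """Return the memory location holding the vector for hardware level 0..7."""
    if not 0 <= level <= 7:
        raise ValueError("hardware interrupt level must be 0..7")
    return VECTOR_BASE + level


def highest_pending_level(pending, enabled):
    """Pick the highest-priority pending level that is not masked.

    pending and enabled are 8-bit integers; bit n corresponds to level n.
    Assumption: a 1 bit in `enabled` means the level is unmasked.
    Returns None when no eligible interrupt is pending.
    """
    eligible = pending & enabled & 0xFF
    if eligible == 0:
        return None
    return eligible.bit_length() - 1  # index of the most significant set bit
```

For example, with levels 3 and 5 pending and all levels enabled, `highest_pending_level(0b00101000, 0xFF)` selects level 5, whose vector is read from `hardware_vector_location(5)`, that is, location 0405.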

\section*{DMA/REFRESH CONTROL}

In order that an efficient multi-user or multi-task system be implemented, it is necessary that the processor not be burdened with the relatively slow transfer of programs and data between system memory and mass storage devices such as floppy and hard disks. For this reason, the controllers for these devices are designed with a high degree of intelligence and self-reliance. These controllers take virtually all of the burden of mass storage transfers upon themselves. This frees the HEX-29 CPU to execute programs for all users not waiting for these mass storage transfers to take place. Because these controllers are essentially separate special purpose microprogrammed CPUs, they are often called 'peripheral processors', 'channel processors', or just 'channels'.

For this scheme to be effective, both the CPU and the channel processors must be accessing system memory concurrently. Fortunately, the inherent structure and operation of the HEX-29 CPU is eminently suited to this requirement.
In every instruction there is at least one machine cycle during which the HEX-29 CPU is decoding or internally executing an instruction. During these machine cycles the CPU does not use the system bus; the system bus and memory are available for access by devices other than the HEX-29 CPU. Such a cycle is called a 'free DMA' cycle or 'bus available' cycle. During these machine cycles a channel processor may read or write memory without interfering with, or requiring assistance from, the HEX-29 CPU. The act of accessing system memory by any device other than the CPU is called 'direct memory access' or DMA since the channel processor is directly accessing system memory without CPU assistance or intervention.
Resident in the HEX-29 CPU is a very clean, very powerful multi-level prioritized DMA structure. Within this structure up to ten groups of devices can share the system bus on a priority basis. Normally the priority levels are assigned on the basis of transfer speeds . . . the faster the device is able to support memory transfers, the higher the priority it is assigned. In this manner several channel processors can access system memory concurrently at the intervals they require. The DMA structure of the HEX-29 CPU can support very high combined transfer rates with multiple DMA devices using this technique. With high speed memory, the HEX-29 CPU need not even slow down its program execution to support a concurrent combined DMA transfer rate of 4 Megabytes per second. With slower memory, this figure drops to about 2 to 3 Megabytes per second. Even this slower rate corresponds to concurrent DMA by one high speed hard disk plus several floppy disks plus room to spare. Still, the CPU can be halted, if necessary, to achieve combined DMA rates of up to 12 Megabytes per second maximum.
The support of dynamic memory in the HEX-29 system is simplified by signals associated with this DMA structure. Whenever there are no devices requesting the bus for DMA, a signal on the bus indicates this condition. Dynamic memory refresh controllers can take advantage of these unused free DMA cycles to refresh internal dynamic RAM chips if desired. Even when very heavy use of the bus by DMA devices occurs, it is unlikely that too few of these unused free DMA cycles will be available for the dynamic memory refresh controllers. In this event, however, another signal can be used to disable all other DMA priorities and allow the refresh controllers as much time as is required.

\section*{SYSTEM BUS AND TIMING}

When specifying the bus signals and their timing relationships during the early design stage of the HEX-29 CPU, utmost attention was paid to simplicity and reliability. The result is that there are very few signals required to interface to the bus properly, and the timing requirements are quite straightforward and easy to meet.

The following section is a description of the mnemonic names and functions of the HEX-29 system bus signals:

\section*{System Bus}

A18-A0 (Address Bus)

Three-state outputs. A18-A0 are the 19 physical address lines of the HEX-29 system address bus. A18 is the most significant bit, A0 is the least significant bit. These outputs are three-stated whenever the bus is available ( \(\overline{\mathrm{BA}}\) is LOW).

D15-D0 (Data Bus)

Three-state and bi-directional input/outputs. D15-D0 are the 16 lines that make up the HEX-29 system data bus. D15 is the most significant bit, D0 is the least significant bit.

WP (also \(\overline{\text { WE }}\) ) (Write Protect)

Three-state output. WP is used to protect areas of memory from being written. Practically speaking this signal is active-LOW and would have been called WE (Write Enable) if not for possible confusion with the read/write signal which also must be LOW to write memory.

R/ \(\overline{\mathrm{W}}\) (Read/Write)

Three-state output. The R/ \(\overline{\mathrm{W}}\) signal determines whether a read or write operation is performed. A LOW level on the R/ \(\overline{\mathrm{W}}\) line indicates that a memory write is to be performed if \(\overline{\text { VMA }}\) (valid memory access) is also LOW when the system clock (CLK) goes LOW.

\(\overline{\text { VMA }}\) (Valid Memory Access)

Three-state output. \(\overline{\text { VMA }}\) is LOW during all cycles in which a memory access (read or write) will be performed by the processor.

CLK

Output, not three-state. CLK is the system clock. All timing in the HEX-29 system is defined relative to this signal. For convenience, the period of each machine cycle that the clock is HIGH is called \(\phi_{1}\) (phase 1) and the period that it is LOW is called \(\phi_{2}\) (phase 2). All system 'chip selects' are derived from this signal.

\(\overline{\text { SDMA }}\)

Output, not three-state. \(\overline{\text { SDMA }}\) is mnemonic for 'synchronize direct memory access'. This bus signal is LOW the cycle before DMA is permissible. The sole purpose of this signal is to notify DMA devices early of an upcoming 'free DMA' cycle. This makes it easier to 'grab the bus' very early in a 'free DMA' cycle to improve the address generation timing.

\(\overline{B A}\) (Bus Available)

Output, not three-state. \(\overline{B A}\) is LOW on all cycles during which DMA is permitted by the CPU. When \(\overline{B A}\) is LOW, all three-stateable outputs from the HEX-29 CPU card are turned off and control is relinquished to DMA devices for the current cycle. \(\overline{B A}\) is mnemonic for 'bus available'.

STR (Stretch Clock)

Input to HEX-29 CPU. When an addressed device is not fast enough to be reliably accessed (read or written) within the minimum access time of the HEX-29 CPU, it should pull the STR signal LOW. For each 40 ns that STR is held LOW, the system clock is lengthened by 40 ns, and with it the access time available to the addressed device. This signal can be held LOW for as many 40 ns increments as required to meet the access time of the addressed memory or I/O device.

\(\overline{\mathrm{CLR}}\) (Clear)

Output, not three-state. \(\overline{\mathrm{CLR}}\) is a LOW level pulse which is just a 'cleaned up' RESET signal. Any device that requires an initialization pulse should use this line.
\(\overline{\mathrm{I7}}-\overline{\mathrm{I0}}\) (Interrupts)

Inputs to HEX-29 CPU. \(\overline{\mathrm{I7}}-\overline{\mathrm{I0}}\) are the eight hardware interrupt inputs. \(\overline{\mathrm{I7}}\) is the highest priority and \(\overline{\mathrm{I0}}\) is the lowest. These inputs are negative edge catching; that is, an interrupt signal is recognized by the interrupt circuitry in the HEX-29 CPU when the line goes LOW. These lines should be driven by open collector outputs so that multiple devices can interrupt on the same priority level.

\(\overline{\mathrm{R7}}-\overline{\mathrm{R0}}\) (DMA Requests)

Inputs to HEX-29 CPU. \(\overline{\mathrm{R7}}-\overline{\mathrm{R0}}\) are the eight DMA request inputs. \(\overline{\mathrm{R7}}\) is the highest priority, \(\overline{\mathrm{R0}}\) is the lowest. These lines are active-LOW; i.e., a LOW level requests DMA time.

\(\overline{\mathrm{Q7}}-\overline{\mathrm{Q0}}\) (DMA Acknowledge)

Outputs, not three-state. \(\overline{\mathrm{Q7}}-\overline{\mathrm{Q0}}\) are the eight DMA acknowledge lines that reply to the corresponding DMA request lines ( \(\overline{\mathrm{R7}}-\overline{\mathrm{R0}}\) ). The highest priority active request is acknowledged by a LOW level on the corresponding acknowledge line. Only one of these lines will be LOW at any given time; i.e., the highest priority request gets the acknowledge.

\(\overline{N R Q}\) (No DMA Request)

Output, not three-state. \(\overline{N R Q}\) is LOW when no DMA requests are pending.

\(\overline{\text { DDMA }}\) (Disable DMA)

Input to HEX-29 CPU. When \(\overline{\text { DDMA }}\) is pulled LOW, no DMA requests are acknowledged. Essentially this line is just the highest priority DMA request line - except there is no corresponding acknowledge signal. This signal is normally reserved for dynamic memory refresh controllers. If the refresh interval is about to expire and some locations have not yet been refreshed, this line can be pulled LOW to disable all other DMA devices and assure adequate time to refresh the remaining locations. Note that \(\overline{N R Q}\) is not LOW when \(\overline{\text { DDMA }}\) is active (LOW). The \(\overline{\text { DDMA }}\) line should be driven by open collector outputs.

HALT Input to HEX-29 CPU. When pulled LOW, the HALT input will cause the processor to terminate program execution at the conclusion of the current instruction. At this time the bus will become continuously available for DMA as all three-state outputs of the HEX-29 CPU will turn off and \(\overline{B A}\) will go active (LOW). This line can be held LOW indefinitely. When released, the processor will continue program execution. This line should be driven by open collector outputs.
\(\overline{\text { FETCH }}\) (Fetch Instruction)

Output, not three-state. This signal is LOW only on memory read cycles when an instruction is being fetched from system memory. This signal is normally not used except during system development and debugging for single instruction execution.

\(\overline{\text { RESET }}\)

Input to HEX-29 CPU. This is the signal from which system reset \((\overline{\mathrm{CLR}})\) is derived. Normally this input is simply grounded with a pushbutton or keyswitch to reset the HEX-29 system.

OSC (Oscillator)

Output, not three-state. This is the crystal controlled master oscillator from which the system clock is derived. The period of this oscillator is normally 40 ns (25 MHz).
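The behaviour of the DMA request, acknowledge, \(\overline{N R Q}\) and \(\overline{\text { DDMA }}\) signals described above can be modelled in a few lines. The sketch below is a behavioural illustration in Python, not a description of the actual priority logic on the CPU card; the function name and its argument conventions are hypothetical, and gating by \(\overline{B A}\) and the system clock is deliberately left out.

```python
def arbitrate_dma(requests_low, ddma_low=False):
    """Model one evaluation of the DMA priority signals.

    requests_low: iterable of priority levels (0..7) whose request line
                  is currently pulled LOW, i.e. requesting DMA time.
    ddma_low:     True when the DDMA line is pulled LOW.

    Returns (acknowledged_level, nrq_is_low), where acknowledged_level is
    the single level whose acknowledge line goes LOW, or None.
    """
    levels = set(requests_low)
    if any(not 0 <= n <= 7 for n in levels):
        raise ValueError("DMA priority levels must be 0..7")
    if ddma_low:
        # DDMA blocks every acknowledge, and NRQ is not LOW while it is active.
        return None, False
    if not levels:
        # Nobody is requesting: no acknowledge, NRQ goes LOW.
        return None, True
    # Only the highest requesting priority is acknowledged.
    return max(levels), False
```

With levels 2 and 5 requesting, `arbitrate_dma([2, 5])` acknowledges level 5 only; level 2 must wait until level 5 releases its request line.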

\section*{System Timing}

In any microprogrammed system which must interface to a number of external devices (as a CPU must), it is critical that considerable forethought be given to the methods of inter-device communication. It is quite common to design and build devices that operate with very high degrees of reliability - only to find that overall system reliability is inadequate when the various devices are interfaced.
One of the foremost goals in designing the HEX-29 CPU was to develop an extremely reliable, easy to use, system bus definition. Simplicity and reliability go hand in hand and this is reflected in the HEX-29 system bus. Perhaps the single most important decision in this regard was to define that all memory and I/O device accesses by the processor or DMA devices would share one set of timing rules. In other words, one set of timing specifications applies to any kind of access of any device by any other device. Some systems have different timing requirements for all sorts of reasons; a few examples are listed here.
1. Memory read timing is more critical (shorter) if the memory being fetched is an instruction.
2. Variations exist in the set-up and hold times required on read memory vs. write memory cycles.
3. Memory devices and I/O devices use some different signals and timing specifications.
4. DMA devices are required to meet a different set of timing requirements than the processor.
5. Interrupt processing routines violate the normal memory access techniques.

Special cases carry special problems and should be avoided like the plague. It is always best and easiest to have all devices and situations share one set of control signals and one set of timing relationships. Another good practice put into effect on the HEX-29 CPU is the exclusive use of active-LOW bus signals. This is important in many respects. First, bipolar logic IC's can sink (pull LOW) far more current than they can source. Thus any noise spikes need to carry far more energy to force the signal into an invalid level. Secondly, all signals that three-state (turn-off) will be pulled-up (float) to the inactive level. Furthermore, this scheme tends to reduce the power required by bus signal drivers and therefore reduce heat dissipation.
Physical design is also important to system reliability. It is wise to use four layer PC cards with GND and \(\mathrm{V}_{\mathrm{CC}}\) planes as the internal layers, as do all of the HEX-29 system cards. An additional feature of the HEX-29 system bus is that all signals are interlaced with GND traces that return directly to the internal GND plane next to each bus signal. System termination should also be provided whenever signals must travel more than 18 inches. Bypass capacitors should abound on all system cards, one per three IC's as a minimum. The HEX-29 averages one per IC.
The timing of each machine cycle in a HEX-29 system is a combination of synchronous and asynchronous characteristics. Actually, all signals are synchronous with - or are synchronized by - the master oscillator from which the system clock is derived. Thus, despite the fact that some signals seem to be asynchronous, they are actually synchronized automatically with the system clock. The simplicity of this approach will become clear once the relationship of all signals to the system clock is explained.
The conventions regarding the HEX-29 system clock are very simple. All machine cycles begin when the system clock goes HIGH and end simultaneously with the beginning of the next machine cycle. The period of time that the system clock (CLK) is HIGH is called \(\phi_{1}\) (phase 1) and the period of time that it is LOW is called \(\phi_{2}\) (phase 2). See Figure 10 for clarification.


Figure 10.

During all memory and \(\mathrm{I} / \mathrm{O}\) accesses, the processor (or DMA controller) must guarantee that all address lines and control signals are valid for at least 20 ns before the end of \(\phi_{1}\) (falling edge of clock). Depending upon the addressing mode, the processor will require a variable period of time to generate a valid address. Thus it is the responsibility of the processor to control the period of \(\phi_{1}\) to meet its requirements. If no external accesses are made by the CPU, \(\phi_{1}\) and \(\phi_{2}\) will last only 80 ns each unless a DMA device takes control of the bus on that cycle and requires longer times.
Similarly, \(\phi_{2}\) is controlled by the memory and I/O devices on the bus. If none are being accessed on a particular machine cycle, no control need be exercised on the system clock and \(\phi_{2}\) will last for 80 ns. However, when accessed, many memory and I/O devices require more than 80 ns to perform a successful read or write operation. They must be able to lengthen \(\phi_{2}\) of the system clock to increase the access time appropriately. This is accomplished with the STR bus signal. When a device is accessed that requires that \(\phi_{2}\) be longer than 80 ns, it must bring STR LOW within 50 ns of the falling edge of system clock (i.e., 50 ns into \(\phi_{2}\) ). For every 40 ns that STR is held LOW, the system clock is held in its present state for an additional 40 ns. \(\phi_{2}\) can thus be extended indefinitely as required by the access time of the addressed device. \(\phi_{1}\) can also be extended in 40 ns increments with the STR signal if so required by DMA devices with slow address generation times, or the like.
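As a rough illustration of the stretch mechanism, the following Python sketch computes how long \(\phi_{2}\) lasts when a device needs more than the default 80 ns. It is a simplified model under stated assumptions: the function names are hypothetical, the device's requirement is expressed as the total \(\phi_{2}\) duration it needs, and synchronization delays in the clock circuit are ignored.

```python
import math

OSC_PERIOD_NS = 40     # period of the master oscillator
DEFAULT_PHASE_NS = 80  # default duration of phase 2


def stretch_increments(required_ns, default_ns=DEFAULT_PHASE_NS):
    """Number of 40 ns increments STR must add so the phase lasts
    at least required_ns."""
    extra = max(0, required_ns - default_ns)
    return math.ceil(extra / OSC_PERIOD_NS)


def stretched_phase_ns(required_ns, default_ns=DEFAULT_PHASE_NS):
    """Resulting phase duration, always the default plus a whole number
    of 40 ns increments."""
    return default_ns + OSC_PERIOD_NS * stretch_increments(required_ns, default_ns)
```

A device that needs 250 ns would hold STR LOW for five increments, giving a 280 ns \(\phi_{2}\); a device that needs 80 ns or less never touches STR.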
A DMA device must gain access to the bus before it can access the memory location that it desires. This is very simple. It pulls its DMA request line LOW and waits for the corresponding DMA acknowledge signal to go LOW in reply. Then, at the beginning of the first machine cycle which finds these signals plus SDMA LOW, the DMA device has been granted access to the bus and may immediately generate the appropriate signals on the address, data, and control buses to accomplish the transfer. The memory device being accessed does not care whether it is the processor or a DMA device on the bus since the bus signals and timing used by the memory card are identical for both. Thus it controls \(\phi_{2}\) with the STR signal as necessary and the access is completed in exactly the same manner as if it had been the processor controlling the bus. The Boolean equation for a DMA device gaining access to the bus follows - and Figure 11 is a schematic showing how easy the implementation can be.
\[
\begin{aligned}
& \overline{\mathrm{Q}_{X}} \cdot \overline{\mathrm{R}_{X}} \cdot \overline{\mathrm{SDMA}} \cdot \mathrm{CLK}=\text { DMA device has access for the current cycle } \\
& X=\text { any DMA priority level }
\end{aligned}
\]

The timing relationships for the HEX-29 bus are shown in Figure 12.
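The bus-grant equation can be written as a one-line predicate. The Python sketch below is one reading of that equation, taken together with the surrounding text: the device owns the bus when its acknowledge line, its request line and SDMA are all at a LOW level while CLK is HIGH. The function name and the convention that `True` means an electrically HIGH level are assumptions made for the example.

```python
def dma_has_bus(q_x, r_x, sdma, clk):
    """Return True when a DMA device at priority level X has the bus.

    All arguments are electrical levels: True = HIGH, False = LOW.
    q_x  - acknowledge line for level X (active LOW)
    r_x  - request line for level X (active LOW)
    sdma - synchronize-DMA line (active LOW)
    clk  - system clock (HIGH during phase 1)
    """
    return bool((not q_x) and (not r_x) and (not sdma) and clk)
```

A device that has requested the bus but has not yet been acknowledged (`q_x` still HIGH) gets `False`, as does any device sampled while CLK is LOW.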


Figure 11. DMA Bus Signals.

\section*{INTERNAL OPERATION}

\section*{Block Diagram}

The block diagram of the HEX-29 CPU (Figure 13) shows the following functional modules:
1. System Clock
2. Microprogram Control
3. \(\mu\) Word Memory (Control Store)
4. Am2901A Bit Slice ALU/Register Sets
5. ALU Arithmetic Carry In Control
6. Shift and Rotate Linkages
7. Condition Code Control
8. Am2901A Output Bus
a. Data Output Latches
b. Address Latches
c. Memory Management RAM
d. Condition Code Register
9. Am2901A Input Bus
a. Data Bus Input Registers
b. Byte Swap Input Registers
c. Microword Data Registers
d. Clear Byte/Bit Set Logic
e. Instruction Decode PROMs
f. Condition Code Register
10. Interrupt Control
11. DMA/Refresh Control

Sections 8 and 9 are more difficult to isolate on the block diagram since they are the buses that connect many function modules together. A full detailed schematic of the HEX-29 is shown in Figure 14; a fold out drawing at the back of the chapter. A discussion of the function of each of the above modules follows.

\section*{System Clock (Figure 15)}

All timing in the HEX-29 CPU is controlled by the system clock. The positive going edge of the system clock (LOW-to-HIGH transition) marks the end of one machine cycle and the beginning of the next. All input signals to the HEX-29 CPU from the system bus are captured on this edge. The next microinstruction is clocked into the pipeline register on this edge.



Figure 13. System Block Diagram.

Normally a system clock is a simple square wave or more complex waveform with a fixed period and duty cycle. But the system clock of the HEX-29 CPU is microprogrammed. In other words, the period and duty cycle are selected by microword bits in each microcycle. The advantage of this approach is one of through-put (speed).

In any CPU, some internal operations require longer to execute reliably than others. And one or more of these operations requires the maximum length of time to complete reliably. This is called the worst case delay path or "critical path". Normally the period of time required to perform this "critical path" operation is chosen as the clock period for all instructions.
Since the "critical path" operation may take \(30 \%\) to \(100 \%\) longer to execute than typical operations, it is clear that much processor time is being wasted in any typical program. Two microword bits are used to control the HEX-29 microprogrammed system clock so that each microcycle lasts only as long as necessary for the operation being performed. An overall speed gain of about \(30 \%\) to \(40 \%\) is realized with this technique. This was discussed in detail in Chapter II and Chapter III.
The master oscillator from which the system clock is derived is a 25 MHz crystal controlled oscillator. Phase 1 ( \(\phi_{1}\) ) of the system clock cycle (Figures 10 and 12) is programmed to be 2, 3, 4 or 5 times the 40 ns fundamental period of the oscillator. The duration of \(\phi_{2}\) of the system clock is 80 ns. Since main memory will rarely be as fast as 80 ns access time, a method to allow system memory cards to lengthen \(\phi_{2}\) is also provided with the \(\overline{\text { STR }}\) bus signal. When the \(\overline{\text { STR }}\) signal is LOW, the Am74S161 is disabled from counting and the state of the clock will not change until the signal is released and the counter counts out normally.
The conventions regarding the system clock are very simple and were chosen as the easiest to interface with a variety of memory and \(I / O\) devices.
All machine cycles begin when the system clock goes HIGH. The period of time that the clock remains at a HIGH logic level is called \(\phi_{1}\); \(\phi_{2}\) is the period that it is LOW. During all memory accesses (and I/O accesses, since I/O is memory mapped), the processor guarantees that all address lines and control bus signals ( \(R / \bar{W}\), VMA, WP, etc.) are valid and stable at least 20 ns before the end of \(\phi_{1}\). In other words, the CPU must make all bus signals valid at least 20 ns before \(\phi_{2}\) begins.
Depending upon the addressing mode being used, the processor will require more or less time to make all necessary signals to the system bus and memory cards valid.
For example, indexed addressing requires an arithmetic operation from the Am2901B's rather than logical operations or a direct pass, therefore indexed addressing is bound to take slightly longer than immediate, direct, or pointer addressing.

It is for these indexed operations and some others that \(\phi_{1}\) can be lengthened in 40 ns increments by microword bits \(\mathrm{ST}_{1}\) and \(\mathrm{ST}_{0}\). So the processor controls the system clock during \(\phi_{1}\) to meet its requirements. When there is no memory access, the minimum 80 ns for \(\phi_{1}\) is generally more than adequate. Simple addressing modes require \(80 \mathrm{~ns}-120 \mathrm{~ns}\). The most complex addressing modes can take 160 ns to 190 ns using the worst case specs for all IC's in the address generation path.
At the end of \(\phi_{1}\) (the beginning of \(\phi_{2}\) ), the processor relinquishes control of the system clock to the memory or I/O device that is being accessed. Since \(I / O\) is mapped into normal memory space, there is only one set of timing rules for both memory and \(\mathrm{I} / \mathrm{O}\) accesses. If no more than 80 ns is required to properly complete the read or write operation, then \(\phi_{2}\) will last only 80 ns. But the access time of most main memory cards will be greater than 80 ns, so a way of increasing the duration of \(\phi_{2}\) is provided with the STR bus signal.
If this signal (STR) is pulled LOW within the first 50 ns of \(\phi_{2}, \phi_{2}\) will be lengthened by 40 ns for every 40 ns that \(\overline{\text { STR }}\) is held LOW. Thus \(\phi_{2}\) can be extended indefinitely to match the access time of the device being addressed. Naturally this input should be driven by open collector outputs so that all cards can share the one STR line.
Though the STR signal is intended to be used during \(\phi_{2}\) on memory reference cycles, it works in an identical fashion during \(\phi_{1}\). This can be used to advantage by DMA controllers that require more than 60 ns to generate valid address, data, or control signals on transparent DMA cycles.
A jumper option on the microprogrammable system clock allows the default period of \(\phi_{2}\) to be increased from 80 ns to 120 ns on memory reference cycles only. This is useful in systems where no memory or I/O devices have access times of 80 ns or less, and/or when more than 50 ns is required to pull STR LOW to lengthen \(\phi_{2}\). Figure 16 is a table of the \(\phi_{1}\) and default \(\phi_{2}\) periods available with the microprogrammed clock on the HEX-29 CPU.
\begin{tabular}{|c|c|c|c|c|c|}
\hline ST1 & ST0 & \(\overline{\text { VMA }}\) & \(\phi_{1}\) Period & \begin{tabular}{c} 
Default \\
\(\phi_{2}\) Period
\end{tabular} & \begin{tabular}{c} 
Default \(\phi_{2}\) \\
Period with \(\overline{\text { VMA }}\) \\
Option Jumpered
\end{tabular} \\
\hline 1 & 1 & 1 & 80 ns & 80 ns & 80 ns \\
1 & 1 & 0 & 80 ns & 80 ns & 120 ns \\
1 & 0 & 1 & 120 ns & 80 ns & 80 ns \\
1 & 0 & 0 & 120 ns & 80 ns & 120 ns \\
0 & 1 & 1 & 160 ns & 80 ns & 80 ns \\
0 & 1 & 0 & 160 ns & 80 ns & 120 ns \\
0 & 0 & 1 & 200 ns & 80 ns & 80 ns \\
0 & 0 & 0 & 200 ns & 80 ns & 120 ns \\
\hline
\end{tabular}

Figure 16. Microprogrammed System Clock Timing.
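The entries of Figure 16 follow a regular pattern, which the short Python sketch below reproduces. It is an illustrative restatement of the table only, not the clock circuit itself; the function and parameter names are hypothetical.

```python
OSC_PERIOD_NS = 40  # period of the 25 MHz master oscillator


def clock_phases(st1, st0, vma, vma_jumper=False):
    """Return (phi1_ns, default_phi2_ns) for one microcycle.

    st1, st0   - microword clock-control bits (0 or 1)
    vma        - level of the VMA line (0 = memory reference cycle)
    vma_jumper - True when the VMA option jumper is installed
    """
    if st1 not in (0, 1) or st0 not in (0, 1) or vma not in (0, 1):
        raise ValueError("st1, st0 and vma must each be 0 or 1")
    code = (st1 << 1) | st0
    # Code 3 selects 2 oscillator periods, code 0 selects 5 periods.
    phi1 = (5 - code) * OSC_PERIOD_NS
    # The jumper lengthens phase 2 on memory reference cycles only.
    phi2 = 120 if (vma_jumper and vma == 0) else 80
    return phi1, phi2
```

For instance, `clock_phases(0, 1, 0, vma_jumper=True)` gives a 160 ns \(\phi_{1}\) and a 120 ns default \(\phi_{2}\), matching the sixth row of the table.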

\section*{Microprogram Control}

The microprogram control section (Figure 17) of the HEX-29 CPU performs several functions; they are:
1. System reset and initialization
2. Interrupt and halt control
3. Machine level instruction to microinstruction mapping
4. Microinstruction sequencing and microsubroutining
5. Invalid Access Memory Management Trap

When the system reset button or keyswitch is closed, the input to a one-shot is pulled LOW. When it is released, the rising edge triggers a \(500 \mu \mathrm{~s}\) pulse. This is synchronized with the system by gating it through a flip-flop driven by the system clock. The resulting signal is used to zero the outputs of the Am2909 microprogram sequencers. Thus, when the one-shot times out, the microprogram will begin execution at microaddress 000. The microcode needed to initialize the system is stored at this and the following several microaddresses and assures the proper system start-up.
Each time a machine level instruction is fetched, the microprogram control logic checks for a hardware interrupt or halt signal from the system bus. If either signal is active, the microprogram branches to the appropriate microinstruction address to execute the appropriate microcode to service the request. The interrupt routine will buffer user registers, switch to supervisor mode, and call a machine level routine through a vector table element as defined by the priority level of the interrupt. If the halt signal is pulled LOW, the external system bus is released to DMA devices or refresh controllers until the halt bus line is released and the program continues execution.

Figure 14a.

Figure 14b.

When an instruction has been fetched and there are no interrupts or halt signals pending, the microprogram must begin executing microinstructions at a new microaddress. This microaddress is a function of the machine instruction to be executed. The "mapping" of the machine level instruction into a microaddress is done courtesy of the Am27S29 instruction decode PROM's. The opcode is placed on the PROM address lines and the microaddress appears at the outputs which are connected to the direct inputs to the Am2909's. The Am2909's simply pass this microaddress to the microword memory by executing a Branch to Address on direct inputs function.

This, and all other microprogram sequencer operations, are selected by the outputs of the microprogram branch PROM which is driven by microword bits. This PROM, an Am27S21, contains the output combinations required to execute a variety of microprogram control functions including microbranching, microsubroutining, and two-way microbranching either unconditionally or upon condition code bits selected by microword bits. The function code for this PROM is shown in Figure 18.
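The function codes of Figure 18 have a regular structure for addresses 0 through F: the upper three bits select a condition and the lowest bit selects the value being tested. The Python sketch below decodes a code on that basis. It is an illustration only; the names are hypothetical, and it assumes that entry F tests IH = 1 and that the BRMAP entry sits at address 1E, two places where the printed table is hard to read.

```python
# Condition selected by bits 3..1 of the function code (addresses 0..F).
CONDITIONS = ["C", "V", "N", "Z", "H", "LZ", "HLT", "IH"]

# Remaining defined functions (addresses 10..1F); all others are unused.
OTHER_FUNCTIONS = {
    0x10: "BR",
    0x12: "CALL",
    0x14: "CALL N=0",
    0x16: "RTS Z=1",
    0x18: "RTS",
    0x1E: "BRMAP IH=0 or BR",  # assumed address, see lead-in
    0x1F: "CONTINUE",
}


def describe_branch_code(code):
    """Return the Figure 18 function for a 5-bit branch PROM address."""
    if not 0 <= code <= 0x1F:
        raise ValueError("branch code must be 0..1F hex")
    if code < 0x10:
        condition = CONDITIONS[code >> 1]
        value = code & 1
        return f"BR {condition}={value} or continue"
    return OTHER_FUNCTIONS.get(code, "Not used")
```

For example, `describe_branch_code(0x7)` returns `"BR Z=1 or continue"`: branch if the zero flag is set, otherwise continue with the next microinstruction.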
\begin{tabular}{|c|c|}
\hline Address & Function \\
\hline 0 & BR C \(=0\) or continue \\
\hline 1 & BR C \(=1\) or continue \\
\hline 2 & BR V \(=0\) or continue \\
\hline 3 & BR V \(=1\) or continue \\
\hline 4 & BR N \(=0\) or continue \\
\hline 5 & BR N \(=1\) or continue \\
\hline 6 & BR Z \(=0\) or continue \\
\hline 7 & BR Z \(=1\) or continue \\
\hline 8 & BR H \(=0\) or continue \\
\hline 9 & BR H \(=1\) or continue \\
\hline A & BR LZ \(=0\) or continue \\
\hline B & BR LZ \(=1\) or continue \\
\hline C & BR HLT \(=0\) or continue \\
\hline D & BR HLT \(=1\) or continue \\
\hline E & BR IH \(=0\) or continue \\
\hline F & BR IH \(=1\) or continue \\
\hline 10 & BR \\
\hline 11 & Not used \\
\hline 12 & CALL \\
\hline 13 & Not used \\
\hline 14 & CALL N \(=0\) \\
\hline 15 & Not used \\
\hline 16 & RTS Z \(=1\) \\
\hline 17 & Not used \\
\hline 18 & RTS \\
\hline 19 & Not used \\
\hline 1A & Not used \\
\hline 1B & Not used \\
\hline 1C & Not used \\
\hline 1D & Not used \\
\hline 1E & BRMAP IH \(=0\) or BR \\
\hline 1F & CONTINUE \\
\hline
\end{tabular}

Figure 18. Microprogram Sequencer Branch Code.

As part of the multi-user, multi-task time sharing capabilities, the HEX-29 CPU provides an invalid memory access trap. In this structure, the executive program can assign any unused page of user memory space as either non-existent (transparent) or as an invalid access area. If any user instruction attempts to access memory in a page that has been assigned as an invalid access page, the microprogram control logic takes action.
Before the current machine cycle completes, the next instruction address is forced to the highest value in the current 512-word microword block using the Am2909 OR inputs. At this point a microbranch to the invalid access trap microroutine is performed. The invalid access is processed just like another (highest) level of hardware vectored interrupt except that the current machine level instruction does not complete before the microprogram recognizes and acts upon the condition.
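Forcing the address to the top of the current 512-word block amounts to ORing ones into the nine low-order address bits. The Python sketch below shows the arithmetic; it assumes a 12-bit microaddress (three 4-bit Am2909 slices) and that the OR inputs are asserted on exactly the nine low-order bits, neither of which is spelled out in the text. The function name is hypothetical.

```python
BLOCK_SIZE = 512       # words per microword block
ADDRESS_MASK = 0xFFF   # assumed 12-bit microaddress


def force_trap_address(next_address):
    """Return the microaddress produced when the low-order bits are
    forced to ones, i.e. the last word of the current 512-word block."""
    return (next_address | (BLOCK_SIZE - 1)) & ADDRESS_MASK
```

Any next address in the first block (000 through 1FF hex) is forced to 1FF, any address in the second block to 3FF, and so on, so each block needs a trap entry point only in its last location.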

\section*{MICROWORD MEMORY}

Any number of memory device types could have been chosen for the microword memory in the HEX-29 CPU. RAM has the advantage that it is dynamically alterable, but had this feature been utilized, much more hardware support would have been necessary and the overall cost would have increased significantly. Besides, if the feature is desirable, the effect of writable control store can be simulated with fixed memory devices by microcode bank switching at much lower cost and complexity. For development of new microcode routines, RAM writable control store in the address space of another computer system offers many advantages. This is particularly true if the other computer happens to support a microassembler and file management system, as does the System 29.*
Though EROM's and EAROM's are also viable microword memory devices for microcode development, they are much too slow to make efficient use of the rest of the high speed microprogrammed processor in the production device.
Fuse-link bipolar PROM's are the only viable microword memory devices for production systems for a variety of reasons. They are very fast (45 ns maximum access on the HEX-29 CPU), small (\(512 \times 8\) in 20 pins), less expensive than fast RAM, and more flexible than a mask ROM would be. It is a simple matter to alter or extend the microprogram of commercial systems with fuse-link PROM microword memory.
As mentioned, the microword memory of the HEX-29 is composed of Am27S29 \(512 \times 8\) fuse-link PROM's and is shown in Figure 19. These space-efficient 20-pin parts have worst case access times of 45 ns over the commercial temperature and voltage range. Up to 4k of microword memory can be addressed by the set of three Am2909 microprogram sequencers on the CPU card. Space for up to 2k of microword memory PROM's is available on the HEX-29 CPU card. Though a perfectly adequate instruction set can be coded in fewer than 512 words of microword memory, the HEX-29 has a very extensive high level instruction set including 16- and 32-bit integer and 64-bit floating point ADD, SUB, MUL, DIV, CMP, and extensive buffering instructions. In addition to the extremely complete numeric processing package, numerous nibble, character, byte, and word macroinstructions are implemented with scans, linked and unlinked searches, block moves, etc. A stack processor is a subset of this more than complete instruction set. For all of the capabilities of the HEX-29 CPU, less than 1.5k of microword memory was required. Thus, more than 0.5k of space remains for future expansion by the user before a larger PC card is needed (extremely unlikely).
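
The sizing figures quoted above follow from simple arithmetic, sketched below. The sketch assumes the 64-bit microword width given later in this chapter and assumes that PROM's are added in full 512-word rows; the helper names are hypothetical.

```python
PROM_WORDS, PROM_BITS = 512, 8        # Am27S29 organization
MICROWORD_BITS = 64                   # pipeline width stated in the text
SEQUENCER_SLICES, BITS_PER_SLICE = 3, 4


def addressable_words() -> int:
    """Microwords reachable by the cascaded Am2909 sequencers."""
    return 1 << (SEQUENCER_SLICES * BITS_PER_SLICE)


def proms_needed(words: int) -> int:
    """PROM count for a microprogram of the given length.

    Assumes memory is populated in whole 512-word rows, each row being
    64 / 8 = 8 PROM's wide.
    """
    rows = -(-words // PROM_WORDS)    # ceiling division
    return rows * (MICROWORD_BITS // PROM_BITS)
```

On these assumptions the 2k of on-card space corresponds to 32 PROM packages, and a microprogram of just under 1.5k occupies 24 of them.
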

Connections for the microword data, address and select lines are available at connectors at the top of the HEX-29 CPU card. Thus, it is quite straightforward to support off-board microword memory.

\footnotetext{
*System 29 is a development system for microprogrammed systems available from Advanced Micro Computers.
}


It is even perfectly reasonable to use an off-board writable control store with up to \(2 k\) of microword RAM concurrently with up to \(2 k\) of PROM resident on the PC card.

If the on board PROM contains an instruction set, it is then a simple matter to use the off board writable control store to develop new microcode for the machine interactively on the one HEX-29 system!
The outputs of the microword memory devices are attached to the inputs of Am74S374 registers. These registers are called the pipeline registers since they allow the fetching of the next microinstruction concurrently with execution of the current one. Clocking of the pipeline registers occurs on the LOW-to-HIGH transition of the system clock. The outputs of the pipeline registers are the 64 microword (or pipeline) bits that control every aspect of the processor.
These 64 bits can be logically grouped into several functional fields as follows:
1. Microword Data/Microbranch Address and Control
2. ALU Source Select
3. ALU Destination Select
4. ALU Function Select
5. ALU Carry In Select
6. Shift Linkage Select
7. ALU A and B Specifications
8. A and B Fields Select
9. Enable onto ALU inputs Select
10. Latch External Data Inputs
11. Latch CPU Outputs
12. Control Bus Signals
13. Microprogrammed Clock
14. Condition Code Controls
15. Enable Interrupt Circuitry
16. Memory Map Control

Notice that with the exception of the microword data and microbranch address and control fields, no other fields are overlapped. This is a 'horizontally' structured design. Overlapping several fields leads to 'vertically' structured systems. This latter class of machines can save some microword memory, but only at the expense of through-put and increased hardware complexity. Now that the cost of the PROM's has come down significantly, the savings accrued from using a vertically structured design approach is generally insignificant when compared with the overall system cost.
A summary of the functions of the microword bits is shown in Figure 20.
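
Because the fields do not overlap, decoding a horizontal microword is just a matter of slicing bit ranges. The sketch below uses the bit positions given in the captions of Figures 20B through 20H. It assumes that bit 0 is the least significant bit of a 64-bit integer; the field names and helper functions are hypothetical, and fields whose positions are not given in the captions are omitted.

```python
# (low bit, high bit) per field, taken from the Figure 20 captions.
FIELDS = {
    "branch_control": (12, 16),    # Figure 20B
    "shift_rotate": (18, 19),      # Figure 20G
    "alu_src_cin_fn": (20, 27),    # Figure 20D
    "alu_destination": (28, 30),   # Figure 20E
    "ab_field_select": (40, 41),   # Figure 20F
    "clock_stretch": (42, 43),     # Figure 20H
    "condition_code": (45, 47),    # Figure 20C
}


def extract_field(microword: int, name: str) -> int:
    """Return the value of one named field of a 64-bit microword."""
    low, high = FIELDS[name]
    width = high - low + 1
    return (microword >> low) & ((1 << width) - 1)


def decode(microword: int) -> dict:
    """Decode every known field; each is independent of the others."""
    return {name: extract_field(microword, name) for name in FIELDS}
```

In a vertically structured design the same bits would carry different meanings depending on an opcode field, so a decoder of this kind would need an extra level of interpretation.
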

\section*{Am2901B ALU/REGISTER SETS}

The heart of the HEX-29 CPU is the set of four Am2901B bit slice ALU/Register Sets depicted in Figure 21. All arithmetic and logical operations are performed in these bipolar LSI IC's, including address generation. The user accessible set of 16 registers and routing functions are also internal to these remarkable and extremely versatile chips.

The operation of these units, though very elegant and comprehensible, is too lengthy to include here and the user is referred to the Am2900 Family Data Book by AMD.
Carry lookahead is accomplished by the Am2901B's and an external IC, the Am2902A. Shift control is partially within the Am2901B's and is supported by other external circuitry to be discussed later.

A summary sheet of the Am2901B ALU functions appears on page 29 but should be supplemented by studying the AMD literature already mentioned. A good supplement is the AMD Schottky and Low Power Schottky Handbook.
The A and B input fields to the Am2901B's are multiplexed by 4 Am74S253's in the following four ways.
\begin{tabular}{ll} 
Am2901B B Inputs & Am2901B A Inputs \\
\(\mu\) word Memory & \(\mu\) word Memory \\
Upper Nibble ABL & Upper Nibble ABL \\
Lower Nibble ABL & Lower Nibble ABL \\
Upper Nibble ABL & Lower Nibble ABL
\end{tabular}
\(A B L=A, B\) Latch (On data bus bits 27-20.)

\section*{CARRY IN CONTROL}

The arithmetic carry-in \(\left(\mathrm{C}_{\mathrm{N}}\right)\) signal (Figure 22) to the Am2901B bit slices can be selected from four sources as follows:
1. Logic 0 (No carry-in add instruction, borrow in subtract instruction.)
2. Logic 1 (Carry-in in add instruction, no borrow in subtract.)
3. Carry Flag (C bit in condition code register.)
4. Q Shift Bit (Double length shifts.)

Note that the natural state of the Carry Flag output from the Am2901B is 1 for carry on add, 0 for no carry on add, 1 for no borrow on subtract, and 0 for borrow on subtract. This convention has been maintained in the condition code and carry-in logic. Some machines operate differently with respect to this convention and others do not; the HEX-29 maintains the faster, natural convention for lack of a good reason to alter it. Some programmers will need to remember this convention, while others will already be used to it.
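
The carry convention can be illustrated with a small model of a 16-bit adder. This is a sketch, not a description of the Am2901B internals: it assumes subtraction is performed as the minuend plus the one's complement of the subtrahend plus the carry-in, which yields the flag polarity described above. The function names are hypothetical.

```python
MASK = 0xFFFF  # 16-bit data path (four 4-bit slices)


def alu_add(a: int, b: int, carry_in: int):
    """Return (sum, carry_out) for a + b + carry_in."""
    total = (a & MASK) + (b & MASK) + carry_in
    return total & MASK, total >> 16


def alu_sub(a: int, b: int, carry_in: int):
    """Return (difference, carry_out) for a - b computed as a + ~b + carry_in.

    carry_in = 1 means no borrow in; carry_out = 1 means no borrow out.
    """
    total = (a & MASK) + (~b & MASK) + carry_in
    return total & MASK, total >> 16


# 32-bit subtract 0x00010000 - 0x00000001 in two 16-bit steps:
low, carry = alu_sub(0x0000, 0x0001, 1)    # low words, no borrow in
high, _ = alu_sub(0x0001, 0x0000, carry)   # high words, Carry Flag as carry-in
```

Feeding the Carry Flag straight back as the carry-in of the next step is what makes multiple precision arithmetic work without inverting the flag between steps.
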

\section*{SHIFT AND ROTATE LINKAGE}

The shift and rotate linkage (Figure 23a) of the HEX-29 is composed of an Am74S253 and an Am74LS125 plus the internal shift control structure of the Am2901B's. The functions that can be performed by this circuitry are shown in Figure 23b.
The solid lines in Figure 23b delineate the basic shift linkages. The dotted lines are optional linkages which can also be enabled. With these linkages, all of the normal shifts and rotates can be performed plus a number of double word shifts including special shifts for high speed multiplies and divides.

\section*{CONDITION CODE CONTROL}

The condition code register of the HEX-29, shown in Figure 24, has eight flags. The definitions and placement of these flags are given in Figure 25.
In addition to the very useful and fairly common C, V, N, and Z flags, a half sign flag (H) is provided for easier byte processing. The three user flags are not changed by any of the normal arithmetic or logical operations. However, they can be read and written by the processor with special instructions such as load flags, read flags, set bits in flags, clear bits in flags, and invert bits in flags. The fact that none of the user flags is changed by any but this type of special routine is very significant. It means that various routines and program segments can pass flags back and forth freely without fear of modification or restriction on the instructions that can be executed. Reading the condition code flags into the processor, or branching or subroutining upon combinations of bits set or clear, does not alter the flags.


Figure 20A. Microword Bits.


Figure 20B. Am2909 Microprogram Branch Control, Bits 12-16.
\begin{tabular}{|l|c|c|c|l|}
\hline & LCC & LCU & LNZ & \\
\hline LCN & 0 & 0 & 0 & New CVNZH \\
\hline LC & 0 & 0 & 1 & New CV Old NZH \\
\hline LN & 0 & 1 & 0 & Old CV New NZH \\
\hline (Nom.) & 0 & 1 & 1 & Old CVNZH \\
\hline BCC & 1 & 0 & 0 & Bus \(\rightarrow\) CVNZHU \\
\hline & 1 & 0 & 1 & Bus \(\rightarrow\) CV Old NZH \\
\hline & 1 & 1 & 0 & Shift C Old V Bus \(\rightarrow\) NZHU \\
\hline LSC & 1 & 1 & 1 & Shift C Old V Old NZH \\
\hline
\end{tabular}

Figure 20C. Condition Code Manipulation, Bits 45-47.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline ALU & CIB & CIA & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\hline \multirow{3}{*}{0} & 0 & 0 & \(A+Q\) & \(A+B\) & Q & B & A & \(D+A\) & \(D+Q\) & D \\
\hline & 0 & 1 & \(A+Q+1\) & \(A+B+1\) & \(Q+1\) & \(B+1\) & \(A+1\) & \(D+A+1\) & \(D+Q+1\) & \(D+1\) \\
\hline & 1 & 0 & \(A+Q+C\) & \(A+B+C\) & \(Q+C\) & \(B+C\) & \(A+C\) & \(D+A+C\) & \(D+Q+C\) & \(D+C\) \\
\hline \multirow{3}{*}{1} & 0 & 0 & \(Q-A-1\) & \(B-A-1\) & Q-1 & B-1 & A - 1 & \(A-D-1\) & \(Q-D-1\) & -D-1 \\
\hline & 0 & 1 & Q - A & B - A & Q & B & A & A - D & Q - D & - D \\
\hline & 1 & 0 & \(Q-A-C\) & \(B-A-C\) & Q - C & B - C & A - C & A-D-C & Q - D - C & -D-C \\
\hline \multirow{3}{*}{2} & 0 & 0 & \(A-Q-1\) & \(A-B-1\) & \(-Q-1\) & \(-\mathrm{B}-1\) & \(-A-1\) & \(D-A-1\) & \(D-Q-1\) & D-1 \\
\hline & 0 & 1 & A-Q & A-B & -Q & - B & - A & D - A & D - Q & D \\
\hline & 1 & 0 & \(A-Q-C\) & \(A-B-C\) & \(-Q-C\) & \(-\mathrm{B}-\mathrm{C}\) & \(-A-C\) & \(D-A-C\) & \(D-Q-C\) & D - C \\
\hline 3 & - & - & \(A \vee Q\) & \(A \vee B\) & Q & B & A & \(D \vee A\) & \(D \vee Q\) & D \\
\hline 4 & - & - & \(A \wedge Q\) & \(A \wedge B\) & 0 & 0 & 0 & \(D \wedge A\) & \(D \wedge Q\) & 0 \\
\hline 5 & - & - & \(\bar{A} \wedge Q\) & \(\bar{A} \wedge B\) & Q & B & A & \(\bar{D} \wedge A\) & \(\bar{D} \wedge Q\) & 0 \\
\hline 6 & - & - & \(A \forall Q\) & \(A \forall B\) & Q & B & A & \(D \forall A\) & \(D \forall Q\) & D \\
\hline 7 & - & - & \(\overline{A \forall Q}\) & \(\overline{A \forall B}\) & \(\bar{Q}\) & \(\bar{B}\) & \(\bar{A}\) & \(\overline{D \forall A}\) & \(\overline{D \forall Q}\) & \(\bar{D}\) \\
\hline
\end{tabular}

Figure 20D. Am2901 Source, Carry-in \& Function Select, Bits 20-27.
\begin{tabular}{|c|l|l|}
\hline DST & \multicolumn{2}{|c|}{Rotates} \\
\hline 0 & & \(\mathrm{F} \rightarrow \mathrm{Q}\) \\
\hline 1 & & NONE \\
\hline 2 & & \(\mathrm{F} \rightarrow \mathrm{B}\), \(\mathrm{A} \rightarrow \mathrm{Y}\) \\
\hline 3 & & \(\mathrm{F} \rightarrow \mathrm{B}\) \\
\hline 4 & RIGHT & \(\mathrm{F}/2 \rightarrow \mathrm{B}\), \(\mathrm{Q}/2 \rightarrow \mathrm{Q}\) \\
\hline 5 & RIGHT & \(\mathrm{F}/2 \rightarrow \mathrm{B}\) \\
\hline 6 & LEFT & \(2\mathrm{F} \rightarrow \mathrm{B}\), \(2\mathrm{Q} \rightarrow \mathrm{Q}\) \\
\hline 7 & LEFT & \(2\mathrm{F} \rightarrow \mathrm{B}\) \\
\hline
\end{tabular}

Figure 20E. Am2901 Destination Codes, Bits 28-30.
\begin{tabular}{|l|l|l|}
\hline & Right & Left \\
\hline 0 & MUL & RCL \\
\hline 1 & ROR & ROL \\
\hline 2 & ASR & DRL \\
\hline 3 & LSR & LSL \\
\hline
\end{tabular}

Figure 20G. Shift \& Rotate Control, Bits 18-19.
\begin{tabular}{|c|c|c|}
\hline ABMUX & A reg & B reg \\
\hline 0 & \(\mu \mathrm{~W}_{\mathrm{A}}\) & \(\mu \mathrm{W}_{\mathrm{B}}\) \\
\hline 1 & \(\mathrm{R}_{\mathrm{S}}\) & \(\mathrm{R}_{\mathrm{D}}\) \\
\hline 2 & \(\mathrm{R}_{\mathrm{S}}\) & \(\mathrm{R}_{\mathrm{S}}\) \\
\hline 3 & \(\mathrm{R}_{\mathrm{D}}\) & \(\mathrm{R}_{\mathrm{D}}\) \\
\hline
\end{tabular}

Figure 20F. Am2901 A, B Field Selects, Bits 40-41.
\begin{tabular}{|c|c|}
\hline STR & CLOCK \\
\hline 0 & 280 ns \\
\hline 1 & 240 ns \\
\hline 2 & 200 ns \\
\hline 3 & 160 ns \\
\hline
\end{tabular}

Figure 20H. Microprogrammed System Clock Stretch, Bits 42-43.

Figure 21.






Figure 25.

Eight condition code operations provide all the useful operations needed for complete flexibility. They are shown in Figures 26A and 26B in two different formats. Note that the codes are grouped into three categories: arithmetic (C and V), logical/arithmetic (N, Z, H), and user \(\left(\mathrm{U}_{2}, \mathrm{U}_{1}, \mathrm{U}_{0}\right)\).
These eight operations include all the necessary and desirable features, such as updating only the shift carry bit and the ability to do operations that read, operate on, and reload the condition code register all in one machine cycle (160 ns). Also, a feature of immense importance where microcoded floating point or fixed point math is concerned is the ability to update flags on a cycle by cycle basis, which is an unusual feature.
\begin{tabular}{|r|l|l|l|}
\hline & Carry/Overflow C, V & Negative/Zero/Half N, Z, H & User Flags U2, U1, U0 \\
\hline 7 & Shift Bit into C, V No Change & No Change & No Change \\
*6 & Shift Bit into C, V No Change & Load From Bus & Load From Bus \\
*5 & Load From Bus & No Change & No Change \\
4 & Load From Bus & Load From Bus & Load From Bus \\
3 & No Change & No Change & No Change \\
2 & No Change & Update & No Change \\
1 & Update & No Change & No Change \\
0 & Update & Update & No Change \\
\hline
\end{tabular}
*Less useful than other codes but perfectly legal.
Figure 26A.
\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline Name & U2 & U1 & U0 & H & Z & N & V & C \\
\hline Shift MSb or LSb into C & NC & NC & NC & NC & NC & NC & NC & S \\
*Shift into C, Bus Load Rest & B & B & B & B & B & B & NC & S \\
*Bus Load C \& V Flags & NC & NC & NC & NC & NC & NC & B & B \\
Bus Load All Flags & B & B & B & B & B & B & B & B \\
No Changes & NC & NC & NC & NC & NC & NC & NC & NC \\
Update N, Z, H Flags & NC & NC & NC & \(\mu\) & \(\mu\) & \(\mu\) & NC & NC \\
Update C and V Flags & NC & NC & NC & NC & NC & NC & \(\mu\) & \(\mu\) \\
Update C, V, N, Z, H & NC & NC & NC & \(\mu\) & \(\mu\) & \(\mu\) & \(\mu\) & \(\mu\) \\
\hline
\end{tabular}

\footnotetext{
\(\mu=\) updated, \(\mathrm{NC}=\) unchanged, \(\mathrm{B}=\) loaded from internal bus,
\(S=\) Shift Bit
}

Figure 26B.
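
The eight operations of Figures 26A and 26B can be modeled as a table of masks. The sketch below is illustrative only. It assumes the flags occupy one byte in the column order of Figure 26B (U2 in bit 7 down to C in bit 0), which may not match the placement defined in Figure 25, and it takes the freshly computed ALU flags, the bus byte, and the shift bit as plain arguments. All names are hypothetical.

```python
C, V, N, Z, H = 0x01, 0x02, 0x04, 0x08, 0x10   # assumed bit positions
USER = 0xE0                                     # U2, U1, U0
ARITH = C | V
LOGICAL = N | Z | H

# code: (bits updated from the ALU, bits loaded from the bus, shift bit to C?)
OPS = {
    0: (ARITH | LOGICAL, 0x00, False),   # update C, V, N, Z, H
    1: (ARITH, 0x00, False),             # update C and V
    2: (LOGICAL, 0x00, False),           # update N, Z, H
    3: (0x00, 0x00, False),              # no change
    4: (0x00, 0xFF, False),              # bus load all flags
    5: (0x00, ARITH, False),             # bus load C and V
    6: (0x00, LOGICAL | USER, True),     # shift into C, bus load rest, V kept
    7: (0x00, 0x00, True),               # shift into C only
}


def next_flags(code, old, alu=0, bus=0, shift_bit=0):
    """Return the new flag byte after one condition code operation."""
    alu_mask, bus_mask, shift = OPS[code]
    keep = 0xFF & ~(alu_mask | bus_mask)
    new = (old & keep) | (alu & alu_mask) | (bus & bus_mask)
    if shift:
        new = (new & ~C & 0xFF) | (shift_bit & 1)
    return new
```

Note that no entry in the table updates the user flags from the ALU, which is the property that lets routines pass them around untouched.
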

\section*{Am2901B OUTPUT BUS}

Being a highly structured, modular device, the HEX-29 CPU is very bus oriented. The output bus of the Am2901B's carries the addresses and data to the rest of the system devices as well as to some internal functions. The four logical units on this bus (shown in Figure 27) are:
1. Address Out Latches - (System address bus)
2. Data Out Latches - (System data bus)
3. Memory Map/Latches - (Memory management features)
4. Condition Code MUX - (For updating flags from processor)

Any memory reference requires that an address be valid on the system address bus. The source of this address is generally one of the Am2901B internal registers or modifications thereof from previous fetch cycles (such as indexed addressing).
On a write cycle, data must be placed on the system data bus. This is accomplished in the same manner as address generation except that a different microword bit is used to activate the data latches.
In a multi-user/multi-task/timesharing environment, it is desirable to have a powerful memory management scheme. The HEX-29 CPU implements this via a flexible memory mapping system where the upper four bits of the 16-bit address generated by the Am2901B's are 'mapped' into seven address bits and a write protect bit. Invalid access traps and one Megabyte address space are integral features of this system. The loading of this MAP RAM (2 Am29701's) is also accomplished via the Am2901B output bus.
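
A minimal sketch of the mapping step follows. It assumes the MAP RAM is a 16-entry table of bytes, with the write protect bit in bit 7 and the seven mapped address bits in bits 6 to 0; that bit layout is an assumption, as are the function and variable names. The invalid access trap is not modeled.

```python
def translate(logical: int, map_ram: list):
    """Map a 16-bit logical address to (19-bit physical address, write_protect).

    The upper four logical bits select one of 16 map entries; the lower
    twelve bits pass through unchanged as the offset within the page.
    """
    entry = map_ram[(logical >> 12) & 0xF]
    write_protect = bool(entry & 0x80)          # assumed position of WP
    physical = ((entry & 0x7F) << 12) | (logical & 0x0FFF)
    return physical, write_protect


# Example: logical page A mapped to the top physical page, write protected.
map_ram = [0x00] * 16
map_ram[0xA] = 0x80 | 0x7F
```

Seven mapped bits plus twelve offset bits give a 19-bit word address, that is 512k 16-bit words, which is the one Megabyte address space mentioned above.
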
Another important characteristic of the HEX-29 CPU is its ability to read, write, test and operate upon the eight condition code flags in the byte form. All eight flags can be written to by the Am2901B's at one time, in one microcycle. This is very useful for many flag operations and is absolutely necessary for efficient updating of the user flags for interroutine parameter and condition passing.
The logic of these bussed systems is quite simple. A separate microword bit or bit field is used to cause each of these logical units on the bus to accept the data on the bus. Therefore, simple microprogramming techniques are applicable to this bussing approach.

\section*{Am2901B INPUT BUS}

Much of the power and modularity of the HEX-29 design is due to the highly structured bus approach on the Am2901B Data Inputs. The logical units that can drive this bus (Figure 28) are listed below:
1. Data Input Registers
2. Swap Input Registers
3. Microword Data Registers
4. Clear Upper Byte/Clear Lower Byte/Bit Op Logic
5. Condition Code Register

Data input from the system bus is captured in the data input registers and the swap input registers. The data input registers bring the upper and lower bytes of the data bus to the corresponding bytes in the Am2901B cascade while the swap input registers switch the upper byte of the data bus to the lower byte on the Am2901B cascade and the lower byte of the data bus to the upper byte of the Am2901B cascade.
Additionally, logic to set all bits in the upper or lower byte to zeros (clear upper byte and clear lower byte) allows selecting arithmetic or logical zeros in either byte field. If the bit set option is enabled, all bits are pulled low except the one selected by the hexadecimal value in the low nibble of the nibble latch, which is loaded from an instruction or other data source.
All eight condition code bits can be enabled onto the low byte if desired. All flags can thus be sampled by the Am2901B's at once.

Figure 27.


Data from microword memory, held in three-state registers in parallel with the pipeline register, can be enabled onto the upper and lower bytes for direct loading of the Am2901B's from microprograms.
In the absence of any device being enabled onto a particular byte on this bus, it will be pulled up into a logic 1 state. This can be useful for masking in logical operations and filling or biasing in arithmetic operations.
An important factor in the flexibility of this approach on the HEX-29 is that the upper and lower bytes of the data input registers, swap input registers, and the clear upper/lower byte logic are separately enabled. Also, the condition code register only drives the lower byte and the pull-up feature will operate on either byte individually. Thus the upper and lower bytes can be individually driven on a 'mix and match' basis from several sources.
The versatility so generated allows numerous fast processing modes. See Table 5 for a list of all of the possible combinations of high and low byte inputs to the Am2901B's.

TABLE 5.
\begin{tabular}{|l|l|}
\hline Into Upper Byte of Am2901's & Into Lower Byte of Am2901's \\
\hline 0. Microword memory bits P15-P8 & Microword memory bits P7-P0 \\
1. Bit set value (upper byte) & Bit set value (lower byte) \\
2. Upper byte - data bus & \\
3. Upper byte - data bus & \\
4. Upper byte - data bus & \\
5. Upper byte - data bus & \\
6. Upper byte - data bus & Upper byte - data bus \\
7. Lower byte - data bus & Clear lower byte \\
8. Lower byte - data bus & All high generator \\
9. Lower byte - data bus & Lower byte - data bus \\
10. Lower byte - data bus & Upper byte - data bus \\
11. Lower byte - data bus & Clear lower byte \\
12. All high generator & All high generator \\
13. All high generator & Condition code register \\
14. All high generator & Lower byte - data bus \\
15. All high generator & Clear byte - data bus \\
16. All high generator & All high generator \\
17. Clear upper byte & Condition code register \\
18. Clear upper byte & Lower byte - data bus \\
19. Clear upper byte & Upper byte - data bus \\
20. Clear upper byte & All high generator \\
*21. Clear upper byte & Condition code register \\
\hline
\end{tabular}
*Note: Interestingly enough this is the only case in the entire table that the hardware CANNOT generate on the bus, but IS the ONLY one of these codes that CAN be generated by the AMD 2901B slices! (How convenient!)

Examples of uses for some of these modes include:
1. Clearing upper byte for 8-bit index offset
2. Fast bit set/clear/test/invert operations
3. Set upper byte high to AND lower byte without upper byte change
4. Clear upper byte to AND off upper byte and operate lower
5. Upper byte of data bus to lower byte for all byte ops on upper byte
6. Load defined values from microcode for tamper-proof constants, vectors, etc.
7. Normal data input or address input without swap or modification
8. Clear upper byte with data in low byte for immediate ops, etc.
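
The 'mix and match' behavior of the input bus described above can be sketched by choosing one source per byte. This is a simplified model under stated assumptions: source names such as "data", "swap", and "bitset" are hypothetical labels, the bit set logic is assumed to produce a 16-bit one-hot word split across the two bytes, and an undriven byte is modeled as all ones because of the pull-ups.

```python
def input_bus(upper=None, lower=None, *, data=0, microword=0, cc=0, bit=0):
    """Return the 16-bit value seen on the Am2901B data inputs.

    upper / lower name the source enabled onto each byte; None means no
    source is enabled, so the pull-ups leave that byte at 0xFF.
    """
    one_hot = 1 << (bit & 0xF)                 # bit set: one bit HIGH, rest LOW
    upper_sources = {
        None: 0xFF,
        "data": (data >> 8) & 0xFF,            # data input register
        "swap": data & 0xFF,                   # swap input register
        "microword": (microword >> 8) & 0xFF,  # microword bits P15-P8
        "clear": 0x00,
        "bitset": (one_hot >> 8) & 0xFF,
    }
    lower_sources = {
        None: 0xFF,
        "data": data & 0xFF,
        "swap": (data >> 8) & 0xFF,
        "microword": microword & 0xFF,         # microword bits P7-P0
        "clear": 0x00,
        "bitset": one_hot & 0xFF,
        "cc": cc & 0xFF,                       # condition code register, low byte only
    }
    return (upper_sources[upper] << 8) | lower_sources[lower]
```

For instance, `input_bus("clear", "data", data=0x1234)` models example 1 in the list above (an 8-bit offset with the upper byte cleared), and `input_bus("clear", "swap", data=0x1234)` models example 5.
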

\section*{INTERRUPT CONTROL}

The powerful maskable priority vectored interrupt system (Figure 29) of the HEX-29 is a direct derivative of the incredible Am2914 bipolar LSI interrupt control IC. This circuit is so well integrated that it uses only one microword bit and requires very little support circuitry. The general set of operations that can be executed by the Am2914 is shown below. For more detailed information on this chip see the Am2900 Family Data Book.

F. Enable Request
E. Load Mask Register
D. Disable Request
C. Clear Mask Register
B. Bit Set Mask Register
A. Bit Clear Mask Register
9. Load Status Register
8. Set Mask Register
7. Read Mask Register
6. Read Status Register
5. Read Vector
4. Clear Interrupts, Last Vector Read
3. Clear Interrupts via M Register
2. Clear Interrupts via M Bus
1. Clear All Interrupts
0. Master Clear

Flow charts of the actions taken in microcode by the HEX-29 CPU are shown in Figure 30 and Figure 31.

\section*{DMA CONTROL}

The DMA structure is quite straightforward. There are eight active-LOW DMA request lines and eight corresponding DMA acknowledge lines. The highest priority device requesting a DMA cycle at the beginning of the microcycle before DMA will be allowed gets an acknowledge signal that lasts at least until the DMA cycle.
If no devices are requesting DMA, the \(\overline{N R Q}\) (no request) bus signal goes LOW. This is an excellent opportunity for dynamic RAM circuitry to refresh sequential rows on each DMA cycle that \(\overline{\mathrm{NRQ}}\) is LOW.
Another input signal, \(\overline{\text { DDMA }}\), will override all priorities and not acknowledge any level of DMA request. This could be used by dynamic RAM refresh circuitry when it must be permitted to refresh itself soon or chance losing data.
Many schemes of DMA handling can be accomplished with this simple and uncomplicated priority controlled system. An Am74S374 captures the DMA requests (Figure 32) on a cycle by cycle basis. An Am2913 prioritizes these requests and acknowledges the highest level request with a three-bit binary code. An Am74S138 expands this to the eight bits of DMA acknowledge that correspond to the eight input bits. The Am2913 supplies the \(\overline{\mathrm{NRQ}}\) bus signal and provides for the \(\overline{\mathrm{DDMA}}\) bus signal.
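
The request, prioritize, and expand sequence can be modeled as below. The sketch assumes that request line 7 has the highest priority and that the acknowledge outputs are active-LOW, neither of which is stated in the text; the function names are hypothetical.

```python
def dma_arbitrate(requests_n: int, ddma_n: int = 1):
    """Return (acknowledged level or None, NRQ-bar level).

    requests_n is the latched 8-bit request word; a LOW bit is a request.
    ddma_n LOW suppresses every acknowledge regardless of priority.
    """
    pending = ~requests_n & 0xFF
    nrq_n = 0 if pending == 0 else 1       # NRQ-bar goes LOW when nothing is pending
    if pending == 0 or ddma_n == 0:
        return None, nrq_n
    level = pending.bit_length() - 1       # assumed: highest numbered line wins
    return level, nrq_n


def acknowledge_lines(level) -> int:
    """Expand a 3-bit level into eight acknowledge lines (assumed active-LOW)."""
    return 0xFF if level is None else 0xFF & ~(1 << level)
```

With requests pending on lines 2 and 4, this model acknowledges line 4 only; line 2 is served on a later cycle once line 4 drops its request.
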

Figure 29.


Figure 30.


Figure 31.

Figure 32.

\section*{SYSTEM BUS INTERFACE EXAMPLE: HEX-64KBS STATIC MEMORY CARD}

It was possible to design the system bus to be very simple to work with because the HEX-29 is a microprogrammed device. The following section discusses an implementation of a 64 k byte static memory card for the HEX-29 system bus using Am9124 memory ICs. The purpose is to show that designing cards that interface with the HEX-29 system bus is relatively easy. Note that a design for I/O devices would be similar to this implementation since I/O devices are memory mapped and share exactly the same set of bus signals and timing requirements.

Starting from the left hand of the schematic shown in Figure 33, we find that the low 13 bits of the address bus and the four control bus signals (CLK, \(\overline{V M A}, R / \bar{W}\), and WP) are buffered from the system bus by two Am74S240 ICs and three sections of an Am74S244. These are inverting and non-inverting buffers respectively, and offer extremely high current drive (64 mA sink current) and very high speed (\(\sim 4\) to 6 ns) with only very light bus loading (\(400 \mu \mathrm{A}\) low level).

Ten of the address lines buffered by these ICs then drive the address lines of half of the memory array through series type termination resistors. These resistors (\(\sim 33\) ohms) serve to prevent undershooting zero volts by more than the permissible 0.5 V on negative edge transitions of the address lines. This type of termination has the advantage that it does not draw current from the driver ICs; it is highly recommended over split termination for memory arrays where current loading is negligible, but capacitive loading is significant. Note that to further reduce these capacitive loading effects, the address lines of only half of the memory array are driven by one set of buffers. (Find the second set of Am74S240 address buffers at the far right of the schematic.)

The remaining 3 address lines that were buffered by the Am74S240s drive the A, B, and C inputs of four Am74S138 one-of-eight decoders. These ICs develop the thirty-two 1k word chip selects that enable the appropriate Am9124 memory ICs for read and write operations when they are addressed.

Of course only one of the Am74S138 ICs should be enabled when the board is addressed. This is a function of the higher address lines, A18-A13. Since each Am74S138 is able to select eight 1k word blocks of memory, each Am74S138 should be addressable on 8k word boundaries. Decoding the upper address lines (A18-A13) to match selectable 8k boundary addresses is accomplished with four Am25LS2521 8-bit equal-to comparators, one for each Am74S138.

The DIP switches on the right hand side of each Am25LS2521 define the conditions under which the corresponding Am74S138 will be selected. When the eight inputs on the left hand side of these chips correspond to the values set on the DIP switch on the right, the Am74S138 is enabled. Note that the \(\overline{\mathrm{VMA}}\) bus signal (Valid Memory Access) must be LOW to enable the Am25LS2521. Also note that each 8k word bank can be unconditionally removed from the system memory space by leaving the lowest DIP switch open. Thus the board may be filled in 8k word increments if desired.

The Am74S138 ICs are also enabled by the system clock via the CLK signal. Therefore, memory chip selects can only occur during the time that the system clock is LOW (called \(\phi_{2}\)). The importance of this will be discussed shortly. Another signal that must be valid for these ICs to be enabled is the DIS signal. Whenever the \(R / \bar{W}\) signal is LOW (indicating a write) and WP (write protect) is HIGH (protect the memory), the DIS signal is brought LOW. This disables the Am74S138's and blocks the selecting of any memory ICs, thereby write protecting all on-board memory.
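
The chip select logic described in the last few paragraphs can be summarized in one function. This is a behavioral sketch, not a gate-level model: it assumes a 19-bit word address, represents each bank's DIP switch setting as a 6-bit value for A18-A13 (or `None` for a bank removed from the memory space), and uses hypothetical names throughout.

```python
def chip_select(addr, bank_bases, *, vma_n=0, clk=0, rw=1, wp=0):
    """Return (bank, block, offset) for the selected 1k block, or None.

    bank_bases holds four DIP switch settings, one per Am74S138.
    vma_n and clk must both be LOW; a write (rw LOW) with wp HIGH
    asserts DIS and blocks every chip select.
    """
    if vma_n != 0 or clk != 0:
        return None
    if rw == 0 and wp == 1:                # DIS brought LOW: write protected
        return None
    upper = (addr >> 13) & 0x3F            # A18-A13 go to the comparators
    for bank, base in enumerate(bank_bases):
        if base is not None and base == upper:
            block = (addr >> 10) & 0x7     # A12-A10 go to the one-of-eight decoder
            return bank, block, addr & 0x3FF   # A9-A0 go to the memory ICs
    return None


bases = [0, 1, 2, None]                    # fourth bank left unpopulated
```

Four banks of eight blocks each account for the 32 chip selects, and leaving a bank's entry as `None` corresponds to leaving its lowest DIP switch open.
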
Above the memory array on the schematic are the data bus buffers, one set for each half. Again, this is done to reduce capacitive loading, this time on the data lines. Am74S373 octal tri-state latches are used for all eight of these data buffers. The enable inputs are driven by the inversion of the system clock bus signal so that they are transparent during all of \(\phi_{2}\), which is when the data is transferred. The appropriate Am74S373 latches are turned on (\(\overline{\mathrm{OE}}\) LOW) during read and write cycles so that the data is buffered in the proper direction.
The Am26S02 one-shot is used to stretch \(\phi_{2}\) of the system clock to meet the access time of the memory. Without this signal, \(\phi_{2}\) would last only 80 ns and the access time specifications of the Am9124 memory ICs would not be met. The Am26S02 is activated whenever memory ICs on the board are addressed when the system clock enters \(\phi_{2}\) (negative edge). Once fired, the duration of \(\phi_{2}\) is stretched by 40 ns for every 40 ns that the STR bus signal is held LOW. Since the Am9124 EPC memory devices have an access time of 200 ns worst case, \(\phi_{2}\) must be stretched by 120 ns.
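
The 120 ns figure follows from the numbers above, as the short calculation below shows. It assumes that the whole access time must fit within \(\phi_{2}\) and that stretching is only available in 40 ns steps, and it ignores buffer and decoder delays; the function name is hypothetical.

```python
PHI2_NS = 80    # unstretched duration of phi-2
STEP_NS = 40    # stretch granularity via the STR bus signal


def required_stretch_ns(access_ns: int) -> int:
    """Return the smallest stretch, in 40 ns steps, that covers access_ns."""
    shortfall = max(0, access_ns - PHI2_NS)
    steps = -(-shortfall // STEP_NS)       # ceiling division
    return steps * STEP_NS
```

A card built with faster memory could hold STR LOW for a shorter time, and one whose access time already fits in 80 ns would not need to stretch the clock at all.
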

\section*{Summary}

As can be seen, the HEX-29 16-bit design represents a simple, straightforward design approach to building a high-performance 16-bit processor. This design takes advantage of many of the features of the Am2901 and Am2909. The instruction set shown in this application note is intended to be representative of the more common types of instructions to be executed on a machine of this class. In addition, microcode could be developed to execute a great many additional instructions as well as other classes of instruction such as an entire floating point package. This design utilizes microprogram control throughout, and is a good demonstration of parallel microprogramming in a most straightforward application.

AMD wishes to thank Mr. Mike Simmons and Mr. Lee McDonald of HEX for their work on this invited paper as a part of this application note series.

\section*{APPENDIX}

\section*{HEX-29 Microcode}

This appendix contains 256 words of HEX-29 microcode. The first part is a definition file which defines the HEX-29 hardware structure for the AMDASM \({ }^{\text {TM }}\) assembler. The various inputs to the Am2901 are defined via equates while all other microword fields are literally defined. The second part is the assembly file which symbolically, via terms defined in the definition phase, constructs each microword. Each microword begins with an optional label (such as RESET:). Next is the Am2909 branch control field, followed by all of the remaining control fields. This structure gives the appearance of a conventional assembler, i.e., LABEL, OPERATION, OPERANDS. A microinstruction which has no Am2909 branch control specified, such as microwords 3 and 4, uses microword bits 0-15 (which includes the branch control field) to place immediate data directly on the internal Am2901 bus. The Am2909 is then forced to "CONTINUE" by the "LIN" field. LIN, besides Latching IN the data on the Am2901 bus, disables the microprogram branch control register output, causing the "CONTINUE" function to be selected in the branch control PROM (see Figure 20B).

These 256 microwords represent a reasonable subset of the HEX-29 standard instructions, i.e., branch, conditional branch, data moves (MOV), and, or, add, sub, etc.

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline / & & & \& NOSWP & \& DINHI & \& NOSDA & \& NORCC & \& LCCLCV \\
\hline / & & & \& LDI & \& NOSTR & \& MWAMWB & \& FETCH & \& NOCIB \\
\hline / & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline BSR+: & \multicolumn{2}{|l|}{\multirow[t]{6}{*}{CONTINUE}} & AM2901 & R14 & & RAMF & SUBR \\
\hline / & & & \& NOINE & \& NOSDMA & \& NOVMA & \& READ & \\
\hline / & & & \& BA & \& LAD & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & \& NOSWP & \& NODIN & \& NOSDA & \& NORCC & \& LCCLCV \\
\hline / & & & \& NOLDI & \& NOSTR & & \& NOFTCH & \& NOCIB \\
\hline / & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & \multicolumn{2}{|l|}{\multirow[t]{5}{*}{brance ifetch}} & AM2901 &  & & \({ }_{8}^{\text {Rama }}\) WRITE \({ }^{\text {a }}\) & A \\
\hline \% & & & NOINE & \& NOSDMA & \% NORAM & \(8_{6}^{8}\) LMM & \& CLhio \\
\hline , & & & Nosw & \& DINHI & \(\delta\) nosda & \& Norcc & LCCLCV \\
\hline 1 & & & NOLDI & \& NOSTR & o myambi & \& NOFTCH & H 8 nocib \\
\hline 1 & & & NOCIA & \& RCLMUL & 8 NOLIN & & \\
\hline BSR-: & \multicolumn{2}{|l|}{\multirow[t]{6}{*}{CONTINUE}} & \& AM2901 & R14 & R14 & RAMF & SUBR , ZA \\
\hline / & & & \& NOINE & \& NOSDMA & \& NOVMA & \& READ & \\
\hline / & & & \& BA & \& LAD & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & \& NOSWP & \& NODIN & \& NOSDA & \& NORCC & \& LCCLCV \\
\hline / & & & \& NOLDI & \& NOSTR & \& MWAMWB & \& NOFTCH & \& NOCIB \\
\hline / & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{Efanch} & \multirow[t]{6}{*}{IFETCH} & am2901 & R15 & \({ }^{\text {R15 }}\) & Rama, & ADD . DA \\
\hline ' & & & NOINE & \& NOSDMA & \(\delta \mathrm{V}_{\text {Ma }}\) & \(\delta\) white & \\
\hline , & & & NOBA & \(\delta\) NOIAD & 8 normm & 8 LMM & NOCLB \\
\hline , & & & NOSW & \& DINHI & 5 NOSDA & \& NORCC & \% LCCLCV \\
\hline 1 & & & NOLD 1 & \& NOSTR & \(\delta\) \% mammb & \(\&\) NOFTCH & H 8 nocib \\
\hline 1 & & & NOCIA & \& RCLMUL & \(\&\) NOLIN & & \\
\hline BC+ : & \multirow[t]{5}{*}{BPCD} & \multirow[t]{5}{*}{IFETCH} & AM2961 & R15 & R15 & QREG & ADD \\
\hline , & & & NOINE & \& NOSDMA & \& NOVMA & \(\delta\) READ & \\
\hline , & & & BA & \(\delta\) NOLAD & 8 NORMM & \(\delta\) IMM & \(\delta^{6}\) CLRLIO \\
\hline \% & & & \(8{ }^{8}\) N NOSWP & \% DINHI & \% NOSDA & \(¢_{6}^{8}\) NORCC &  \\
\hline 1 & & & \(\bigcirc\) nocia & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{5}{*}{BRANCH} & \multirow[t]{5}{*}{instr} & AM2981 & R15 \({ }^{\text {, }}\) & R15 & Ramp & ADD , 2 C \\
\hline , & & & NOINE & \& SDMA \({ }_{\text {a }}\) & \% VMA & \% READ & \\
\hline 1 & & & Noba & \({ }^{5}\) LAD & \& NORMM & \(8{ }_{8}\) LIMM & \({ }_{\text {noclib }}\) \\
\hline \% & & & NoSWP & \(\delta^{8}\) NODIN & 8 nospa & \(\delta\) NCRCC & \(\delta_{\delta}^{\text {\& }} \mathrm{LCCLCV}\) \\
\hline \% & & & \({ }_{8}^{8}\) LDOCIA &  & \& M NOL In & 6 FETCH & \\
\hline BC- & \multirow[t]{6}{*}{PRC0} & \multirow[t]{6}{*}{IFETCH} & AM2901 & R15 & R15 & QREG & ADD , DA \\
\hline & & & NOINE & 8 NOSLMA & \& NOVMA & \% READ & \\
\hline , & & & BA & \(\delta\) NOIAD & \& NORMM & \& LMM & \(\delta_{8} \mathrm{NOCLB}\) \\
\hline , & & & NoSW P & \(\delta\) DINHI & \(\delta\) NOSDA & o NCRCC & \& ICCLCV \\
\hline , & & & NOLDI & \(\delta\) NOSTR & \(\delta\) MiNAMw & \& NCFTCE & O NOC \\
\hline 1 & & & Nocia & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline & \multirow[t]{5}{*}{branct} & \multirow[t]{5}{*}{INSTR} & AM2981 & R15 & R15 & ramp & DD \\
\hline 1 & & & NOINE & \(\delta\) S DMa & \% VMA & 6. READ & \\
\hline 1 & & & NOBA & \% LAD & \% NORMM & \(\delta\) LMM & \& NOCLB \\
\hline , & & &  & \& NODIN & \(\delta_{\text {\& }}^{8}\) NOSDA & \({ }_{\text {B }}^{8}\) \& NORECCH & \({ }_{\&}^{8} \mathrm{LCCLCV}\) \\
\hline 1 & & & Nocia & \& RCLMUL & \& NOLIN & & \\
\hline bnc+: & \multirow[t]{6}{*}{BRC1} & \multirow[t]{6}{*}{IFETCH} & am2901 & & R15 & QREG & ADD , DA \\
\hline , & & & NOINE & \& NOSDMA & \& NOTMA & \& READ & \\
\hline , & & & FA & \(\delta\) NOLAD & \& NORMM & \& LMM & ¢ CLRLO \\
\hline , & & & NoSWP & \(\delta^{\delta}\) DINHI & \& NOStA & 6. NORCC & \\
\hline , & & & NoLDI & \& NOSTR & \& miambi & 8 NOFTCP & F \& NOCIB \\
\hline 1 & & & nocia & \& RCLMOL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{INSTR} & am2901 & R15 & & & ADD , ZQ \\
\hline & & & NOINE & \% SDMA & \& VMA & \(\delta\) READ & \\
\hline , & & & NOBA & \% LAD & \& NORMM & \(\delta\) LMM & \(\delta\) NOCLB \\
\hline , & & & NOSW & \% NODIN & \& NOSDA & \& NORCC & LCCLCV \\
\hline , & & & LDI & \& NOSTR & \& M Mambib & \& FETCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \% NOLIN & & \\
\hline bnc- & \multirow[t]{5}{*}{BRC1} & \multirow[t]{5}{*}{IFETCH} & AM2901 & & R15 & QRES & ADD , DA \\
\hline & & & NOINE & \& NOSDMA & 8 NOVMA & & \\
\hline \% & & & BA & \(\delta\) NOLAD & 8 NORMM & 8 LMM & \& NOCLB \\
\hline , & & & NoSWp & ¢ DINHI & \& MWAMw & \% \& NOFTCE & H \& NOCIB \\
\hline , & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{brance} & \multirow[t]{6}{*}{INSTR} & \& AM2961 & R15 & & & ADD , 2Q \\
\hline 1 & & & NOINE & 8 SDMA & \& VMA & \& Read & \\
\hline 1 & & & noba & \& LAD & \& NORMM & \& LMM & \(\delta\) nocli \\
\hline , & & & \& NoSWP & \& NODIN & 8 NOSDA & \& NORCC & \(\delta\) LCCLCV \\
\hline 1 & & & LDI & \% NOSTR & ¢ MWAMw \({ }^{\text {b }}\) & \& FETCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \& NOLIN & & \\
\hline \(\mathrm{BV}+\) : & \multirow[t]{6}{*}{BRV®} & \multirow[t]{6}{*}{IfETCH} & AM2901 & \({ }^{\text {R15 }}\) ( \({ }^{\text {a }}\) & & QREG \({ }_{\text {c }}\) & DD , DA \\
\hline & & & NOINE & \% NOSDMA & 8 NOVMA & \& READ & \\
\hline / & & & BA & \& NOLAD & 8 NORMM & \& LMM & Clalo \\
\hline , & & & \& NOSWP & \(\delta\) DINHI & \& NOSDA & \& NORCC & \(\delta\) LCCLCV \\
\hline 1 & & & \({ }_{8}^{8}\) NOLDI & \& NOSTR & 8 Mwamb \({ }^{\text {d }}\) & \& NCFTCH & H \& NOCIB \\
\hline 1 & & & \& Nocia & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{InSTR} & \% AM2901 & R15 & \({ }^{\text {R15 }}\), & & ADD , 2 Q \\
\hline , & & & \(\delta\) NOINE & \& S DMA & & \(\delta_{\text {\% READ }}\) & \\
\hline 1 & & & noba & \% LAD & \% NORMM & \% LMM & \& NOCLB \\
\hline , & & & \& NOSWP & \& NODIN & \& NOSDA & \% NORCC & \& LCCLCV \\
\hline \% & & & LDI & \& NOSTR & \& MWAMEB & \& FETCH & nocib \\
\hline 1 & & & nocia & \(\delta\) RCLMUL & & & \\
\hline \({ }_{1}^{\text {BV-: }}\) & \multirow[t]{5}{*}{3RV8} & \multirow[t]{5}{*}{IfETCH} & \& AM29e1 & & & & ADD , DA \\
\hline 1 & & & \(\delta_{\delta}^{\text {N }} \mathrm{BA}\) INE & \& NOSDMA & \& NOVMA & § READ & \(\delta\) NOCLB \\
\hline \% & & &  & S DINHI & \& NOSDA & \& NORCC & 8 LCCLCV \\
\hline , & & & of NOLDI & \& NOSTR & \& mwamwi & B \(\delta\) nortch & H 6 NOCIB \\
\hline 1 & & & \(\checkmark\) nocia & \& RCLMUL & 6 NOLIN & & \\
\hline & \multirow[t]{6}{*}{branca} & \multirow[t]{6}{*}{Instr} & Am2961 & R15 & R15 & Ramp & DD , 2Q \\
\hline 1 & & & \& NOINE & \& SDMA & \% VMA & \& READ & \\
\hline , & & & ¢ NOBA & 8 IAD & \(\delta\) NORMM & 8 IMM & \(\delta^{8} \mathrm{NOCLB}\) \\
\hline 1 & & & \& nosup & 8 NODIN & \(\&\) NOSDA & 8 NORCC & 8 ICCLCD \\
\hline \% & & & LDI & 8 NOSTR & \% mwamvi & B \& PETCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline \({ }^{\text {BNV }+}\) & \multirow[t]{6}{*}{BEV1} & \multirow[t]{6}{*}{IFETCP} &  & \({ }_{\delta}^{\text {R15 }}\) NOSDMA & \({ }_{8}^{\text {R15 }}\) NOMMA & \({ }_{\delta}^{\text {QREG }}\) READ & DA \\
\hline & & & \& NOINE & \& NOSDMA & \% Novma & & \\
\hline , & & & \(\delta\) BA & \& NOLAD & \% NORMM & \& LMM & 5 CLRLO \\
\hline 1 & & & \& NOSUP & 8 DINHI & \(\delta\) NOSDA & \& NORCC & LCCLCV \\
\hline 1 & & & \& Moidi & \& NOSTR & \& Mwamwi & B \& Nortce & H \(\&\) Nocib \\
\hline / & & & \& NOCIA & \(\delta\) RCLmul & \(\delta\) NOLIN & & \\
\hline & \multirow[t]{4}{*}{BRANCH} & INSTR & \& AM2901 & & & & ADD , ZQ \\
\hline / & & & \& NOINE & \& SDMA & \& VMA & \& READ & \\
\hline / & & & & \& LAD & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & & & & & \& LCCLCV \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline / & & & \& LDI & \& NOSTR & \& MWAMWB & \& FETCH & \& NOCIB \\
\hline / & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline bnv-: & BRV1 & IFETCH & AM2901 & R15 & & QREG \({ }^{\text {a }}\) AD & DA \\
\hline , & & & NOINE & \& NOSDMA & \& novma & \(\delta\) READ & \\
\hline , & & & BA & 8 NOLAD & \& NORMM & 6 LMM & NOCLB \\
\hline 1 & & & NOSWP & \& DINEI & \& NOSDA & \(\delta\) NORCC & LCCLCV \\
\hline 1 & & & NOLDI & \& NOSTR & \& mwamwb & \& NOPTCA & NOCIB \\
\hline 1 & & & NOCIA & \& RCLIMUL & 8 NOL & & \\
\hline & BRANCH & INSTR & AM2901 & R15 & R15 & RAMF , ADD & ZQ \\
\hline / & & & \& NOINE & \& SDMA & & \& READ & \\
\hline / & & & & & \& NORMM & \& LMM & \& NOCLB \\
\hline , & & & NOSW & \& NODIM & 8 NOSDA & 8 norcc & lcaldy \\
\hline , & & & LDI & \& NOSTR & 8 MWAMwb & \& FETCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline BNV + : & brne & IFETCE & AM2961 & R1 & & QREG & DA \\
\hline 1 & & & NOINE & \& NOSDMA & & & \\
\hline 1 & & & \({ }_{\text {ba }}\) & \(\delta\) NOLAD & 8 NORMM & \({ }_{8} 8\) LMM & CLRLO \\
\hline ' & & & NOSWP & ¢ DINHI & \& NOSDA & ( \({ }_{\text {¢ }}^{\text {¢ }}\) NORCC & LCCLCV \\
\hline 1 & & & nocia & 8 rclmul & \& NOLIN & & \\
\hline & branct & Instr & AM2961 & R15 & R15 & Ramp , ADD & 2 Q \\
\hline , & & & NOINE & \& SDMA & \% VMA & \& READ & \\
\hline / & & & NOBA & \(\&\) LAD & \& NORMM & \& LMM & NOCLB \\
\hline 1 & & & NOSWP & \& NODIN & \(\delta\) NOSDA & 6 NORCC & LCCLCV \\
\hline 1 & & & LDI & \& NoSTR & \& Mwambi & B \& FETCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \& NOLIN & & \\
\hline B1-: & bRA 0 & IFETCH & AM2961 & R15 & R15 & QREG , ADD & , DA \\
\hline & & & NOINE & \& NOSDMA & \& NOVMA & \& READ & \\
\hline 1 & & & BA & \(\delta\) NOIAD & \& NORMM & \(\delta\) LMM & NOCLB \\
\hline 1 & & & NOS\#P & s. DINFI & \& NOSDA & \& NORCC & LCCLCV \\
\hline ' & & & noldi & \& NOSIR & o Mvamw \(B\) & \& NOPTCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \& NOIIN & & \\
\hline & branch & Instr & AM2901 & R15 & & Ramp & ZQ \\
\hline & & & NOINE & \% SDMA & \(\delta\) VMA & \& READ & \\
\hline , & & & NOBA & \% LAD & \(\delta\) NORMM & \% LMM & NOCLB \\
\hline 1 & & & NoSWP & 8 NODIN & \% NOSDA & \& NORCC & LCCLCV \\
\hline , & & & LDI & \& NOSTR & 8 mammb & \(B\) \& FETCH & NOCIB \\
\hline 1 & & & nocia & \(\delta\) RCLMUL & \& NOLIN & & \\
\hline bNn+: & BRM1 & IPETCH & AM2901 & R15 & R15 & QREG , ADD & DA \\
\hline , & & & 6. NOINE & 8 NOS DMA & \& NOVMA & \& READ & \\
\hline 1 & & & BA & 8 NOLAD & \& NORMM & \(\delta_{8}^{8}\) LMM \({ }_{\text {NOBCC }}^{\delta}\) & clrlo \\
\hline , & & & Noidi & 8 DNOSTR & \& Mwamib & \(B\) \& NOTTCH \& & Nocib \\
\hline 1 & & & nocia & \(\&\) RCLMUL & \& NOLIN & & \\
\hline & branch & instr & am2981 & & R15 & RAMP \({ }^{\text {a }}\) ADD & 2Q \\
\hline / & & & NOINE & \& SDMA & \(\delta^{\text {c MmA }}\) & \& READ & \\
\hline , & & & noba & \& LAD & \% NORMM & \% LMM & NOCL \\
\hline , & & & NoSWP & \(\delta\) NODIN & \(\&\) NOSDA & 8 NORCC & LCCLCV \\
\hline 1 & & & LDI & \& NOSTR & \& Mrambi & B \& PETCH & NOCIB \\
\hline 1 & & & nocia & \& RCLMUL & 8 NOLIN & & \\
\hline bNN- & BRM1 & Ifetce & AM2961 & R15 & R15 & QREG \({ }^{\text {a }}\) ADD & DA \\
\hline & & & NOINE & \& NOSDMA & \& NOVMA & \& READ & \\
\hline , & & & BA & 8 NOLAD & \% NORMM & \(\delta\) LMM & NOCLB \\
\hline 1 & & & NOSWP & \(\bigcirc\) DINEI & \& NOSDA & \& NORCC & LCCLCV \\
\hline 1 & & & NOLDI & \& NOSTR & \(\delta\) miammb & B \& NOFTCH & nocib \\
\hline 1 & & & NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & brance & INSTR & AM2981 & R15 & R15 & RAMP , ADD & 2Q \\
\hline / & & & \& NOINE & \% SDMA & \(\delta\) VMA & ¢ READ & \\
\hline 1 & & & NOBA & \& LAD & \& NORMM & \& LMM & NoCLB \\
\hline 1 & & & NoswP & \& NODIN & \& NOSDA & \% NoRCC & LCCLCV \\
\hline \% & & & LDI & \& NOSTR & \& mwamw & \& FETCB & nocib \\
\hline 1 & & & NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & BRZ 8 & IFETCH & \(\chi_{8}^{8}\) AM2901 & & & \[
\underset{\delta}{\mathrm{QREG}} \underset{\mathrm{READ}}{ }
\] & DA \\
\hline & & & \& NOINE & \& NOSDMA & \& AOVMA \& NORMM & \[
\% \text { READ }
\] & \\
\hline 1 & & & \({ }^{\text {BA }}\) NOSUP & \(\delta^{\circ}\) DINHI & \& NOSDA & \% NORCC & LCCLCV \\
\hline , & & & NOLDI & 8 NOSTR & \& mwamwb & B \& NOFTCH \& & NoCIB \\
\hline 1 & & & nocia & \& RCLMUL & 8 NOLIN & & \\
\hline & branch & Instr & am2901 & R15 & & Ramp , ADD & 2Q \\
\hline 1 & & & NOINE & \& SDMA & \% VMA & \& READ & \\
\hline 1 & & & NOBA & \& LAD & \& NORMM & \& LMM & NOCLB \\
\hline , & & & NOSW P & 8 NODIN & \& NOSDA & \& NORCC & LCCLCD \\
\hline 1 & & & LDI & \& NOSTR & \% mbamwi & \& PETCH & NOC IB \\
\hline 1 & & & nocia & \& RCLMUL & \& NOLIN & & \\
\hline B2-: & BR20 & 0008 & \& AM2901 & R15 & R15 & QREG , ADD & DA \\
\hline & & & NOINE & \& NOSDMA & 8 NOVMA & \& READ & \\
\hline 1 & & & PA & \& NOLAD & \& NORMM & \& LMM \& & NOCLB \\
\hline , & & & \& NOSWP & \& DINHI & \& NOSDA & \& NORCC \& & LCClicv \\
\hline 1 & & & NOLDI & \& NOSTR & \& mwamb \({ }^{\text {a }}\) & B \& NOFTCH & nocib \\
\hline 1 & & & nocia & \& RCLMUL & 8 NOLIN & & \\
\hline & Branch & INSTK & \& AM2901 & & \({ }^{\text {R15 }}\) & ramp & 2 Q \\
\hline 1 & & & ¢ NOINE & \& SDMA & \(\delta\) VMA & \& READ & \\
\hline , & & & \& NOBA & \& LAD & \(\delta\) NORMM & \& LMM & NOCLB \\
\hline 1 & & & NoSW P & \(\underbrace{\substack{\text { N }}}_{\text {\& }}\) & \& NOSDA & \& NORCC & LCCLCV \\
\hline 1 & & & LDI & \& NOSTR & \& MWAMWB & B \& FETCH & NOCIB \\
\hline 1 & & & \& Nocia & \& RCLMUL & \& NOLIN & & \\
\hline BNZ+ : & BRZ1 & 0088 & \% AM2901 & \({ }^{\text {R15 }}\) ( \({ }^{\text {a }}\) & & \[
\text { QREGGEAD } A D D
\] & , DA \\
\hline 1 & & & \& NOINE & \& NOSDMA & \& NOVMA & \& READ & \\
\hline , & & & BA & 8 NOIAD & \(\delta\) NORMM & \& LMM \(\delta\) & CLRLO \\
\hline 1 & & & NOSWP & \(\delta\) DINHI & \& NOSDA & \& NORCC & LCCLCV \\
\hline 1 & & & NCLDI & \& NOSTR & 8 mwamwb & \(B\) \& NOFTCH & NOCIB \\
\hline 1 & & & NOCIA & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline & branch & Instr & \& AM2901 & R15 & R15 & Ramp , ADD & ZQ \\
\hline 1 & & & NOINE & \& SDMA & \& VMA & \(\delta\) READ & \\
\hline \% & & & NOBA & \& LAD & \& NORMM & \& LMM & NOCLB \\
\hline \% & & & NOSW & \& NODIN & \& NOSDA & \% NoRCC & LCCLCV \\
\hline \% & & & LDI & 8 NOSTR & \& MwAmwb & \& FETCH & NoCib \\
\hline 1 & & & \& Nocia & \& RCLMUL & \& NOLIN & & \\
\hline BNZ-: & BR21 & 8008 & \& AMz901 & R15 & R15 & QREG , ADD & , DA \\
\hline , & & & \(\delta\) NOINE & \& NOSDMA & \& NOVMA & \(\delta\) read & \\
\hline , & & & \& BA & \& NOIAD & \& NORMM & \(\delta\) IMM & NOCLB \\
\hline , & & & NOSUP & \& DINHI & 8 NOSDA & \& NORCC & LCCLCV \\
\hline 1 & & & NOLDI & \& NOSTR & \& mwamwb & B \(\delta\) NOPTCH & NoCib \\
\hline 1 & & & \& NOCIA & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline & BRANCH & INSTR & \& AM2901 & R15 & R15 & RAMF , ADD & ZQ \\
\hline / & & & \& NOINE & \& SDMA & \& VMA & \& READ & \\
\hline / & & & \& NOBA & \& LAD & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & & & & \& NORCC & \& LCCLCV \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline / & & & \& LDI & \& NOSTR & \& MWAMWB & \& FETCH & \& NOCIB \\
\hline / & & & & \& RCLMUL & \& NOLIN & & \\
\hline \(\mathrm{BE}+\) : & \multirow[t]{5}{*}{brise} & \multirow[t]{5}{*}{IPETCK} & AM2901 & R15 & & QREG , AD & DD , DA \\
\hline , & & & NOINE & ¢ NOSDMA & \& Novma & \({ }^{6}\) c READ & \\
\hline , & & & 8. BA & 8 NDLAD & \% NORMM & 6 LMM & Clalo \\
\hline , & & & \(\delta\) NOSUP & 8 DINHI & \% NoSDA & \% Norcc & LCCLCV \\
\hline ' & & & \(\delta_{8}\) NoLDI & \(\delta_{\text {\& }}\) NOSTR &  & 6 noptce & nocib \\
\hline & \multirow[t]{5}{*}{branct} & \multirow[t]{5}{*}{INSTR} & 6 am2901 & R15 & R15 & Ramp & ADD , 2 Q \\
\hline / & & & 5 NOINE & 8 S DMA & \(\delta\) VMA & \(\delta\) READ & \\
\hline , & & & nOBA & \(\delta\) L.AD & \% NORMM & 8 LMM & NOCI \\
\hline , & & & 8 NOSUP & 8 NODIN & \(\delta\) NOSDA & 8 NORCC & LCCLCV \\
\hline 1 & & & \({ }_{8}^{8} \mathrm{LDDI}\) &  & \& MVAMVB & 6 FETCR & 8 Nocib \\
\hline BR-: & \multirow[t]{6}{*}{bRiSe} & \multirow[t]{6}{*}{IFETCH} & am2901 & & & & ADD , DA \\
\hline \% & & & \& NOINE & \& NOSDMA & \(\delta_{\text {\% Novmi }}\) & \({ }_{6}{ }^{\text {remaj }}\) & a \({ }^{\text {a da }}\) \\
\hline , & & & BA & 8 NOLAD & \& NORMM & 8 LMM & \(\delta^{8}\) NOCLB \\
\hline , & & & \(\delta\) NOSWP & \(\&\) DINHI & \& NOSDA & \({ }_{6} 6\) NORCC & 8 LCCLCV \\
\hline , & & & \& NOLDI & \& NOSTR & \% MVAMVB & & NOCIB \\
\hline 1 & & & NOCIA & \(\delta\) RCLMUL & 6 NOLIM & & \\
\hline & \multirow[t]{5}{*}{bkanch} & \multirow[t]{5}{*}{Instr} & \% AM2901 & R15 & & ramp & D \\
\hline 1 & & & \(\delta\) NOINE & \(\delta\) SDMA & \% VMA & \% Read & \\
\hline ' & & & \({ }_{8}^{\text {\& }}\) N NOBA & \& LAD & \& \% Normm & \({ }_{\text {S }}^{8}\) L LMM & acclb \\
\hline , & & & \(\delta\) LDI & \& NOSTR & \& mıamib & \& FETCH & \& NOCIB \\
\hline 1 & & & \& NOCIA & \& rcimul & \& NOLIN & & \\
\hline BNH + : & \multirow[t]{5}{*}{BRHS1} & \multirow[t]{5}{*}{IFETCH} & AM2901 & R15 & & QRE & A \\
\hline / & & & \(¢_{8}^{8} \mathrm{NOINE}\) & \(\chi_{\&}^{8}\) NOSDMA & \& NOVMA & \(\delta^{8}\) READ & \\
\hline , & & & \& NOSWP & 8 DINAI & \& NOSDA & \% NORCC & 8 Leclev \\
\hline , & & & \& NOLDI & \& NOSTR & \& mwamis & \& NOPTCH & \% 8 NOCIB \\
\hline 1 & & & \& NOCIA & \& riclmul & \& NOLIIN & & \\
\hline & \multirow[t]{5}{*}{brance} & \multirow[t]{5}{*}{InSTR} & \% AM2901 & & & & ADD , 2Q \\
\hline / & & & \& NOINE & \& SDMA & \% VMA & \(\delta\) READ & \\
\hline 1 & & & \({ }^{8}\) N NOBA & \(\delta_{8}^{8}\) LAD & \& NORMM & \(\delta_{8}^{8}\) LMM & noclb \\
\hline 1 & & & \& LDI & \(\delta^{8}\) N NOSTR & \& MVAMwb &  & \({ }_{8}^{8}\) LOCIB \\
\hline 1 & & & nocia & \& RCLMUL & \& NOLIN & & \\
\hline bnh- & \multirow[t]{5}{*}{BR H 51} & \multirow[t]{5}{*}{IPETCH} & am29e1 & & & & ADD , DA \\
\hline & & & \& NOINE & 6. NOSDMA & \& novma & \(\delta\) READ & \\
\hline \% & & & \(\delta^{8}\) BA & \& NOLAD & 6. NoRMM & \(\delta^{8}\) LMM & NOCL \\
\hline / & & & \& N NOLDP & \& DINHI & & \({ }_{8}^{8}\) NORCC & LCCLICV \\
\hline 1 & & & ¢ Nocia & \&. RCLMUL & \(\delta\) NOLIN & \% & \\
\hline & \multirow[t]{5}{*}{branct} & \multirow[t]{5}{*}{InSTR} & \% Am2901 & & R15 & & ADD , zQ \\
\hline / & & & \(\delta\) NOINE & \% SDMA & \% MMA & ¢ READ & \\
\hline , & & & noba & 6 LAD & \& NORMM & \(\delta\) LMM & NOCLB \\
\hline , & & & \& NOSWP & 8 NODIN & \(\delta\) NOSDA & 8 NORCC & \(\checkmark\) lcclev \\
\hline \% & & & \(8_{8}^{8}\) LDIL & \({ }_{8}^{8}\) N NOSTR & \& Muambr & \& FETCH & \& NOCIB \\
\hline DBNZ+: & \multirow[t]{6}{*}{CONTINUE} & \multirow[t]{6}{*}{} & \& AM2901 & R9 & R9 & RAMF & SUBR , ZA \\
\hline / & & & \& NOINE & \& SDMA & \& NOVMA & \& READ & \\
\hline / & & & & & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & \& NOSWP & \& NODIN & & \& NORCC & \\
\hline / & & & \& NOLDI & \& NOSTR & \& MWAMWB & \& NOFTCH & \& NOCIB \\
\hline / & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{BRLZ1} & \multirow[t]{6}{*}{IFETCH} & AM2901 & R15 & R15 & QREG & ADD \\
\hline / & & & \& NOINE & \& NOSDMA & \& NOVMA & \& READ & \\
\hline / & & & \& BA & \& NOLAD & \& NORMM & \& LMM & \\
\hline / & & & \& NOSWP & \& DINHI & \& NOSDA & \& NORCC & \& LCCLCV \\
\hline / & & & \& NOLDI & \& NOSTR & \& MWAMWB & \& NOFTCH & \& NOCIB \\
\hline / & & & & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{BRANCH} & \multirow[t]{6}{*}{INSTR} & \& AM2901 & R15 & R15 & RAMF & OR , ZQ \\
\hline / & & & \& NOINE & \& SDMA & \& VMA & \& READ & \\
\hline / & & & \& NOBA & \& LAD & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & \& NOSWP & \& NODIN & \& NOSDA & \& NORCC & \& LCCLCV \\
\hline / & & & \& LDI & \& NOSTR & \& MWAMWB & \& FETCH & \& NOCIB \\
\hline / & & & \& NOCIA & & \& NOLIN & & \\
\hline DBN 2-: & \multirow[t]{5}{*}{contnue} & \multirow[t]{6}{*}{} & am2901 & & & & Subr , 2A \\
\hline & & & 8 NOINE & \& SDMA & \& novmi & \(\delta\) READ & \\
\hline , & & & \(\delta^{8}\) BA & \& NOLAD & \& NORMM & 6 LMM & \(\delta^{6}\) NOCLB \\
\hline , & & & 8 NOSw? & \& NODIN & ¢ nosta & \% NORCC & \% LCCICV \\
\hline , & & & NOLDI & \& NOSTR & ¢ mwamub & \& NOFTCH & nocib \\
\hline 1 & \multirow[t]{6}{*}{BRL21} & & NOCIA & \& RCLMUL & 8 NCLIN & & \\
\hline & & \multirow[t]{5}{*}{ifetcr} & AM2901 & R15 & & qREG & ADD , DA \\
\hline \% & & & \(\delta_{8}^{8} \mathrm{NOA} \mathrm{INE}\) & \({ }_{\text {\& }}^{\&}\) NOSDMA & \& NOVMA & \({ }_{8}^{8} \mathrm{READ}\) & NOC \\
\hline 1 & & & \& NOSWP & \& DINEI & \& NOSDA & \& NORCC & \& LCCLCV \\
\hline 1 & & & \(\&\) NOLDI & \& NOSTR & \(\delta\) misamib & B \& NOFTCH & \& NOCIB \\
\hline 1 & & & \& NOCIA & \& RCLMul & \(\delta\) NOLIN & & \\
\hline & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{INSTR} & - Am2981 & R15 & R15 & RMm & 2Q \\
\hline \% & & & 8 NOINE & \% SDMA & & & \\
\hline , & & & \(\delta^{\text {c }}\) NOBA & \(\delta\) LAD & 8 NORMM & \& LMM & \& NOCLB \\
\hline , & & & \(\delta\) NOSWP & 8 NODIN & s NOSDA & \(\delta\) NORCC & LCCLCV \\
\hline \% & & & \& LDI & \& NOSTR & \& mwambi & \& FETCH & 6 NOCIB \\
\hline 1 & & & \(\delta\) nocia & \& RCLMUL & \& NOLIN & & \\
\hline SQUEEZ: & \multirow[t]{5}{*}{contnue} & \multirow[t]{5}{*}{} & \begin{tabular}{l}
\& AM2901 \\
6 NOINE
\end{tabular} & \[
\mathcal{S}^{R}{ }^{\boldsymbol{N}} \mathrm{NOSSDMÁ}^{\prime}
\] & \[
{ }_{\S}^{\mathrm{R} \theta} \mathrm{NOVMİ}
\] & \[
\underset{\delta: \text { READ }}{\text { QRIG }}
\] & D , AQ \\
\hline / & & & \(¢_{\&}{ }^{\text {BA }} \mathrm{NA}\) & \& NOLAD & S NORMM & \& LMM & NOCLB \\
\hline , & & & \& NOSuP & \& NODIN & \& NOSDA & \& NORCC & 8 LCCLCV \\
\hline 1 & & & \& NoLDI & 8 NOSTR & \(\&\) RSRS & \& NOFTCH & \& NOCIB \\
\hline 1 & & & 8 NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline & \multirow[t]{6}{*}{brance} & \multirow[t]{6}{*}{INSTR} & \& am2901 & Re & & & SUBR , DQ \\
\hline 1 & & & \(\delta\) NOINE & ¢ SDMA & \& VMA & \& READ & \\
\hline , & & & 8 NOBA & \& NoLAD & \& NORMM & 8 LMM & \& CLRLO \\
\hline , & & & \& NOSW? & \& DINBI & \(\delta\) NOSDA & \& NORCC & \& NOLCC \\
\hline \% & & & \(\delta_{6}\) LDI & \& NOSTR & \& Mwamwb & \& FETCH & \& NOCIB \\
\hline 1 & & & \& Cia & \& RCLMUL & 8 NCLIN & & \\
\hline CBE \({ }^{\text {: }}\) & \multirow[t]{6}{*}{branct} & \multirow[t]{6}{*}{BIC} & \& AM2901 & \({ }^{\text {R15 }}\) & & & ADD \\
\hline & & & \& NOINE & 8 SDMA & \& NOVMA & \& READ & \\
\hline 1 & & & 8 BA & \& NOLAD & \& NORMM & \& LMM & \& Clato \\
\hline 1 & & & 8 NOSWP & \(\delta\) DINHI & \& NoSda & \& NORCC & lccicv \\
\hline 1 & & & \& NOLDI & \& NOSTR & \& mwambi & \& NOPTCH & \& NOCIB \\
\hline 1 & & & \& NOCIA & \% RCLMUL & \& NOLIN & & \\
\hline CBE-: & \multirow[t]{4}{*}{cominue} & & \& amz901 & R15 & & & ADD , D \\
\hline 1 & & & a NOINE & \% SDMA & \& Novmi & \& READ & ADD , dA \\
\hline , & & & 8 BA & \& NOLAD & \& NORMM & \& LMM & \& noclib \\
\hline \% & & & NOLDI & \(¢_{\&}\) D NOSTR & \(¢_{8}\) N MVAMVA & \(\delta_{6}\) \& NORTCC & \[
\begin{aligned}
& \text { LCCLCV } \\
& \text { NOCIB }
\end{aligned}
\] \\
\hline
\end{tabular}


\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{} & & & \multicolumn{5}{|c|}{MUL 8 NOLIN} \\
\hline & \multicolumn{2}{|l|}{branch instr} & AM2901 & Re & & ramp \({ }^{\text {a }} 1\) & ADD , 2 Q \\
\hline , & & & NOINE & \& SDMA & & ¢ Read & \\
\hline , & & & noba & 8 NOLAD & \& Normm & \(\delta\) LMM & \({ }^{\delta} \mathrm{NOCLB}\) \\
\hline , & & & noswp & 8 NODIN & \& NOSDA & 8 NORCC & \& LCCLCV \\
\hline / & & & LDI & \% NOSTR & \(\delta\) RShS & \& PETCH & nocib \\
\hline 1 & & & \& nocia & \& RCLMUL & 8 NOLIN & & \\
\hline MVNRR : & brance & INSTR & \& AM2961 & \({ }^{R} \mathrm{~S}\) & & & DD \\
\hline 1 & & & \& NOINE & \(\$_{\&}^{8}\) SDMA & \& VIMA & \({ }_{8}^{8}\) L READ & NOCLB \\
\hline , & & & NOSt \({ }^{\text {N }}\) & \& NODIN & \(\&\) NOSDA & 8 NORCC & ICCLICV \\
\hline \% & & & LDI & \(\delta\) NOSTR & \& RSRD & \& FETCH & \& NOCIB \\
\hline 1 & & & NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline Movrr: & \multirow[t]{5}{*}{brance} & \multirow[t]{5}{*}{INSTR} & \multirow[t]{5}{*}{\[
\begin{array}{ll}
\text { \& } & \text { AM2901 } \\
\text { \& } & \text { NOINE } \\
\& & \text { NOBA } \\
\& & \text { NOSWP } \\
\text { \& } & \text { LDI } \\
\& & \text { NOCIA }
\end{array}
\]} & \multicolumn{2}{|l|}{R \({ }_{\text {d }}\), H} & \(\underset{\delta}{\text { Ramp }}\) READ & ADD \\
\hline & & & & & & & \\
\hline 1 & & & & \(\&\) NOLAD & 8 NORMM & \& LMM & \({ }_{8}^{\delta} \mathrm{NOCLE}\) \\
\hline , & & & & \& NODIN & \& NOSDA & \& FETCH & \({ }_{8}\) \% NOCIB \\
\hline 1 & & & & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline ADDRR: & \multirow[t]{6}{*}{BRANCH} & \multirow[t]{6}{*}{INSTR} & \& AM2901 & R0 , & & RAMF & , AB \\
\hline / & & & \& NOINE & \& SDMA & \& VMA & \& READ & \\
\hline / & & & \& NOBA & \& NOLAD & \& NORMM & \& LMM & \& NOCLB \\
\hline / & & & & & & \& NORCC & \& NOLCC \\
\hline / & & & \& LDI & \& NOSTR & \& RSRD & \& FETCH & \& NOCIB \\
\hline / & & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline ADCRR : & \multirow[t]{5}{*}{branct} & \multirow[t]{5}{*}{INSTR} & \multicolumn{2}{|l|}{\& AM2981} & & \[
\underset{\& ~ R E A D}{\text { RAMF }}
\] & ADD \\
\hline 1 & & & NOINE & \({ }_{6}^{8}\) SDMA & \& VMA & \({ }_{\delta}^{8} \mathrm{REAMD}\) & \\
\hline , & & & Nosup & \(\delta\) NODIN & \& NoSda & \& NORCC & NOLCC \\
\hline , & & & LDI & 8 NOSTR & \& \(\operatorname{sind}\) & \& FETCH & \& CIB \\
\hline 1 & & & nocta & \& RCLMUL & \(\&\) NOLIN & & \\
\hline SUBRR : & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{instr} & \multirow[t]{6}{*}{\[
\begin{array}{ll}
\& & \text { AM29e1 } \\
\text { \& } & \text { NOINE } \\
\text { \& } & \text { NOBA } \\
\text { NOS } \\
\text { N } & \text { LDI } \\
\& & \text { CIA }
\end{array}
\]} & \multirow[t]{2}{*}{\[
\delta \stackrel{R \theta}{S D M A}
\]} & \multirow[t]{2}{*}{\[
{ }_{\delta}^{\mathrm{R} \oplus} \mathrm{VMA}
\]} & \(\stackrel{\text { Ramp }}{\text { READ }}\) & \multirow[t]{2}{*}{SUBK , AB} \\
\hline 1 & & & & & & ¢ R READ & \\
\hline , & & & & \& NOLAD & \(\delta\) NORMM & \(\delta^{\text {\& }}\) LMM & \(\delta^{8} \mathrm{NOCLB}\) \\
\hline , & & & & \(¢_{6}\) NODIN & 8 NOSDA & \% NORCC & \& NoLCC \\
\hline \% & & & & \& NOSTR & \& RSRD
\& NOLIN & \(\checkmark\) FETCE & 6 NOCIB \\
\hline & & & & & & & \\
\hline SbCRR: & \multirow[t]{5}{*}{branct} & \multirow[t]{5}{*}{INSTR} & \multirow[t]{5}{*}{\begin{tabular}{l}
AM2901 \\
NOINE \\
NOBA \\
NOSUP \\
LDI \\
NOCIA
\end{tabular}} & \({ }^{\text {R }}\) S \({ }^{\text {d }}\), & & Ramp \({ }^{\text {, }}\) & SUbr , ab \\
\hline & & & & & & & \\
\hline \% & & & & \(\delta^{\text {c }}\) NOLAD & \& NORMM & \(\delta_{8}^{8}\) LMM NORCC & \(¢_{8}^{\text {¢ N NOCLIB }}\) \\
\hline \% & & & & & & \(\%_{8} 8\) NORCC &  \\
\hline ' & & & &  &  & \& FETCB & ¢ CIB \\
\hline and & \multirow[t]{5}{*}{BRANCH} & \multirow[t]{5}{*}{INSTR} & \multirow[t]{5}{*}{\[
\begin{aligned}
& \text { AM2901 } \\
& \text { NOINE } \\
& \text { NOBA } \\
& \text { NOSWP } \\
& \text { LDI } \\
& \text { NOCIA }
\end{aligned}
\]} & \multirow[t]{2}{*}{\({ }_{8}^{\text {Re }}\) SDMa \({ }^{\text {a }}\)} & \multirow[t]{2}{*}{\[
{ }_{\delta}^{R 6} V M A^{\prime}
\]} & \multirow[t]{2}{*}{\({ }_{\delta}^{\text {rami }}\) READ} & \multirow[t]{2}{*}{AND} \\
\hline & & & & & & & \\
\hline , & & & & \(\delta\) NOLAD & \(\delta\) NORMM & \& LMM & \& NOCLB \\
\hline , & & & & \& NODIN & \(\delta\) NOSDA & \% NORCC & LCV \\
\hline \% & & & &  & 8
\& R R A PL
NOLIN & \& FETCH & NOCI \\
\hline IORRR: & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{INSTR} & \& Am2901 & \multirow[t]{2}{*}{\[
\varepsilon_{S}^{R G}{ }_{S M A}
\]} & \multirow[t]{2}{*}{\[
{ }_{\delta}^{\mathrm{R} 0} \mathrm{MMA},
\]} & \multirow[t]{2}{*}{} & \multirow[t]{2}{*}{} \\
\hline & & & NOINE & & & & \\
\hline , & & & noba & \(\delta\) NOLAD & \& NORMM & \multicolumn{2}{|l|}{\multirow[t]{12}{*}{}} \\
\hline 1 & & & NOSW & \(\delta\) NODIN & 8 NOSDA & & \\
\hline & & & LDI & \% NOSTR & 8 RSRL & & \\
\hline 1 & & & NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline xORRR : & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{INSTR} & \& Am2961 & \multicolumn{2}{|l|}{R 6 , R 0} & \multirow[t]{2}{*}{\({ }_{\delta}^{\text {ramp }}\) Read} & \multirow[t]{2}{*}{ExOR , ab} \\
\hline , & & & ¢ NOINE & 6. SDMA & & & \\
\hline 1 & & & NOBA & \(\delta\) NOLAD & \% NORMM & & \\
\hline , & & & NOSWP & \& NODIN & \% NOSDA & & \\
\hline , & & & LDI & \& NOSTR & \(\delta\) RSAD & & \\
\hline 1 & & & nocia & \(\delta\) RCLMUL & \& NOLIN & & \\
\hline CMPRF : & \multirow[t]{6}{*}{branct} & \multirow[t]{6}{*}{INSTR} & \multirow[t]{6}{*}{\[
\begin{array}{ll}
\delta & \text { AM2901 } \\
\text { \& } & \text { NOINE } \\
\text { S } & \text { NOBA } \\
\text { \& } & \text { NOSWP } \\
\& & \text { LDI } \\
\text { S CIA }
\end{array}
\]} & \multirow[t]{6}{*}{} & \({ }^{\text {Re }}\) VMA \({ }^{\text {a }}\) & \multicolumn{2}{|l|}{\multirow[t]{2}{*}{}} \\
\hline & & & & & & & \\
\hline 1 & & & & & & & \\
\hline , & & & & & \(\delta_{8}\) NOSDA & & \\
\hline \% & & & & &  & & \\
\hline 1 & & & & & & & \\
\hline INCRR: & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{INSTR} & \multicolumn{2}{|l|}{\& AM2901 Re} & \multirow[t]{2}{*}{\[
{ }^{R \emptyset} y_{M A}
\]} & \multicolumn{2}{|l|}{\multirow[t]{2}{*}{}} \\
\hline & & & \multicolumn{2}{|l|}{\& NOINE \& SDMA} & & & \\
\hline , & & & \multirow[t]{3}{*}{NobA
NoSUP
LDI} & \multirow[t]{2}{*}{\& NOEAD} & \multirow[t]{2}{*}{\[
\begin{aligned}
& \delta \text { NORMM } \\
& \delta \text { NOSDD }
\end{aligned}
\]} & \multicolumn{2}{|l|}{\multirow[t]{4}{*}{\begin{tabular}{llll} 
\& & LMM & \& & NOCLB \\
\& NORCC & \& & LCV \\
\& & FETCB \\
\& & NOCIB
\end{tabular}}} \\
\hline , & & & & & & & \\
\hline 1 & & & & \& NOSTR & \(\delta\) RSRD & & \\
\hline 1 & & & \({ }^{5}\) L LDI & \& RCLMUL & 8 NOLIN & & \\
\hline decris: & \multirow[t]{5}{*}{branch} & \multirow[t]{5}{*}{Instr} & \multicolumn{2}{|l|}{\& AM2901 R6} & & \multicolumn{2}{|l|}{RAMF , SUbr , 2 La} \\
\hline & & & \multirow[t]{3}{*}{8 8 N NOBA} & \multirow[t]{2}{*}{\& NOLAD
\& NODIN} & \[
{ }_{\delta}^{\pi \bullet} V M A^{\prime}
\] & \multicolumn{2}{|l|}{} \\
\hline 1 & & & & & \({ }_{\text {S }}^{8}\) \% NORMM & \(8_{8}^{8}\) LMM NORCC & \\
\hline 1 & & & & \(¢_{6}\) NODIN & & \multirow[t]{2}{*}{S FETCH} & \multirow[t]{2}{*}{\& LCV} \\
\hline , & & & \[
\begin{aligned}
& \& \text { LDI } \\
& \& \text { NOCIA }
\end{aligned}
\] & \[
\begin{aligned}
& \delta_{0} \text { NOSTR } \\
& \delta . \text { RCLMUL }
\end{aligned}
\] &  & & \\
\hline & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{InSTR} & \multirow[t]{6}{*}{\[
\begin{array}{ll}
\delta & \text { AM29e1 } \\
\text { \& } & \text { NOINE } \\
\text { \& } & \text { NOBA } \\
\text { NOSWP } \\
\& & \text { LDI } \\
\text { \& } & \text { NOCIA }
\end{array}
\]} & \multirow[t]{6}{*}{} & \multirow[t]{2}{*}{\(\delta_{8}^{\text {Re }}\) VMA \({ }^{\text {a }}\)} & \multicolumn{2}{|l|}{\multirow[t]{2}{*}{Ramp head exnor, za}} \\
\hline & & & & & & & \\
\hline / & & & & & 8 NORMM & ¢ LMM & \& NOCLB \\
\hline 1 & & & & & \% NOSDA & 6 NORCC & \& LCV \\
\hline 1 & & & & & \% RSRD & 6 6. FETCH & \& NOCIB \\
\hline / & & & & & 8 NOLIN & & \\
\hline negri : & \multirow[t]{6}{*}{branct} & \multirow[t]{6}{*}{INSTR} & \multirow[t]{6}{*}{\[
\begin{array}{ll}
\delta & \text { AM29\&1 } \\
\& & \text { NOINE } \\
\text { \& } & \text { NOEA } \\
\delta & \text { NOSWP } \\
\delta & \text { LDI } \\
\delta & \text { CIA }
\end{array}
\]} & \multicolumn{2}{|l|}{\multirow[t]{2}{*}{}} & \multicolumn{2}{|l|}{\multirow[t]{2}{*}{}} \\
\hline , & & & & & & & \\
\hline , & & & & 8 NOLAD & \% NORMM & \(\delta^{8}\) LMM & NOCLIB \\
\hline , & & & & \(\delta\) NODIN & 8 NOSDA & \& NORCC & \& NOLCC \\
\hline 1 & & & & \(\delta\) NOSTR & \(\delta\) RSRD & \& FETCH & \& NOCIB \\
\hline 1 & & & & \& RCLMUL & NOLIN & & \\
\hline SUPRR: & \multirow[t]{6}{*}{contnue} & \multirow[t]{6}{*}{} & \multirow[t]{6}{*}{\begin{tabular}{l}
\& AM2901 \\
8 NOINE \\
\(\delta\) NOBA \\
\& NOSWP \\
\(\&\) LDI \\
© NOCIA
\end{tabular}} & Rø & Re , & \multicolumn{2}{|l|}{NOOP , ADD , 2 A} \\
\hline 1 & & & & \% NOSDMA & \% Novma & \(\delta^{6} \mathrm{VRITI}\) & \\
\hline , & & & & \& NOLAD & 5 NORMM & \(8_{8}\) LIMM & \(\delta^{\delta} \mathrm{NOCLB}\) \\
\hline , & & & & \& NODIN & \& NOSDA & \& NORCC & 8 LCCICP \\
\hline , & & & & 8 NOSTR & 8 RSAD & \& NOFTCE & CH \(\&\) NOCIB \\
\hline / & & & & \& RCLMUL & NOLIN & & \\
\hline & \multirow[t]{6}{*}{branch} & \multirow[t]{6}{*}{Instr} & \multirow[t]{6}{*}{\[
\begin{array}{ll}
\delta & \text { AM29D1 } \\
\text { \& } & \text { NOINE } \\
\text { \& } & \text { NOBA } \\
\delta & \text { SWPHL } \\
\text { S } & \text { IDI } \\
\delta & \text { NOCIA }
\end{array}
\]} & \multirow[t]{2}{*}{\(\delta^{R}{ }_{\text {SDM }}{ }^{\text {d }}\),} & R6 , & \multicolumn{2}{|l|}{\multirow[t]{2}{*}{\({ }_{\delta}^{\text {Ramp Read }}\) OR , DZ}} \\
\hline \% & & & & & & & \\
\hline 1 & & & & \% NOLAD & \% NORMM & \(\delta\) LMM & 8 NoCls \\
\hline 1 & & & & \% NODIN & \% NOSDA & \& NoRCC & \& LCV \\
\hline , & & & & \& NOSTR & \& RSRD & \& FETCH & \& NOC \\
\hline 1 & & & & \& RCLMUL & NOLIN & & \\
\hline U ERR : & \multicolumn{2}{|l|}{\multirow[t]{5}{*}{brancy instr}} & \multicolumn{5}{|l|}{\multirow[t]{5}{*}{}} \\
\hline & & & & & & & \\
\hline \% & & & & & & & \\
\hline \% & & & & & & & \\
\hline 1 & & & & & & & \\
\hline
\end{tabular}


\begin{tabular}{|c|c|c|c|c|c|c|}
\hline / & & & & & & \\
\hline \% & & NOSVP & \& NODIN & S NCSDA & \(\delta\) NORCC \(\delta\) & \begin{tabular}{l}
NOCLB \\
LCCLCV
\end{tabular} \\
\hline 1 & & LDI & \(\delta\) NOSTR & \(\delta\) RSRS & \(\delta\) NOFTCH \(\delta\) & vCCIE \\
\hline 1 & & cia & \& rclmul & \% nolin & & \\
\hline andin : & branch ifetch & Am29e1 & & & Ramp and & DA \\
\hline / & & \(\delta\) NOINE & \& NOSDMA & \(\delta^{\text {NOVMA }}\) & \(\delta^{8} \mathrm{READ}\) & \\
\hline , & & \& BA & \& NOLAD & \% NORMM & \& 1 MM & nCCLF \\
\hline 1 & & \(\delta\) noswp & \(\delta\) LINHL & \& Nospa & \& NORCC \(\delta\) & LCV \\
\hline \% & &  & ¢ NOSTR & \& R RRD \({ }_{\text {che }}\) & ¢ NOFTCH \(\$\) & NOCIB \\
\hline ANDI + R : & contnue & am2961 & R@ & & rama & , 2A \\
\hline & & NOINE & \& NOSDMÁ & \% VMa & \(\delta\) READ & \\
\hline , & & \(\&\) NOBA & \& LAD & 8 NORMM & \& LMM \(\delta\) & NOCLB \\
\hline , & & \(\delta\) NCSWP & 8 NODIN & \(\delta\) NoSDA & \& NORCC \(\%\) & ICCLCV \\
\hline 1 & & \& LDI & \& NOSTR & \& R RSRS & S NOFTCH \& & nccib \\
\hline 1 & & CIA & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline & branch andin & \& amz9e1 & Re & & NOOP , OR & , D2 \\
\hline / & & NOINE & 8 SDMA & \(\delta\) vma & \(\delta\) biad & \\
\hline / & & NOBA & \& LAD & \(\delta\) NORMM & \% LMM \(\delta\) & noclb \\
\hline 1 & & \(\delta\) NOSWP & \(\delta\) DINHL & \% NOSDA & \(\delta\) NORCC \(\delta\) & ICCLCV \\
\hline 1 & & \(\delta\) LDI & \& NOSTR & \& MWAMm B & B 8 NOFTCH \(\delta\) & nCCIB \\
\hline / & & nocia & \& RCLMUL & 8 NOLIN & & \\
\hline AND2R: & contaue & \[
\begin{aligned}
& \& \\
& \& \\
& \text { AM2901 } \\
& \text { NOINE }
\end{aligned}
\] & \[
\varepsilon_{\&}^{\text {R15 NOSDMA }}
\] & \[
{ }_{\delta}^{R 15}{ }_{V M A}^{\prime}
\] & \[
\underset{\delta R E A D}{R A M A}
\] & , 2A \\
\hline & & \& NOINE & \& NOSDMA & \& VMA & \& READ & \\
\hline \% & & 5 NOBA & \& LAD & & \& LMM \({ }^{\circ}\) & noclb \\
\hline , & & \& NoSwP & \& NCDIN & \& NGSDA & \& NORCC \(\delta\) & LCCLCV \\
\hline , & & \& LDI & 8 NOSTR & 8 Mvam* & \& NCFTCH \(\%\) & actib \\
\hline 1 & & cia & \& RCLmul & ¢ Nolin & & \\
\hline & branct andin & \& AM2901 & \({ }^{R D}\) & \[
R \emptyset
\] & NOCP \({ }^{\text {c }}\) ADD & , DA \\
\hline 1 & & \(\checkmark\) NOINE & ¢ S SMA & \(\delta^{\circ} \mathrm{VMA} A\) & 5 Read & \\
\hline 1 & & NOBA & \(\delta\) LAD & 5 NCRMM & 5 LMM & NOCLB \\
\hline 1 & & NOSW & \& DINHL & \& ncspa & 5 NORCC & ICCLCV \\
\hline , & & \& IDI & \& nCSTR & \(\delta\) RSPS & \& NOFTCE & nocib \\
\hline 1 & & NOCIA & 5 RCLMUL & \(s\) NOLIN & & \\
\hline ICRM+R : & ccitnue & am2901 & R9 & & rama , add & , 2A \\
\hline & & NOINE & \% SDMA & \(\delta{ }^{\text {PMA }}\) & \(\delta\) RFAD & \\
\hline , & & nOBA & s. LAD & 5 NORMM & \& LMM \({ }^{\text {\& }}\) & NOCLB \\
\hline , & & \% NOSWP & \& NODIN & 5 Nosda & \& NCRCC \(\delta\) & lecicv \\
\hline 1 & & \& LDI & s NOSTR & S RShS & \& NOFTCH \(\delta\) & NOCIB \\
\hline 1 & & cia & \& RCLMUL & \& NoLin & & \\
\hline IORIN: & bhanch ifetch & AM2901 & Re & & Ramp \({ }^{\text {a }}\) OR & DA \\
\hline & & \& NOINE & \& NOSDMA & 8 novma & \& READ & \\
\hline , & & BA & \(\&\) Nolad & \& NORMM & \(\delta\) LMM \(\delta\) & noclb \\
\hline , & & NOSWP & \& DINAL & \& NOSDA & \& NORCC \& & ICV \\
\hline 1 & & \& NOLDI & \& NCSTR & 5 R RDRD & \& NCFTCH & NOCIB \\
\hline 1 & & nocia & \& RCLMUL & 8 nolia & & \\
\hline IORI + h: & contnue & \& amzer1 & & & Rama \({ }^{\text {add }}\) & A \\
\hline & & \& NOINE & \& NOSDMA & & \& Read & \\
\hline , & & noba & \& LAD & 8 NORMM & \& LMM & noclb \\
\hline , & & \& NCSWP & \& NODIN & \(\delta\) NoSDA & \& NORCC \& & LCCLCV \\
\hline 1 & & \& LDI & \& NOSTR & \(\delta\) RSRS & \& NOFTCH \(\delta\) & nocib \\
\hline 1 & & cia & 5 rclmul & \(\delta\) NOLIN & & \\
\hline & brance iorin & \& am2901 & Re \({ }^{\text {d }}\) & & NOCP , OR & D2 \\
\hline / & & \& NOINE & \& SDMA & & \& READ & \\
\hline 1 & & \& NOBA & \& LAD & ¢ NORMM & \& LMM \(s\) & NOCLB \\
\hline 1 & & \& NOSWP & \& DINEL & \& NOSDA & \& NORCC \(\%\) & LCCLCV \\
\hline , & & \({ }_{6}\) LDI & \& NOSTR & \& MVAMWB & B \& NOFTCE \& & nocib \\
\hline 1 & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline IOR 2R: & CONTNuE & \% Am2991 & \({ }^{\text {R15 }}\) 5 \({ }^{\text {a }}\) & & RAMA ADD & 2 A \\
\hline  & & \& NOINE & & & & \\
\hline ', & & \(\delta^{\delta}\) NOBA & 8 \& LAD & & \({ }_{\&}^{\delta}\) LMM NORCC \({ }_{\delta}^{\text {¢ }}\) & \({ }_{\text {a }}^{\text {NOCLB }}\) \\
\hline \% & & \& N NOSWP & \& NODIN & 6 NOSDA &  &  \\
\hline 1 & & \& CIA & 8 RCLMUL & \& NCLIN & & \\
\hline & branch iorin & amz9e1 & & & NOOP \({ }^{\text {, }}\) ADD & , DA \\
\hline \% & & \& NOINE & \& SDMA & & & \\
\hline 1 & & \%. NOBA & 8. LAD & \(\delta\) NORMM & \(\delta\) LMM \(\delta\) & NOCLB \\
\hline 1 & & \(\delta^{8} \mathrm{NOSW}\) & \(\delta\) DINHL & \& NOSDA & \(\delta_{8}\) NORCC \({ }^{\text {d }}\) & ICCLCV \\
\hline 1 & & \(\delta\) IDI & \& NOSTR & \(\leqslant\) RSRS & \& Noftch \& & NoCib \\
\hline 1 & & \& NoCIA & \& RClmul & \& NOLIN & & \\
\hline XORM + R : & contaue & \& AM2901 & & & zama & \\
\hline & & \(\&\) NOINE & 6. SDMA & \(\delta\) Ima & ¢ READ & \\
\hline , & & a NOBA & \& LAD & \& Normm & \& LMM \(\delta\) & noclb \\
\hline , & & \& NoSWP & \& NODIN & \(\delta\) NOSDA & \& NORCC \(\%\) & LCCICV \\
\hline , & & \(\delta\) LDI & \% NCSTR & \(\delta\) R RSR & \& NOFTCH \& & nocib \\
\hline 1 & & CIA & \& RClmul & & & \\
\hline XORIN: & bRANCl Ifltch & \[
\begin{aligned}
& \& \\
& \text { AM2901 } \\
& \&
\end{aligned}
\] &  & \[
\delta_{\text {Re }} \text { NOVMÁ }
\] & \[
\underset{\& ~ R E A D}{\text { RAMF }} \text { EXOF }
\] & , DA \\
\hline , & & \& \({ }_{\text {BA }}\) & \& nolad & \& NORMM &  & nocli \\
\hline , & & \& NOSWP & 5 DINHL & \& NCSDA & \& NORCC \(\&\) & LCV \\
\hline , & & \(\delta\) N NLDI & \& NOSTR & \(\delta\) RDRD & \& NOFTCH \& & nocib \\
\hline 1 & & \& NOCIA & \& RCLMUL & \& NOLIN & & \\
\hline XCRI + P: & cominue & \& am2901 & Re & & rama , add & , 2A \\
\hline / & & \& NOINE & \& NOSDMA & \(\delta \mathrm{vma}\) & \(\delta\) fead & \\
\hline , & & \(\delta\) NOBA & \& LAD & \(\delta\) NORMM & \& Lmm \(\%\) & mocib \\
\hline 1 & & \& NoSwP & \(\&\) NODIN & 8 NoSDA & \& NORCC \({ }^{\text {s }}\) & LCCLCV \\
\hline , & & \& LDI & \& NOSTR & \(\delta\) RSRS & \& Noftch \& & nccif \\
\hline 1 & & \(\delta\) CIA & \& RCLMUL & \& NOIIN & & \\
\hline & brance xorin & \& am2901 & R 9 & & NOCP & D2 \\
\hline 1 & & \({ }_{8}{ }^{\text {N NOINE }}\) & \& SDMA & \(\delta^{\text {g jna }}\) & \& READ & \\
\hline , & & \({ }_{8}{ }_{8}\) Noba & \& LAD & ¢ NORMM & \(\delta_{8}^{\text {S L M }}\) S \({ }^{\text {s }}\) & \(\mathrm{vaclb}^{\text {a }}\) \\
\hline , & & \& LDI & \& NOSTR & 8 mwamb & \& NOFTCE \(\delta\) & ACCIb \\
\hline , & & \& Nocia & \& RCLMUL & S NOLIN & & \\
\hline XORZR: & contnue & \& am2901 & R15 & R15 , & Raya , add & , za \\
\hline 1 & & ¢ NOINE & \& NOSDMA & \(\delta\) VMA & \& READ & \\
\hline , & & 8 noba & \& LAD & \(\delta\) NORMM & \& LMM \(\delta\) & NOCIB \\
\hline , & & \& NOSWP & \& NODIN & \(\delta\) NOSDA & \& NORCC 5 & LCCLCV \\
\hline , & & \(\delta\) LDI & \& NoSTR & \(\delta\) mwamub & \& NOPTCH \(\&\) & nccib \\
\hline 1 & & \% CIA & \& RCLMUL & \(\delta\) NOLIN & & \\
\hline & branct xorin & \& AM2991 &  & \(\delta_{\delta}^{\mathrm{Ra}} \mathrm{VMA}{ }^{\text {a }}\) & \(\mathrm{O}_{0} \mathrm{OP}\) PEAD & DA \\
\hline , & & \& \({ }_{\text {\% }}\) NOBA & \(¢_{\text {\& LAD }}\) & & & nocib \\
\hline , & & \& NOSWP & \& DINHL & 8 NOSDA & \& NORCC \% & LCCLCV \\
\hline 1 & & \& LDI & \& NOSTR & \(\&\) RSRS & \& MOTTCE \(\%\) & NCCIB \\
\hline 1 & & \& nocia & \& Relmul & \(\delta\) NOLIN & & \\
\hline CMPM + R : & contnue & \[
\begin{array}{ll}
\& & \text { AM29@1 } \\
\& & \text { NOINE }
\end{array}
\] & \[
\varepsilon^{R \emptyset}{ }_{S D M A},
\] & \[
{ }_{\delta}^{R \subset} \text { VMA }
\] & \[
\underset{\& ~ R E A D}{\text { RAMA }}
\] & A \\
\hline
\end{tabular}


\begin{tabular}{|c|c|c|c|c|c|c|}
\hline & branch iecout & \% AM2981 & Re & \({ }^{R}\) & NOCP \({ }^{\text {a }}\) OR & D 2 \\
\hline 1 & & \& NOINE & 5 NOSDMA & 5 VMA & - RFAD & \\
\hline 1 & & \& NOBA & \& Lad & \& NORMM & \(\delta\) LMM & NOCLB \\
\hline 1 & & \% NOSWP & \(\&\) DINHL & ¢ NOSDA & \& NORCC & LCCLCV \\
\hline 1 & & 6 IDI & \& NOSTR & \% MWAM'B & \& NOFTCL & nOCIb \\
\hline , & & - NOCIA & \& RCLMUL & \& NOLIA & & \\
\hline DEC2F: & contaue & \& AM2901 & R15 & R15 & RAMA, ADD & 2A \\
\hline 1 & & \& NOINE & a NOS DMA & \(\delta\) VMA & \(\delta\) READ & \\
\hline 1 & & NOBA & ¢ LAD & \& NORMM & 5 LMM & NOCLB \\
\hline 1 & & NOSW & \& NCDIA & \& MOSDA & 6 NCFCC & LCCLCV \\
\hline , & & \(\delta\) LDI & \& NOSTR & \% MWame \(B\) & \& NOFTCH & nOCIb \\
\hline 1 & & \& CIA & \& RCLMUL & \& NOLIN & & \\
\hline & BRANCH LECOUT & AM2901 & R® & & NOOP , ADD & DA \\
\hline 1 & & S NOINE & \& NOSDMA & \& VMA & \(\delta\) RFAD & \\
\hline 1 & & \& NOBA & \(\delta\) Lad & \& NORMM & \(\delta\) LMM & NOCLB \\
\hline 1 & & ¢ NOSWP & \(\delta\) DINAL & \& AOSDA & \& NCFCC \% & LCCLCV \\
\hline 1 & & \& LDI & ¢ NOSTR & \% RSRS & \(\delta\) NOFTCH & NOCIB \\
\hline 1 & & \& NOCIA & \& RCLMUL & \& NOLIA & & \\
\hline COMMF: & CONTNLE & \& Am2901 & R0 & R0 & Rama , ADD & , 2A \\
\hline 1 & & \& NOINE & \& NOS DMA & \& VMA & \& READ & \\
\hline 1 & & \& NOEA & \& LAD & \& NORMM & \& LMM & NOCLB \\
\hline 1 & & \& NOSWF & \& NODIN & \& NOSDA & \& NORCC \(\%\) & LCCLCV \\
\hline 1 & & \& LDI & \& NOSTR & \& RSRS & \& NOFTCH & NOCIB \\
\hline 1 & & 6 NOCIA & \& RCLMUL & 8 NOIIN & & \\
\hline
\end{tabular}

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline COMI+R: & CONTNUE & \& & AM2901 & R0 & R0 & RAMA , ADD & ZA \\
\hline / & & \& & NOINE & \& NOSDMA & \& VMA & \& READ & \\
\hline / & & \& & NOBA & \& LAD & \& NORMM & \& LMM \& & NOCLB \\
\hline / & & \& & NOSWP & \& NODIN & \& NOSDA & \& NORCC \& & LCCLCV \\
\hline / & & \& & LDI & \& NOSTR & \& RSRS & \& NOFTCH \& & NOCIB \\
\hline / & & \& & CIA & \& RCLMUL & \& NOLIN & & \\
\hline
\end{tabular}
BRANCH COMOUT \& AM2901 R0 , R0 , NOOP , OR , DZ
\& LDI

\begin{tabular}{|c|c|c|c|c|c|c|}
\hline negcut : & branch ifetch & \& AM2901 & R \({ }^{\text {d }}\) & Ro & 2amp , SUbR & \\
\hline & & ¢ NOINE & \& NOSDMA & \(\delta\) VMA & \& hhite & \\
\hline , & & \& \(10 B A\) & \& Nolad & \& NCFMM & 5 LMM \& & NOCLB \\
\hline / & & \& NCSWP & \& DINPL & \& NOSDA & \& NORCC & NCLCC \\
\hline , & & \(\delta\) NOLDI & \(\delta\) NOS PR & \& RDEL & 5 NOFTCII & NOCIS \\
\hline / & & \& CIA & \(\bigcirc\) R RCLMUL & \& NOLIN & & \\
\hline NEGI +R: & contnue & \& AM2901 & R \(\gamma\) & & RAMA , ADD & , 2A \\
\hline & & \& NOINE & 6 NOSDMA & \& VMA & \& RFAD & \\
\hline / & & \& NCBA & 6 LAD & \& NOPMM & \(\delta\) LMM & NCCL.B \\
\hline , & & \& NOSWP & 6 NCDIN & \& NOSDA & \(\delta\) NORCC & LCCLCV \\
\hline / & & \(\delta\) IDI & 6 NOSTR & 6 RSKS & G NOFTCH & nocib \\
\hline / & & \& CIA & \(\delta\) RCLMMU & \& NOLIN & & \\
\hline
\end{tabular}
\& NOINE \& NOSDMA \& VMA \& READ
\& NOBA

\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline NESTR: & CONTNUE & \% AM2901 & R15 & R15 & RAMA \({ }^{\text {a }}\) & ADD & Z. \\
\hline / & & \& NOINE & \& NOSDMA & \& VMA & \& READ & & \\
\hline / & & \& NOBA & \& LAD & o NORMM & \& LMM & \(\delta\) & NCCLP \\
\hline , & & \& NOSWP & \(\&\) NODIN & \(\delta\) NOSDA & \& NORCC & \(\delta\) & LCCLCV \\
\hline / & & \%. LDI & \& NOSTR & \& MbAMb 3 & 5 NOFTCH & & NOC IB \\
\hline 1 & & \& CIA & \& RCLMUL & 8 NOLIN & & & \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline & BRANCH NEGOUT & AM2901 & Re & Re & NOOF \({ }^{\text {a }}\) ADD & DA \\
\hline 1 & & 8 NOINE & 6 NOSDMA & \(\delta\) VMA & \& READ & \\
\hline 1 & & \& NOBA & \& LAD & \(\delta\) NORMM & \& IMM & NOCLB \\
\hline 1 & & \& NOSWP & \& DINFL & \& NOSDA & 6 NORCC & LCCLCV \\
\hline 1 & & \& LDI & \& NOSTR & \& R RRS & \& NOPTCE 8 & NCCIE \\
\hline 1 & & \& NOCIA & \& RCLMUL & 6 NOLIN & & \\
\hline SWPMR: & CONTNUE & \& AM2901 & R 0 & R0 & Rama , ADD & \\
\hline 1 & & \& NOINE & \& NOSDMA & \(\delta\) VMA & \& READ & \\
\hline / & & \& NOBA & \& LAD & \& NORMM & \& LMM & NOCLB \\
\hline
\end{tabular}



\section*{Chapter IX Super Sixteen}

\section*{INTRODUCTION}

The AMD 16-Bit Computer design is an example of a high-speed microprocessor system that takes full advantage of AMD's Am2900 Family of bipolar microprocessor circuits to provide an economical, high-performance, self-contained 16-bit computer. It was designed to demonstrate the principles of a microprogrammed system.
This design is intended to show some of the techniques used to achieve high performance. These include pipelining at the microprogram level as well as pipelining at the macro, or machine instruction, level. A powerful instruction set is demonstrated that allows the user to write efficient programs in a minimum amount of time.
One of the unique features of the design is that in addition to using the high performance Am2900 Bipolar microprocessor family, it takes advantage of the MOS peripherals normally associated with MOS microprocessors. These are used to perform the slower functions, particularly in the I/O interface area.

\section*{SYSTEM ORGANIZATION}

The 16-Bit Computer is designed to perform in a system environment as shown in Figure 1. The system consists of a central processing unit (the 16-Bit Computer), memory units, I/O units (peripheral controllers), and a bus controller. These units communicate over the system bus, which consists of a 16-bit wide address bus, a 16-bit wide bi-directional data bus, and a control bus. The control bus is a collection of signals that includes the memory and I/O interface controls and the interrupt request lines.

This organization allows systems to be configured with more than one CPU and multiple memory and I/O units. The bus controller arbitrates requests for bus use from the CPU's or I/O units that require DMA transfers.
This application note concentrates on the design of the CPU portion of the system.

\section*{INSTRUCTIONS}

An instruction is either one or two 16-bit words in length and must be located in main memory on an integral word boundary. The left-most eight bits of the instruction are always the operation code, followed by two 4-bit register designation fields (Figure 2). The 16-bit (one-word) instruction always has this format. The 32-bit (two-word) instruction has a first (left-most) word exactly like the 16-bit instruction. The second word of the 32-bit instruction is always a full 16-bit value (d), which acts as a memory reference address or an immediate value (Figure 3). This architecturally simple instruction format becomes very powerful when implemented on a microprogrammed machine.
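The field layout described above can be illustrated with a short decoding sketch. This is a minimal illustration, not part of the original design: it assumes the left-most bit is bit 15, that the \(R_{1}\) field precedes the \(R_{2}\) field, and the names `Instruction` and `decode` as well as the opcode value used below are hypothetical.

```python
from typing import NamedTuple, Optional


class Instruction(NamedTuple):
    opcode: int          # 8-bit operation code (left-most byte of the first word)
    r1: int              # first 4-bit register designation field
    r2: int              # second 4-bit register designation field
    d: Optional[int]     # 16-bit second word, present only in the 32-bit form


def decode(word: int, second_word: Optional[int] = None) -> Instruction:
    """Split an instruction into opcode, R1, R2 and the optional value d.

    Assumption: bit 15 is the left-most bit, so the opcode occupies
    bits 15-8, R1 bits 7-4 and R2 bits 3-0.
    """
    for value in (word, second_word):
        if value is not None and not 0 <= value <= 0xFFFF:
            raise ValueError("instruction words must fit in 16 bits")
    opcode = (word >> 8) & 0xFF
    r1 = (word >> 4) & 0xF
    r2 = word & 0xF
    return Instruction(opcode, r1, r2, second_word)


# Hypothetical one-word and two-word instructions (opcode 0x3A is made up).
one_word = decode(0x3A5C)
two_word = decode(0x3A5C, 0x1234)
```

Whether a given opcode takes a second word is decided by the instruction format, so a real decoder would look that up per opcode rather than take the second word as an argument.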
The 8-bit opcode provides for 256 primary instructions, which is usually more than enough for most general-purpose computers. The 4-bit register fields (\(R_{1}\) and \(R_{2}\)) each designate one of the sixteen 16-bit registers (\(R_{0}\)-\(R_{15}\)). Depending upon the operation, each register can act either as an accumulator for arithmetic and logic operations or as an index register in modulo address arithmetic. On operations where the result is placed in a register, the \(R_{1}\) field designates the destination register and \(R_{2}\) (or \(R_{2}+d\)) is, or points to, the source field in main memory. On operations where the result is transferred from a register to memory, the \(R_{1}\) field designates the source register and \(R_{2}\) (or \(R_{2}+d\)) points to the destination memory location. Memory-to-memory transfers have \(R_{2}\) as the source pointer and \(R_{1}\) as the destination pointer. Even though the \(R_{1}\) and \(R_{2}\) fields are architecturally wired to the Am2903 register address inputs, variations of the source/destination assignment may be implemented via microcode.

Figure 1. System Organization.

Figure 2. 16-Bit Instruction (RR, RS, SS).
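The \(R_{2}\) and \(R_{2}+d\) addressing described above can be sketched as a small helper. This is an illustrative sketch only: it assumes that "modulo address arithmetic" means wrap-around at \(2^{16}\), and the function name `effective_address` is hypothetical.

```python
MASK16 = 0xFFFF  # assumption: addresses wrap modulo 2**16


def effective_address(regs, r2, d=None):
    """Return the memory address designated by R2, or by R2 + d when d is given."""
    base = regs[r2] & MASK16
    if d is None:
        return base                 # one-word form: R2 itself is the pointer
    return (base + d) & MASK16      # two-word form: indexed by the second word


# Sixteen general registers, with R3 used as an index register.
regs = [0] * 16
regs[3] = 0xFFF0
wrapped = effective_address(regs, 3, 0x0020)   # wraps past 0xFFFF
```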
The complete defined standard instruction set is given in Table 1. This is a typical "machine level" instruction set. It supports manipulation of bit, byte, word and multibyte data; PUSH/POP of single or multiple registers to/from stacks; multiple stacks; decimal, binary and integer arithmetic; byte and word I/O; and supervisory control over hardware- and software-generated interrupts.

Figure 3. 32-Bit Instruction (RX, RSI).

Table 1. 16-Bit Computer Instruction Summary (Mnemonic, Instruction, Format).

\section*{Instruction Format}

Many of the instructions have multiple formats. These formats depict addressing modes and determine where the source and destination fields are located. The defined instruction formats are shown in Figure 4.

\section*{The Program Control Unit}

The Program Control Unit (PCU), under control of the microprogram, is used to update the Program Counter and load this value into the Memory Address Register (MAR) for reading instructions/data from main memory. The PCU is also used to update the stack pointer and compare this value to the stack limits during stack operations. As can be seen in Figure 5, the Computer Block Diagram, data can be sent to the PCU from the ALU via the Transfer Register. The PCU can also output data onto the PCU bus to the Y-bus of the ALU via the bi-directional PCU transfer drivers.


Figure 4. Instruction Formats.

The instruction set consists of nine instruction groups:
- Fixed-point load/store
- Fixed-point arithmetic
- Byte
- Shift/rotate
- Branch control
- I/O
- Stack
- Extended
- System

A complete description of each instruction is given in Appendix A.

\section*{CENTRAL PROCESSING UNIT ARCHITECTURE}

\section*{Processor Organization}

The organization of the computer is shown in Figure 5 (Computer Block Diagram). The computer is organized into several distinct sections: the Program Control Unit (PCU), the Arithmetic and Logic Unit (ALU), the Computer Control Unit (CCU), the Data Path, the Memory Control and Clock Control, and the Input/Output Interface and Interrupt Section. The logic diagrams for the CPU are located in Appendix F. Earlier chapters in the Build a Microcomputer series have described the principal sections of a computer and the Am2900 components used in these sections. This chapter describes how these components are used to implement a very high-speed, low-cost computer.

The PCU is organized around four Am2901's. The use of Am2901's allows the PCU to generate addresses with the flexibility of an ALU chip, to increment the Program Counter by two in one microcycle, and to provide the stack pointer registers for stack operations in main memory. The registers of these Am2901's are defined as shown in Figure 6. Register 0 holds the program counter, and Registers 4 and 5 hold constants for incrementing. Byte addressing requires the address to be incremented by two every time 16 bits of instruction data are fetched.
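The PCU register usage can be modelled in a few lines. The sketch below follows the register assignments of Figure 6; the use of the +4 constant for two-word instructions, the inclusive limit comparison, and all function names are assumptions made for illustration rather than details taken from the logic diagrams.

```python
MASK16 = 0xFFFF

# Register numbers as assigned in Figure 6.
PC, SP, STACK_LOWER, STACK_UPPER, PLUS2, PLUS4 = range(6)


def make_pcu(pc=0, sp=0, lower=0, upper=0):
    """Build a model of the eight usable PCU registers."""
    regs = [0] * 8
    regs[PC], regs[SP] = pc, sp
    regs[STACK_LOWER], regs[STACK_UPPER] = lower, upper
    regs[PLUS2], regs[PLUS4] = 2, 4     # increment constants held in registers
    return regs


def advance_pc(regs, words=1):
    """Add one of the stored constants to the Program Counter.

    Assumption: +2 steps over a one-word instruction, +4 over a two-word one.
    """
    constant = regs[PLUS2] if words == 1 else regs[PLUS4]
    regs[PC] = (regs[PC] + constant) & MASK16
    return regs[PC]


def stack_in_limits(regs):
    """Compare the stack pointer with both limits (inclusive bounds assumed)."""
    return regs[STACK_LOWER] <= regs[SP] <= regs[STACK_UPPER]


pcu = make_pcu(pc=0x0100, sp=0x8000, lower=0x7000, upper=0x9000)
```

Holding the constants in registers lets the increment use the same register-to-register add as any other address calculation, which is one plausible reading of why Figure 6 reserves two registers for them.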

\section*{The Arithmetic and Logic Unit (ALU)}

The ALU shown in Figure 7 is organized around four Am2903's. The Am2903 performs all of the functions performed by the Am2901A but also provides the computer with separate DA bus and DB bus input ports as well as additional instructions to implement multiplication and division. Three major buses connect to the ALU: the DA, DB and \(Y\) buses. The memory data from the \(Z_{0}\) Register and microcode immediates are brought into the Am2903 through the DA port, while Program Status Bits 16-23 enter via the DB port. The Am2903's output or receive data on the \(Y\) bus for loading into the RAM registers. The Am2903's zero decode logic detects zero on the Y port whether the Y port is receiving or sending data.
To implement the defined instruction set, the RAM register selection controls are sent from the Instruction (I) Register to the Am2903's. \(\mathrm{I}_{0-3}\) (used with instructions with the \(\mathrm{R}_{2}\) or \(\mathrm{X}_{2}\) field) are connected to the A address inputs on the Am2903 while \(\mathrm{I}_{4-7}\) are connected to the \(B\) address inputs. The ALU operations performed are controlled by microcode bits \(\mathrm{M}_{78-86}\), which are connected to the Am2903 \(\mathrm{I}_{0-8}\) inputs.

\begin{tabular}{|l|l|}
\hline Register Number & Register Assignment \\
\hline 0 & Program Counter \\
\hline 1 & Stack Pointer \\
\hline 2 & Stack Lower Limit \\
\hline 3 & Stack Upper Limit \\
\hline 4 & +2 \\
\hline 5 & +4 \\
\hline 6 & Not used - available \\
\hline 7 & Not used - available \\
\hline \(8-15\) & Not used (wired disable) \\
\hline
\end{tabular}

Figure 6. PCU Register Assignments.

The Am2904 provides the microcode and machine status registers holding the carry, negative, zero and overflow status. The machine status bits C, N, Z and OVR are defined as PSW bits 16-23. Logic in the Am2904 includes a condition code multiplexer to select the true or complement of any of the four status bits and combinations of status bits from either the machine or microstatus registers or directly from the ALU. This condition code multiplexer is controlled by Instruction Register bits \(\mathrm{I}_{4-7}\), which are gated to the Am2904 \(\mathrm{I}_{0-3}\) inputs during the execution of a conditional branch. The output of the multiplexer, labeled TEST, is routed to the test tree for input into the Am2910. The Am2904 also provides the shift linkages and shift linkage control and selection of the type of carry signal to the ALU and lookahead carry unit.
The ALU is designed to work with byte operations as well as 16-bit operations. Byte operations operate only on the lower 8 bits of register data without affecting the upper 8 bits of data. During byte operations the WORD signal (\(\mathrm{M}_{90}\)) goes inactive, disabling the Write Enable and Output Y Enable for ALU bit slices 3 and 4. The word/byte multiplexer circuit selects the \(\mathrm{C}\), \(\mathrm{N}\) and OVR status bits from ALU bit slice 2, and at the same time ALU bit slice 2 has its MSS input pulled LOW to indicate most significant slice. The zero status bit, being OR-tied to all of the ALU bit slices, cannot be multiplexed. Instead, Y bus signals 8-15 are forced to zero by gating zeroes from the PCU, so that the state of the \(Z\) signal line is a function of ALU bit slices 1 and 2 only.
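The effect of a byte operation on the register contents and status bits can be sketched behaviourally. This is a simplified model, not a description of the Am2903 internals: it shows only an add, omits the OVR bit, and the function name `alu_add` is hypothetical.

```python
def alu_add(dest, operand, word=True):
    """Return (new_dest, carry, negative, zero) for an add.

    With word=False only the lower 8 bits are written, mirroring the
    disabled Write Enable on the upper two slices, and the status bits
    are taken from the lower byte alone.
    """
    if word:
        total = (dest & 0xFFFF) + (operand & 0xFFFF)
        result = total & 0xFFFF
        return result, total > 0xFFFF, bool(result & 0x8000), result == 0
    total = (dest & 0xFF) + (operand & 0xFF)
    low = total & 0xFF
    result = (dest & 0xFF00) | low       # upper byte is left untouched
    return result, total > 0xFF, bool(low & 0x80), low == 0


byte_result = alu_add(0x12FF, 0x0001, word=False)
word_result = alu_add(0x12FF, 0x0001, word=True)
```

The byte add leaves 0x12 in the upper byte and reports carry and zero from the lower byte, while the same operands in word mode simply produce 0x1300 with no carry.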

\section*{The Computer Control Unit}

The Computer Control Unit controls the sequence of execution of the microinstructions. The Am2910 Microprogram Controller provides the sequencer for the microprogram (see logic diagrams, Sheet 5). Branch addresses and counter values loaded into the Am2910 \(D_{0-11}\) inputs originate from the Pipeline Register (\(\mathrm{M}_{0-11}\)), the interrupt vector decoder, and the machine instruction decoder. The instruction decoder, also called the Mapping ROM (a \(512 \times 8\) PROM), uses Instruction Register bits \(I_{8-15}\) as address bits, with the PROM outputs being the starting address of the microcode sequence that executes each machine instruction. In this design the Am29775 Registered PROM's are used to provide both the microprogram memory (\(512 \times 96\) bits wide) and the Pipeline Register. The microcode bits \(\mathrm{M}_{16-20}\) are output from an Am29774 because these signals require open collector outputs rather than the standard tri-state outputs to allow the Am2910 inputs \(\mathrm{I}_{0-3}\) to be pulled to zero.

The starting address generation for the interrupt service routine and initialization routine is accomplished with a minimum of extra logic. During the last microcode cycle of the previous machine instruction, the MAPEN signal is activated to enable the output of the Mapping ROM. However, if an interrupt request is pending, the Mapping ROM is disabled and the pull-up resistors force the eight least significant microprogram branch address lines to all ones, vectoring the microprogram to the interrupt service routine. After a reset, the microprogram should be vectored to address zero, the starting address of the initialization routine. This is accomplished by having the reset signal force zeroes into the Am2910 \(\mathrm{I}_{0-3}\) inputs, which causes the Am2910 to output address zero.
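The three sources of a starting address described above can be summarized in a short behavioural sketch. It models only the eight least significant address lines; giving reset precedence over a pending interrupt is an assumption, and the names `next_start_address` and `mapping_rom` and the table contents are hypothetical.

```python
INTERRUPT_LOW8 = 0xFF   # pull-ups force the eight low address lines to ones
RESET_ADDRESS = 0x00    # reset makes the sequencer output address zero


def next_start_address(mapping_rom, opcode, interrupt_pending=False, reset=False):
    """Return the low eight bits of the next microprogram starting address."""
    if reset:                        # assumption: reset wins over everything else
        return RESET_ADDRESS
    if interrupt_pending:            # Mapping ROM disabled, lines pulled high
        return INTERRUPT_LOW8
    return mapping_rom[opcode & 0xFF] & 0xFF   # normal case: opcode lookup


# Hypothetical Mapping ROM contents: opcode 0x3A starts at microaddress 0x40.
mapping_rom = [0] * 256
mapping_rom[0x3A] = 0x40
```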

\section*{Clock and Memory Control}

The architecture of this computer achieves its high throughput by being able to execute machine instructions in as little as one microcycle. This is accomplished by overlapping (also called pipelining) the fetch and decode with the execute microcycles. An essential part of this design is the memory control section. The clock and memory control circuits shown in Sheet 6 of the logic diagrams work together to provide a very efficient mechanism for integrating memory operations with the computer. The memory interface timing is a clocked handshaked protocol shown in Figure 8. Each memory transfer consists of a Bus Request, Bus Acknowledge response, Memory Request, Address Accept response, Data Request and a Data Sync response. At the maximum rate, a memory interface response can occur 50 ns after the computer activates a control line. This makes it possible to read from main memory once every microcycle (\(4 \times 50 \mathrm{~ns} = 200 \mathrm{~ns}\)); however, should a particular memory board require a longer cycle, it can delay sending Data Sync to the computer to extend the cycle.
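The cycle-time arithmetic above can be worked through in a few lines. This sketch assumes that a delayed Data Sync stretches the cycle in whole 50 ns steps; the function names are hypothetical.

```python
STEP_NS = 50                # minimum response time of one handshake step
STEPS_PER_MICROCYCLE = 4    # four steps make up the 200 ns microcycle


def microcycle_ns(extra_steps=0):
    """Length of a memory microcycle when Data Sync is delayed by extra steps."""
    if extra_steps < 0:
        raise ValueError("extra_steps must be non-negative")
    return (STEPS_PER_MICROCYCLE + extra_steps) * STEP_NS


def reads_per_second(extra_steps=0):
    """Peak back-to-back read rate for a given cycle extension."""
    return 1e9 / microcycle_ns(extra_steps)


full_speed = microcycle_ns()        # 4 * 50 ns
slow_board = microcycle_ns(2)       # a board that holds off Data Sync by two steps
```

Under these assumptions a full-speed board sustains five million reads per second, and each additional step of delay costs a quarter of a microcycle.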

The read and write timing are shown in more detail in Figures 9 and 10. Note that if a memory read is taking place during microcycle \(N\), the Bus Request, Bus Acknowledge and the start of memory address are output from the computer in the previous \(N-1\) cycle, and the data is sent to the computer during the first half of the following \(N+1\) cycle. Now consider the case of back-to-back main memory read cycles. In this case, in the microcycle that the computer sends the address to the memory board, the memory board is sending data to the computer; but this is not the data associated with the address being received but the data associated with the address received during the previous microcycle.
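The overlap of back-to-back reads can be illustrated with a one-stage pipeline model. This is a deliberate simplification that tracks whole microcycles only and ignores the half-cycle detail of Figures 9 and 10; the function name and memory contents are hypothetical.

```python
def pipelined_reads(memory, addresses):
    """Return (cycle, address_sent, data_returned) for back-to-back reads.

    In each microcycle the memory board returns the data for the address
    it received in the previous microcycle while accepting a new address.
    """
    timeline = []
    in_flight = None                        # address accepted one cycle earlier
    for cycle, address in enumerate(list(addresses) + [None]):
        data = memory[in_flight] if in_flight is not None else None
        timeline.append((cycle, address, data))
        in_flight = address
    return timeline


memory = {0x0100: 0xAAAA, 0x0102: 0xBBBB}
timeline = pipelined_reads(memory, [0x0100, 0x0102])
```

In the resulting timeline the second address goes out in the same microcycle in which the first word comes back, which is the overlap the paragraph above describes.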
A free running or uncontrolled 20 MHz clock on the backplane is connected to all of the devices which effect memory transfers (CPU, bus controller, and memory modules). All of the signal handshaking that is required by the memory interface protocol is clocked with the same 20 MHz clock to ensure no metastable conditions occur during memory transfer. Careful examination of this memory interface operation will reveal that not only does it solve the very serious metastable problem, but also that the clock synchronization and bus propagation delay occur during the memory read access time (or write time) and do not slow down the memory transfer rate.
The CPU clock generation is intimately related to the Memory Control Logic. The CPU clock signals Phase 1 (\(\phi_{1}\)) and Phase 2 (\(\phi_{2}\)) are shown along with the memory interface signals in Figure 8. Phase 1 is a square wave set high at the beginning of the microcycle and has a period of 200 ns. Almost all operations of the computer are clocked with the leading edge of \(\phi_{1}\). The clock control logic will enable the next cycle only if a Bus Request has received a Bus Acknowledge and only if a Memory Request has received a Data Sync response. If the bus or memory resources of the system are temporarily being used by other processors, the computer will stop the clock and wait.

Figure 7. ALU Block Diagram.

Figure 8. Clocked Handshaked Protocol.
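The clock-enable condition described above reduces to a small Boolean function. The sketch assumes that a condition with no outstanding request does not hold the clock; the function and parameter names are hypothetical.

```python
def next_cycle_enabled(bus_request, bus_acknowledge, memory_request, data_sync):
    """Return True when the CPU clock may start the next microcycle."""
    # Assumption: with no request outstanding there is nothing to wait for.
    bus_ok = (not bus_request) or bus_acknowledge
    memory_ok = (not memory_request) or data_sync
    return bus_ok and memory_ok


waiting_for_bus = next_cycle_enabled(True, False, False, False)
transfer_done = next_cycle_enabled(True, True, True, True)
```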

\section*{Data Path}

The Data Path logic incorporates 8-bit wide devices wherever possible. The D Register drives directly onto the external data bus. Both main memory and I/O data are received through the \(Z\) Registers. Registers \(Z\), \(Z_{0}\) and \(Z_{1}\) are actually latches implemented with Am74S373's. The Z Register enable latch signal, LDZ, is derived from the memory control logic and main memory board logic, both of which are clocked with the uncontrolled 20 MHz clock (20MHzUNC). Using the uncontrolled clock allows the memory operation to go to completion at memory speed even when single stepping the microcode. This allows the system to use dynamic RAM's in the main memory, since stopping the handshaking circuits during single step would prevent refresh operations from taking place.
Data from the main memory passes through the \(Z\) Register to the \(\mathrm{Z}_{0}\) and \(\mathrm{Z}_{1}\) Registers. The \(\mathrm{Z}_{0}\) and \(\mathrm{Z}_{1}\) Registers are enabled transparent at the beginning of the microcycle following the read main memory microcycle. This allows memory data to flow through the \(Z\) and \(Z_{0}\) Registers (actually latches) to the ALU or flow through the \(Z\) and \(Z_{1}\) Registers to the Instruction Decoder (Mapping ROM). The \(Z_{1}\) and \(Z_{0}\) Registers are locked down halfway through the microcycle guaranteeing the computer solid data and making it possible to send data from the D-Register out to the external Data Bus during the second half of the same microcycle. This is another example of how this design tightly dovetails data transfers in order to gain very high execution rates.

\section*{Interrupt and Input/Output}

The interrupt and I/O section is shown in Sheet 7 of the logic diagrams.

The basic interrupt handling is controlled by the Am2914. In this design the Am2914 is used to prioritize and enable interrupts, provide the mask register, and generate an Interrupt Request and Interrupt Vector. Interrupt nesting is done in the machine software interrupt handler. The external interrupt request signals (\(\mathrm{INT}_{0}\)-\(\mathrm{INT}_{7}\)) are input into the Am2914 from the external Control Bus (C Bus). When a peripheral controller requests computer servicing, it activates its assigned interrupt line. If this interrupt level is unmasked and interrupts are enabled, the Am2914 activates the INTERRUPT REQ signal that goes to the Computer Control Unit, which causes the microprogram to vector to the microcode interrupt service routine. This microcode routine pushes the PSW onto the main memory stack, then reads the interrupt vector from the Am2914 and uses this value to vector the computer to the machine software routine that services the interrupt.
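The mask-and-prioritize behaviour described above can be modelled in a few lines. This is a behavioural sketch, not a description of the Am2914 internals: the function name is hypothetical, a mask bit of 1 is assumed to mean "masked", and the highest-numbered level is assumed to have the highest priority.

```python
def interrupt_vector(requests, mask, enabled=True):
    """Return the level (0-7) to service, or None if no request is raised.

    requests and mask are 8-bit integers; bit n corresponds to INTn.
    Assumptions: mask bit = 1 blocks that level, INT7 has top priority.
    """
    if not enabled:
        return None
    pending = requests & ~mask & 0xFF   # drop masked levels
    if pending == 0:
        return None
    return pending.bit_length() - 1     # index of the highest set bit
```

For example, with requests on levels 0, 2 and 5 and level 5 masked, the sketch selects level 2.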

The Am9519 MOS Universal Interrupt Controller is incorporated into the design and its Group Interrupt signal is connected to the least significant \(\mathrm{INT}_{0}\) input of the Am2914. The Am9519 handles an additional eight interrupt levels for low speed requesting devices. This MOS LSI component offers the computer comprehensive interrupt handling capabilities at low cost. One feature the Am9519 offers is the capability of software generated interrupts. The console function, single instruction stepping, is implemented using a microcode routine that uses the software generated interrupt capability.



The I/O protocol for the AMD 16-Bit Computer is similar to that required to control Am8080/9080 peripheral circuits. As shown in Figures 11 and 12, the computer outputs the address over the system address bus, activates a control line (e.g., IORD) and holds these outputs until receiving a response, IOACK, from the peripheral controller. Execution of the I/O operation is done almost entirely in microcode with the I/O Control Register, a single Am2920, being the only additional hardware required. This is an example of a design precept followed in this computer which is to implement all features in microcode wherever possible. This results in a low cost computer, although sometimes slower, and a design that is flexible and easily modifiable to meet new requirements.
The I/O section has two Am8251/9551 Programmable Communication Interface components giving the computer two serial I/O Ports, one of which is reserved for the console. The console can be any standard RS-232 interface terminal.

\section*{Instruction Execution}

To execute instructions, the main steps performed by the computer are: (1) form memory address, (2) instruction fetch, (3) decode, (4) displacement fetch, (5) form operand address, (6) operand fetch, and (7) execute. Every instruction type is made up of microinstructions that execute these basic steps, but most instructions require three steps or less. Instruction sequences for Register to Register (RR) and Register to Indexed Storage (RX) instructions are shown in Figures 13 and 14 to illustrate how the computer operates. These figures show the RR instruction requiring four microcycles and the typical RX instruction requiring seven microcycles. However, as will be explained later, in actual operation the effective time for an RR instruction is one microcycle and three for the RX.

\section*{Form Instruction Address}

During this microcycle the instruction address is formed by having the Program Control Unit (PCU) under control of the microprogram increment the Program Counter by two. This address is then loaded into the MAR and back into the PC.

At the beginning of the cycle, Bus Request is activated, causing the Bus Controller to respond with Bus Acknowledge. The address is then output from the MAR onto the Address Bus 50 ns prior to the beginning of the next cycle.

\section*{Instruction Fetch}

During this cycle, the main memory is fetching the contents of the address previously generated. The computer is designed to work with high-speed main memory capable of reading a memory location in one microcycle so that the instruction will be sent back to the computer at the beginning of the next cycle.

\section*{Decode Cycle}

The instruction fetched from main memory during the previous cycle is sent to the computer at the beginning of the cycle. The instruction falls through the \(Z\) and \(Z_{1}\) Registers (actually transparent latches) and is routed to the Instruction Decoder (Mapping PROM). The Instruction Decoder translates the 8-bit operation code of the instruction into an 8-bit address used as the starting address for the microprogram that will execute this instruction.
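The decode step amounts to a 256-entry table lookup. The sketch below is illustrative only: the PROM contents and the opcode value are invented, and it assumes the 8-bit operation code occupies the upper byte of the 16-bit instruction word.

```python
# Hypothetical mapping PROM: 256 entries, one 8-bit microprogram
# start address per opcode. Real contents come from the microcode.
MAPPING_PROM = [0x00] * 256
MAPPING_PROM[0x1A] = 0x40   # invented opcode 0x1A -> routine at 0x40


def decode(instruction_word):
    """Translate an instruction word into a microprogram start address."""
    opcode = (instruction_word >> 8) & 0xFF   # assumed: opcode in upper byte
    return MAPPING_PROM[opcode]
```

Because the translation is a pure lookup, changing the instruction set only requires reprogramming the PROM, not rewiring the sequencer.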


Figure 11. I/O Read Timing.


Figure 12. I/O Write Timing.
\begin{tabular}{|l|c|c|c|c|}
\hline \multirow{2}{*}{Microinstruction Operation} & \multicolumn{4}{c|}{Microcycle Time} \\
\cline { 2 - 5 } & \(\mathbf{T}_{0}\) & \(\mathbf{T}_{1}\) & \(\mathbf{T}_{2}\) & \(\mathbf{T}_{3}\) \\
\hline Form Instruction Address & A & & & \\
Instruction Fetch & & A & & \\
Decode & & & A & \\
Displacement Fetch & & & & \\
Form Operand Address & & & & \\
Operand Fetch & & & & \\
Execute & & & & A \\
\hline
\end{tabular}

Figure 13. RR Instruction Sequence.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline \multirow{2}{*}{Microinstruction Operation} & \multicolumn{7}{c|}{Microcycle Time} \\
\cline { 2 - 8 } & \(\mathbf{T}_{0}\) & \(\mathbf{T}_{1}\) & \(\mathbf{T}_{2}\) & \(\mathbf{T}_{3}\) & \(\mathbf{T}_{4}\) & \(\mathbf{T}_{5}\) & \(\mathbf{T}_{6}\) \\
\hline Form Instruction Address & B & & & & & & \\
Instruction Fetch & & B & & & & & \\
Decode & & & B & & & & \\
Displacement Fetch & & & & B & & & \\
Form Operand Address & & & & & B & & \\
Operand Fetch & & & & & & B & \\
Execute & & & & & & & B \\
\hline
\end{tabular}

Figure 14. RX Instruction Sequence.

\section*{Displacement Fetch Cycle}

After every instruction fetch another read cycle takes place. The second memory read will be another instruction fetch or an operand displacement fetch. The computer does not know what kind of a read out it is until the instruction decode is finished. For an RX instruction, after the memory read is completed, the computer identifies it as a displacement.

\section*{Form Operand Address Cycle}

The memory word is sent from the main memory at the beginning of this cycle, passes through the \(Z\) and \(\mathrm{Z}_{0}\) Registers and goes to the ALU (Am2903's). The ALU adds the displacement and the contents of the register specified by the \(\mathrm{X}_{2}\) field in the opcode and forms an operand address which is then loaded into the MAR. This has to be completed 50 ns before the end of the cycle.
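The address arithmetic of this cycle is a single indexed add. The following is a minimal sketch; the function name and the register-file representation are hypothetical, and 16-bit wraparound of the sum is an assumption.

```python
def operand_address(registers, x2, displacement):
    """Form the operand address: (register X2) + displacement.

    registers is a list of general register values; x2 is the
    register number taken from the instruction's X2 field.
    Assumption: the sum wraps at 16 bits before loading the MAR.
    """
    return (registers[x2] + displacement) & 0xFFFF
```

With register 3 holding 0x1000 and a displacement of 0x0020, the value loaded into the MAR would be 0x1020.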

\section*{Operand Fetch Cycle}

The memory read cycle is performed and the operand is sent to the computer at the beginning of the next cycle.

\section*{Execute Cycles}

As the name implies, these are the microcycles that perform the task of the instruction. With the Am2903's, normally only one execute cycle is required; however, some instructions (e.g., I/O instructions) take as many as seven execute cycles.

Simultaneously with the last execute cycle the Instruction Decoder is enabled.

\section*{Pipelined Operations}

If the architecture of the computer executed each of the instructions and each microstep sequentially, this computer would be just another computer relying on a high-speed clock to gain high throughput. However, the 16-Bit Computer becomes an exceptional machine by using pipelining techniques. In this approach, the instruction steps for the following instructions are done during the decode and execute steps of the current instruction. The pipelining operation for a Register to Register class of instructions is shown in Figure 15. With the pipeline full, note that when instruction \(A\) is being executed, instruction \(B\) is being decoded, instruction \(C\) is being fetched from Main Memory and the MAR is being loaded with the address for instruction \(D\). In the following cycle, RR instruction \(B\) is executed and RR instructions \(C\), \(D\) and \(E\) proceed through the pipeline. The pipelining technique results in an RR instruction effectively being executed in one microcycle. As illustrated in Figure 16, a new RX instruction can be executed every three microcycles.

Pipelining is great for throughput, but it is a bear to microcode, especially the first time through, since during any one cycle up to four instruction sequences have to be considered. It is not as bad as it first appears. Note that an instruction decode cannot take place until the last execute cycle of the current instruction. The major pipelining takes place during the first three steps: form memory address, instruction fetch, and decode. Execute and operand fetch steps allow full overlapped operation only during the last execute cycle. Instructions that require many execute microcycles (e.g., I/O instructions) cause the computer performance to drop down to nearly that of a non-pipelined machine.
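The effective rates quoted above follow from simple pipeline arithmetic. The sketch below assumes an ideal stream of identical instructions with no bus or memory waits and no branches; the function name is hypothetical, and the 200 ns figure is the Phase 1 period given earlier.

```python
MICROCYCLE_NS = 200  # Phase 1 period stated in the text


def total_microcycles(count, latency, issue_interval):
    """Microcycles needed to complete `count` identical instructions.

    latency: microcycles for one instruction run on its own
             (4 for RR, 7 for RX according to Figures 13 and 14).
    issue_interval: microcycles between completions with the
             pipeline full (1 for RR, 3 for RX).
    """
    if count == 0:
        return 0
    return latency + (count - 1) * issue_interval


rr_cycles = total_microcycles(100, latency=4, issue_interval=1)
rx_cycles = total_microcycles(100, latency=7, issue_interval=3)
rr_time_ns = rr_cycles * MICROCYCLE_NS
rx_time_ns = rx_cycles * MICROCYCLE_NS
```

Under these assumptions the fill time of the pipeline is paid once, so for long runs the cost per instruction approaches the issue interval rather than the full latency.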

\section*{Pipeline Operation with Regard to Branching and Interrupts}

Pipeline operations greatly reduce instruction execution time if machine instructions are executed in sequential order; however, if a branch is taken this advantage is lost because the steps set up in preparation for a decode cycle become useless. The pipeline is said to be "flushed out" when a branch is taken. The RX Branch on Condition instruction has the form:
\begin{tabular}{|c|c|c|c|}
\hline WORD 1 & OP & M & \(\mathrm{X}_{2}\) \\
\hline WORD 2 & \multicolumn{3}{|c|}{DISPLACEMENT} \\
\hline
\end{tabular}

Where: M is a 4-bit field specifying the conditions for the jump.
\(\left(\mathrm{X}_{2}\right)+\) displacement is the branch address.

Figure 17 shows the sequence chart for an RX Branch on Condition instruction. During the microcycle \(A_{1}\) the target address \(K\) for the branch is formed and loaded into the MAR and also the instruction \(B\) is fetched for the no branch case. By microcycle \(A_{2}\), it has been determined to take or not take the branch. If the branch is not taken, the MAR is loaded with address \(B+2\), while if the branch is taken, an instruction fetch is performed for \(K\) and the MAR is loaded with \(K+2\). Finally in \(\mathrm{A}_{3}\) the next instruction is decoded. By proper microcoding, the conditional branch is executed in only three microsteps even though the pipeline was "flushed out".
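The address selection made in microcycles \(A_{1}\) and \(A_{2}\) can be summarized in code. This is a sketch of the decision only, not of the microcode itself; the function name is hypothetical and 16-bit address wraparound is assumed.

```python
def branch_on_condition(condition_met, b_address, x2_value, displacement):
    """Return (next instruction address, value loaded into the MAR in A2).

    b_address is the address of instruction B, the sequential successor
    that was already fetched for the no-branch case.
    """
    k = (x2_value + displacement) & 0xFFFF      # target K formed in A1
    if condition_met:
        # Branch taken: fetch from K and point the MAR at K + 2.
        return k, (k + 2) & 0xFFFF
    # Branch not taken: B is already fetched, the MAR moves on to B + 2.
    return b_address, (b_address + 2) & 0xFFFF
```

Forming \(K\) and fetching \(B\) in the same microcycle is what keeps the cost at three microsteps for either outcome.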
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline Action & \multicolumn{14}{|r|}{A, B, C, D are RR instructions} \\
\hline Form Instruction Address & A & B & C & D & & & & & & & & & & \\
\hline Fetch Instruction & & A & B & C & D & & & & & & & & & \\
\hline Decode & & & A & B & C & D & & & & & & & & \\
\hline Fetch Displacement & & & & & & & & & & & & & & \\
\hline Form Operand Address & & & & & & & & & & & & & & \\
\hline Fetch Operand & & & & & & & & & & & & & & \\
\hline Execute & & & & A & B & C & D & & & & & & & \\
\hline & & & & & & & & & & & & & & \\
\hline & & & & & & & & & & & & & & \\
\hline
\end{tabular}

Figure 15. Register-to-Register Pipeline Operation.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline Action & \multicolumn{13}{|r|}{A, B, C, D are RX instructions} \\
\hline Form Instruction Address & A & & B & & & C & & & & & & & \\
\hline Fetch Instruction & & A & & B & & & C & & & & & & \\
\hline Decode & & & A & & & B & & & C & & & & \\
\hline Fetch Displacement & & & A & & & B & & & C & & & & \\
\hline Form Operand Address & & & & A & & & B & & & C & & & \\
\hline Fetch Operand & & & & & A & & & B & & & C & & \\
\hline Execute & & & & & & A & & & B & & & C & \\
\hline & & & & & & & & & & & & & \\
\hline & & & & & & & & & & & & & \\
\hline
\end{tabular}

Figure 16. Register-to-Indexed Storage Pipeline Operation.


Figure 17. Branch on Condition RX Pipeline Operation.

As with branching, an interrupt response alters the sequence of execution and "flushes" the pipeline. As was discussed previously in the Interrupt and Input/Output section, an interrupt request blocks the decoding of the next machine instruction and causes the Computer Control Unit to vector to the interrupt service routine. This microcode service routine pushes the PSW, consisting of flags and Program Counter (PC) value, onto the stack. The PC value is the current PC value minus 4. It is necessary to back the PC up by two instruction words (4 bytes), because the fetch instruction and form instruction address steps in the pipeline at the time of the jump to the interrupt microcode sequence have to be repeated when returning to the main machine program.
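The PC adjustment can be written out explicitly. In this small sketch the function name and the tuple layout of the saved PSW are hypothetical, and 16-bit wraparound is assumed.

```python
PIPELINE_WORDS_IN_FLIGHT = 2   # form-address and fetch steps to repeat
BYTES_PER_WORD = 2


def saved_psw(flags, current_pc):
    """Return the (flags, PC) pair pushed by the interrupt microcode."""
    backup = PIPELINE_WORDS_IN_FLIGHT * BYTES_PER_WORD   # 4 bytes
    return flags, (current_pc - backup) & 0xFFFF
```

On return from the service routine, restarting at the saved PC repeats the two pipeline steps that were discarded when the interrupt was taken.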

\section*{MICROINSTRUCTION FORMAT}

All operations of the AMD 16-Bit Computer are under control of the microinstruction. Each microinstruction is 96 bits in length. The microinstruction format is summarized in Figure 18. The microinstruction definition is summarized in Figures 19a and 19b and is detailed in Table 2.

Figure 20 illustrates the AMDASM\({ }^{\circledR}\) Definition file for the 16-Bit Computer. AMDASM\({ }^{\circledR}\) is a meta-assembler developed by AMD for writing microprograms. The definition file defines microword length (WORD statement), formats (DEF statements) and constants (EQU statements) for the use of the actual microprogram (Figure 31).

The definition file is divided into 8 parts:
1. Am2910 sequencer opcode definitions
2. Am2903 ALU opcode definitions
3. Am2901A PCU opcode definitions
4. Am2904 shift mux and status control definitions
5. Datapath control bits definitions
6. Memory control bits definitions
7. Control strobe and control bits definitions
8. Immediate operand field definition

\section*{Am2910 Sequencer}

Bit 91 of the microword drives the \(\overline{\mathrm{CCEN}}\) input of the Am2910. When bit 91 is a logical 1, the conditional operations are forced to unconditional operations. Bits 19-16 drive the instruction inputs of the Am2910. Bits 11-0 are the jump address field for instructions that need an address operand.
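The three sequencer fields can be pulled out of a 96-bit microword with ordinary shifts and masks. The sketch below treats the microword as a Python integer with bit 0 as the least significant bit; the helper and key names are hypothetical.

```python
def bits(word, high, low):
    """Extract bits high..low (inclusive) of word as an integer."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def sequencer_fields(microword):
    """Return the Am2910-related fields of a 96-bit microword."""
    return {
        "ccen": bits(microword, 91, 91),          # 1 = force unconditional
        "instruction": bits(microword, 19, 16),   # Am2910 I3-I0
        "jump_address": bits(microword, 11, 0),   # 12-bit address operand
    }
```

The same `bits` helper applies to any other field of the microword once its bit positions are known from Table 2.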

\begin{tabular}{|c|c|c|c|c|c|}
\hline \begin{tabular}{c} 
CONTROL \\
STROBES (6)
\end{tabular} & \begin{tabular}{c} 
CONTROL \\
BITS (8)
\end{tabular} & STATUS (9) & TEST (6) & \begin{tabular}{c} 
SEQUENCE \\
CONTROL (5)
\end{tabular} & \begin{tabular}{c} 
NEXT MICRO \\
ADDRS/IMMEDIATE (16)
\end{tabular} \\
\hline
\end{tabular}

Figure 18. Summary of Microinstruction Word Fields.
\begin{tabular}{|l|l|}
\hline Function & Signal \\
\hline ROUTE TO B & \(\overline{\mathrm{RTB}}\) \\
TRANSFER Z TO ZI & \(\mathrm{Z} \rightarrow \mathrm{ZI}\) (BP) \\
Am2910 & \(\overline{\mathrm{CCEN}}\) \\
\hline Am2903 IEU WORD/BYTE & \(\overline{\mathrm{WORD}}\) \\
Am2903 & \(\overline{\mathrm{EA}}\) \\
Am2903 & \(\overline{\mathrm{OEY}}\) \\
Am2903 & \(\overline{\mathrm{OEB}}\) \\
Am2903 & \(\mathrm{I}_{8}\) \\
Am2903 & \(\mathrm{I}_{7}\) \\
Am2903 & \(\mathrm{I}_{6}\) \\
Am2903 & \(\mathrm{I}_{5}\) \\
Am2903 & \(\mathrm{I}_{4}\) \\
Am2903 & \(\mathrm{I}_{3}\) \\
Am2903 & \(\mathrm{I}_{2}\) \\
Am2903 & \(\mathrm{I}_{1}\) \\
Am2903 & \(\mathrm{I}_{0}\) \\
\hline ENABLE TRANSFER REG. & \(\overline{\mathrm{ENTREG}}\) \\
LOAD TRANSFER REG. & LDTREG \\
I-REG EN CTR & ENCTR \\
I-REG INC/\(\overline{\mathrm{DEC}}\) & INC \\
PCU TRANS CHIP DISABLE & PCUCD \\
PCU TRANSFER REG. & \(\mathrm{PCU} \rightarrow \mathrm{Y}\) \\
LOAD MEMORY ADDR. REG. & LDMAR \\
LOAD D-REG & LDD \\
LOAD ZI INTO I REG. & \(\mathrm{ZI} \rightarrow \mathrm{I}\) \\
ENABLE ZO \(\rightarrow\) DA & \(\overline{\mathrm{ENZ}}_{0}\) \\
ENABLE PSW & PSW \\
SHIFT CNT Am2910 ADDR. & SHTCNTEN \\
BRANCH INSTR. EN & BRIEN \\
\hline Am2901 \(\mathrm{F} \rightarrow \mathrm{B} / \overline{\mathrm{Q}}\) & \(\mathrm{PCUI}_{7}\) \\
Am2901 & \(\mathrm{PCUI}_{3}\) \\
Am2901 & \(\mathrm{PCUI}_{2}\) \\
Am2901 & \(\mathrm{PCUI}_{1}\) \\
Am2901 & \(\mathrm{PCUI}_{0}\) \\
Am2901 & \(\mathrm{PCUA}_{2}\) \\
Am2901 & \(\mathrm{PCUA}_{1}\) \\
Am2901 & \(\mathrm{PCUA}_{0}\) \\
Am2901 & \(\mathrm{PCUB}_{2}\) \\
Am2901 & \(\mathrm{PCUB}_{1}\) \\
Am2901 & \(\mathrm{PCUB}_{0}\) \\
\hline BUS REQUEST & REQB \\
MEMORY REQUEST & MREQ \\
HOLD REQUEST & HREQ \\
MEMORY WRITE/\(\overline{\mathrm{READ}}\) & WRITE \\
MEMORY WORD/\(\overline{\mathrm{BYTE}}\) & MWORD \\
\hline
\end{tabular}

Figure 19a. Micro Control Word Bit Definitions.
\begin{tabular}{|l|l|}
\hline Function & Signal \\
\hline EN IMMEDIATE \(\rightarrow\) DA BUS & \(\overline{\mathrm{IMMD}}\) \\
ROM/IREGEN & ROM/I \\
I/O CONTROL REG. EN & \(\overline{\mathrm{IOEN}}\) \\
Am2914 INTERRUPTS DISABLE & INTDIS \\
Am2914 \(\mathrm{ENI}_{0}-\mathrm{ENI}_{3}\) & \(\overline{\mathrm{INTRIEN}}\) \\
Am2904 SHIFT EN & SHFTEN \\
\hline GENERAL USE CONTROL BITS & \(\mathrm{CNTLB}_{7}\) \\
 & \(\mathrm{CNTLB}_{6}\) \\
 & \(\mathrm{CNTLB}_{5}\) \\
 & \(\mathrm{CNTLB}_{4}\) \\
 & \(\mathrm{CNTLB}_{3}\) \\
 & \(\mathrm{CNTLB}_{2}\) \\
 & \(\mathrm{CNTLB}_{1}\) \\
 & \(\mathrm{CNTLB}_{0}\) \\
\hline Am2904 OUT EN CONDITIONAL TEST & \(\overline{\mathrm{OECT}}\) \\
Am2904 EN ZERO & \(\overline{\mathrm{EZ}}\) \\
Am2904 EN CARRY & \(\overline{\mathrm{EC}}\) \\
Am2904 EN SIGN & \(\overline{\mathrm{ES}}\) \\
Am2904 EN OVERFLOW & \(\overline{\mathrm{EOVR}}\) \\
Am2904 EN MACHINE STATUS & \(\overline{\mathrm{CEM}}\) \\
Am2904 EN MICRO STATUS & \(\overline{\mathrm{CE} \mu}\) \\
Am2904 \(\mathrm{I}_{12}\) CARRY OUT CNTL & \(\mathrm{I}_{12}\) \\
Am2904 \(\mathrm{I}_{11}\) CARRY OUT CNTL & \(\mathrm{I}_{11}\) \\
\hline Am2904 & \(\mathrm{TEST}_{5}\) \\
Am2904 & \(\mathrm{TEST}_{4}\) \\
Am2904 & \(\mathrm{TEST}_{3}\) \\
Am2904 \& Am25LS251 & \(\mathrm{TEST}_{2}\) \\
Am2904 \& Am25LS251 & \(\mathrm{TEST}_{1}\) \\
Am2904 \& Am25LS251 & \(\mathrm{TEST}_{0}\) \\
\hline Am2910 \(\mathrm{I}_{3}\) & \(\mathrm{NAC}_{3}\) \\
Am2910 \(\mathrm{I}_{2}\) & \(\mathrm{NAC}_{2}\) \\
Am2910 \(\mathrm{I}_{1}\) & \(\mathrm{NAC}_{1}\) \\
Am2910 \(\mathrm{I}_{0}\) & \(\mathrm{NAC}_{0}\) \\
\hline NEXT MICRO ADDRS/IMMEDIATE & \(\mathrm{M}_{15}\)-\(\mathrm{M}_{0}\) \\
\hline
\end{tabular}

Figure 19a. Micro Control Word Bit Definitions (Cont.)
\begin{tabular}{|c|c|c|c|c|}
\hline Control Bits (35-42) & ROM/IREGEN Bit 47 & I/O Control Register Bit 46 & Am2914 \(\mathrm{I}_{0}\)-\(\mathrm{I}_{3}\) Bit 44 & Am2904 Shift Enable Bit 43 \\
\hline \(\mathrm{CNTLB}_{7}\) & \(\mathrm{B}_{3}\) & I/O7 & & \\
\hline \(\mathrm{CNTLB}_{6}\) & \(\mathrm{B}_{2}\) & I/O6 & & \\
\hline \(\mathrm{CNTLB}_{5}\) & \(\mathrm{B}_{1}\) & I/O5 & & \\
\hline \(\mathrm{CNTLB}_{4}\) & \(\mathrm{B}_{0}\) & I/O4 & & \(\mathrm{I}_{10}\) \\
\hline \(\mathrm{CNTLB}_{3}\) & \(\mathrm{A}_{3}\) & I/O3 & \(\mathrm{I}_{3}\) & \(\mathrm{I}_{9}\) \\
\hline \(\mathrm{CNTLB}_{2}\) & \(\mathrm{A}_{2}\) & I/O2 & \(\mathrm{I}_{2}\) & \(\mathrm{I}_{8}\) \\
\hline \(\mathrm{CNTLB}_{1}\) & \(\mathrm{A}_{1}\) & I/O1 & \(\mathrm{I}_{1}\) & \(\mathrm{I}_{7}\) \\
\hline \(\mathrm{CNTLB}_{0}\) & \(\mathrm{A}_{0}\) & I/O0 & \(\mathrm{I}_{0}\) & \(\mathrm{I}_{6}\) \\
\hline
\end{tabular}

Figure 19b. Detailed Description of Bits 34 through 47.

Table 2. Microinstruction Definition.
\begin{tabular}{|c|c|l|}
\hline Bit & Signal & Definition \\
\hline 95 & \(\overline{\mathrm{RTB}}\) & Routes second register field to B-RAM of Am2903. \\
\hline 92 & \(\mathrm{Z} \rightarrow \mathrm{Z}_{1}\) & Loads the value in the \(Z\) register into the \(Z_{1}\) Register at the beginning of the microcycle. \\
\hline 91 & \(\overline{\mathrm{CCEN}}\) & Enables the CC input of the Am2910. \\
\hline \multicolumn{3}{|l|}{ALU} \\
\hline 90-78 & \(\overline{\mathrm{WORD}}\), \(\overline{\mathrm{EA}}\), \(\overline{\mathrm{OEY}}\), \(\overline{\mathrm{OEB}}\), \(\mathrm{I}_{8-0}\) & These bits control the four Am2903's. The function of EA, OEY, OEB, and \(\mathrm{I}_{8-0}\) is listed in Figure 20. WORD when enabled (LOW) causes the Am2903's to operate on words (16 bits). When disabled (HIGH) the ALU operates on bytes (the least significant byte). This bit disabled blocks WE to the upper two Am2903's and turns off their Y outputs. Zeroes should be forced to the upper 8 bits of the Y bus via the PCU to allow the zero status to operate correctly when the WORD bit is disabled. Also, when disabled the status (C, OVR, S) sent to the Am2904 is taken from the second Am2903 (numbering 0-3 least significant to most significant slice) instead of the most significant Am2903. \\
\hline 77 & ENTREG & Enable Transfer Register - enables the Transfer Register onto the DA input bus of the Am2901A's and Am2903's. \\
\hline 76 & LDTREG & Load Transfer Register - loads the Transfer Register from the Y bus. \\
\hline 75 & ENCTR & Enable I Register Counter - enables the I Register Counter (\(\mathrm{I}_{7-14}\)) to count. This value is used to address the general registers during stack instructions and by incrementing or decrementing this value the microprogram can read or write successive registers. \\
\hline 74 & INC & I Register INC/DEC - the value in \(\mathrm{I}_{7-14}\) can be either incremented (if this bit is HIGH) or decremented. \\
\hline 73 & PCUCD & PCU Transceiver Disable - when HIGH this bit disables the PCU Transceivers from receiving or transmitting data. \\
\hline 72 & \(\mathrm{PCU} \rightarrow \mathrm{Y}\) & PCU Transceiver Control - when HIGH this bit allows the PCU Transceivers to pass data from PCU to the \(Y\) bus. [WORD high (microbit 90) disables the least significant 8 bits of these transceivers.] When LOW data passes from the \(Y\) bus to the MAR. \\
\hline 71 & LDMAR & Load Memory Address Register (MAR) - this bit loads the Memory Address Register. \\
\hline 70 & LDD & Load D Register - this bit loads the D Register with data from the Y bus. \\
\hline 69 & \(\mathrm{Z}_{1} \rightarrow \mathrm{I}\) & Load \(Z_{1}\) into I Register - this bit loads data from \(Z_{1}\) into the I Register. The I Register holds only the upper 16 bits of the instruction. \\
\hline 68 & \(\overline{\mathrm{ENZ}}_{0}\) & Enable \(\mathrm{Z}_{0} \rightarrow\) DA - this bit LOW enables the \(\mathrm{Z}_{0}\) Register onto the ALU DA. \\
\hline
\end{tabular}

Table 2. Microinstruction Definition. (Cont.)
\begin{tabular}{|c|c|l|}
\hline Bit & Signal & Definition \\
\hline 67 & \(\overline{\mathrm{PSW}}\) & Enable PSW - this bit LOW enables the PSW onto the ALU DA. \\
\hline 66 & SHTCNTEN & Shift Count to Am2910 - this bit LOW enables the least significant four bits of the instruction (\(\mathrm{I}_{0-3}\)) onto the D input to the Am2910 sequencer. This allows the value to be entered into the Am2910 internal counter to be used during shift instructions. \\
\hline 65 & \(\overline{\mathrm{BRIEN}}\) & Branch Instruction Enable - this bit LOW enables \(\mathrm{I}_{4-7}\) of the Instruction Register onto the Am2904 \(\mathrm{I}_{0-3}\) input. The \(\mathrm{I}_{0-3}\) inputs control the tests of the status register. \\
\hline \multicolumn{3}{|l|}{PCU} \\
\hline 64-54 & \(\mathrm{PCUI}_{7}\), \(\mathrm{PCUI}_{3}\), \(\mathrm{PCUI}_{2}\), \(\mathrm{PCUI}_{1}\), \(\mathrm{PCUI}_{0}\), \(\mathrm{PCUA}_{2}\), \(\mathrm{PCUA}_{1}\), \(\mathrm{PCUA}_{0}\), \(\mathrm{PCUB}_{2}\), \(\mathrm{PCUB}_{1}\), \(\mathrm{PCUB}_{0}\) & These bits control the PCU which is designed around four Am2901's. The \(\mathrm{PCUI}_{7}\), \(\mathrm{PCUI}_{3}\), \(\mathrm{PCUI}_{2}\), \(\mathrm{PCUI}_{1}\) and \(\mathrm{PCUI}_{0}\) bits connect directly to the Am2901 \(\mathrm{I}_{7}\), \(\mathrm{I}_{3}\), \(\mathrm{I}_{2}\), \(\mathrm{I}_{1}\) and \(\mathrm{I}_{0}\) respectively. The \(\mathrm{PCUA}_{2}\)-\(\mathrm{PCUA}_{0}\) and \(\mathrm{PCUB}_{2}\)-\(\mathrm{PCUB}_{0}\) connect to the A and B Address inputs of the Am2901. \(\mathrm{I}_{4}\), \(\mathrm{I}_{5}\), \(\mathrm{I}_{8}\), \(\mathrm{A}_{3}\) and \(\mathrm{B}_{3}\) are tied to ground. \(\mathrm{I}_{6}\) is tied to \(\mathrm{I}_{7}\). \\
\hline 53 & REQB & Request Bus - this bit requests use of the system bus. This request is made the microcycle preceding a Memory Request or use of the bus for an I/O transfer. If the request is not honored, the processing of the next microinstruction is halted until the acknowledge is issued. \\
\hline 52 & MREQ & Memory Request - this bit requests the memory to do a read or write operation. \\
\hline 51 & HREQ & Hold Request - this bit LOW blocks the bus controller from releasing the system bus to another device. Normally a Bus Request is cleared as soon as the Bus Acknowledge is issued. HREQ holds Bus Request and prevents any other device from using the bus. \\
\hline 50 & WRITE & Memory Write/READ - this bit indicates to the memory the MREQ is for a write operation (if HIGH) and a read operation (if LOW). \\
\hline 49 & MWORD & Memory Word/BYTE - the Memory Word/BYTE microbit specifies whether the memory operation will be a word operation or a byte operation. If the operation specified is a byte operation the least significant address bit determines which byte of the two byte pair in memory is affected. If the LSBit is a zero, the most significant byte is read or written, and if the LSBit is a one, the least significant byte is read or written. \\
\hline 48 & \(\overline{\mathrm{IMMD}}\) & EN Immediate DA Bus - this bit LOW enables the 16-bit immediate value (least significant 16 bits of the microinstruction) to the ALU DA bus. \\
\hline 47 & ROM/I & ROM/I REG Enable - this bit enables either the ROM bits 42-35 or the I Register bits \(\mathrm{I}_{0-7}\) onto the \(A / B\) address inputs of the ALU according to the following: \\
\hline 46 & \(\overline{\mathrm{IOEN}}\) & I/O Control Register Enable - this bit loads the I/O Control Register with microbits 42-35. \\
\hline 45 & \(\overline{\mathrm{INTDIS}}\) & Am2914 Interrupt Disable - this bit disables the Am2914 Interrupt Controller from recognizing interrupt requests. \\
\hline 44 & \(\overline{\mathrm{INTRIEN}}\) & Am2914 \(\mathrm{ENI}_{0}-\mathrm{ENI}_{3}\) - this bit is the instruction enable for the Am2914. The instruction inputs \(\mathrm{I}_{0-3}\) are connected to microbits 35-38 respectively. \\
\hline 43 & SHFTEN & Am2904 Shift Enable - this bit is connected to the shift enable of the Am2904. The shift controls \(\mathrm{I}_{6-10}\) are connected to microbits 35-39 respectively. \\
\hline
\end{tabular}
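The byte-selection rule given for the MWORD bit can be illustrated with a small read model. This is a sketch only: the function name is hypothetical, the word/byte choice is passed as a boolean rather than as the raw microbit level, and memory is represented by a single 16-bit word value.

```python
def read_memory(word_value, address, word_access):
    """Return the data selected from one 16-bit memory word.

    word_access True  -> the whole 16-bit word is returned.
    word_access False -> the address LSB picks the byte:
        LSB = 0 selects the most significant byte,
        LSB = 1 selects the least significant byte.
    """
    if word_access:
        return word_value & 0xFFFF
    if address & 1 == 0:
        return (word_value >> 8) & 0xFF
    return word_value & 0xFF
```

For a stored word 0xABCD, a byte read at an even address yields 0xAB and a byte read at the following odd address yields 0xCD.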

Table 2. Microinstruction Definition. (Cont.)
\begin{tabular}{|c|c|l|}
\hline Bit & Signal & Definition \\
\hline 42-35 & \(\mathrm{CNTLB}_{7}\)-\(\mathrm{CNTLB}_{0}\) & This control field is used to provide several different functions as defined by the previously described control strobes (microbits 47-43). \\
\hline 34 & \(\overline{\mathrm{OECT}}\) & OUT EN CONDITIONAL TEST \\
33 & \(\overline{\mathrm{EZ}}\) & EN ZERO \\
32 & \(\overline{\mathrm{EC}}\) & EN CARRY \\
31 & \(\overline{\mathrm{ES}}\) & EN SIGN \\
30 & \(\overline{\mathrm{EOVR}}\) & EN OVERFLOW \\
29 & \(\overline{\mathrm{CEM}}\) & EN MACRO STATUS \\
28 & \(\overline{\mathrm{CE} \mu}\) & EN MICRO STATUS \\
27 & \(\mathrm{I}_{12}\) & CARRY OUT CONTROL \\
26 & \(\mathrm{I}_{11}\) & CARRY OUT CONTROL \\
\hline \multicolumn{3}{|l|}{Bits 34-26 are used to control the Am2904. Their functions are defined in Figure 21. OECT is used to enable the test output of the Am2904 to the CC input of the Am2910.} \\
\hline 25-20 & \(\mathrm{TEST}_{5}\)-\(\mathrm{TEST}_{0}\) & These bits determine which test is to be performed for the conditional branch and stack functions. The various tests are listed in Figure 25. The testing is done both in the Am2904 and an 8 to 1 multiplexer. \\
\hline 19 & \(\mathrm{NAC}_{3}\) & Am2910 \(\mathrm{I}_{3}\) \\
18 & \(\mathrm{NAC}_{2}\) & Am2910 \(\mathrm{I}_{2}\) \\
17 & \(\mathrm{NAC}_{1}\) & Am2910 \(\mathrm{I}_{1}\) \\
16 & \(\mathrm{NAC}_{0}\) & Am2910 \(\mathrm{I}_{0}\) \\
\hline \multicolumn{3}{|l|}{Bits 19-16 are connected to the \(\mathrm{I}_{3-0}\) inputs of the Am2910 to control the sequencing of the microprogram. Their definitions are listed in Figure 26.} \\
\hline 15-0 & \(\mathrm{M}_{15}\)-\(\mathrm{M}_{0}\) & These bits provide the branch address for the Am2910 and the 16-bit immediate field. \\
\hline
\end{tabular}

\section*{Am2903 ALU}

The first 16 equates assign mnemonics for \(\mathrm{I}_{8}\)-\(\mathrm{I}_{5}\) of the Am2903, which control the destination of the ALU result. The next 16 equates assign mnemonics for \(\mathrm{I}_{4}\)-\(\mathrm{I}_{1}\) of the Am2903, which control the operations of the ALU. The ALU definition indicates the default is the Y bus forced to zero with no operation on destination. The next group of definitions selects the source operand, followed by the special function definitions of the Am2903.

\section*{Am2901A PCU}

The PCU definitions include a group of often used PC instructions such as PCU.NEXT, PCU.JUMP, etc. The PCU definition itself gives the microprogrammer access to instructions that are not predefined.

\section*{Am2904 Shift Linkage Multiplexer and Status Register}

This group of equates controls the updating of the status register, and the TEST definition controls the shift linkage multiplexer. The carry control selects the carry into the least significant Am2903 slice.

\section*{Datapath Control}

The data control equates assign mnemonics to different datapath control bits.


Figure 20. Definition File for 16-Bit Computer.


Figure 20. Definition File for 16-Bit Computer (Cont.).

\section*{Memory Control}

The memory control equates assign mnemonics to different memory control bits.

\section*{Control Strobe and Control Bits}

The control strobe equates assign mnemonics to the control bit strobe signals. The control bit definition defines a hexadecimal bit pattern for the 8 control bits.

\section*{Immediate Operand}

When the Am2910 sequencer is executing an instruction that does not require an address operand, bits 15-0 of the microword can be used as a 16-bit constant to load the ALU, PCU, etc. This is accomplished by putting the constant in bits 15-0 and forcing bit 48 to logic 0.
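As a minimal sketch of the field usage described above, the following Python fragment packs a 16-bit constant into bits 15-0 of a microword and clears bit 48. Treating the microword as a plain integer and the name `with_immediate` are assumptions made for illustration; all other fields are left untouched.

```python
IMMEDIATE_MASK = 0xFFFF   # bits 15-0 of the microword
BIT_48 = 1 << 48          # forced to logic 0 when bits 15-0 carry a constant

def with_immediate(microword: int, constant: int) -> int:
    """Return the microword with bits 15-0 set to constant and bit 48 cleared."""
    if not 0 <= constant <= 0xFFFF:
        raise ValueError("constant must fit in 16 bits")
    microword &= ~(IMMEDIATE_MASK | BIT_48)   # clear the old field and bit 48
    return microword | constant
```

For example, `with_immediate(word, 0x4000)` would prepare the constant used to initialize the stack pointer.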

\section*{MICROCODE}

\section*{Flowcharts}

The flowcharts of the major instruction types are shown in the following figures.
Figure 21 illustrates the basic microprogram flowchart and demonstrates how the pipelining is done in microcode. This figure illustrates the sequencing of the computer starting with no instructions in the pipeline. By the fourth microinstruction, the pipeline is full and the CPU can execute, for example, one macroinstruction every microcycle.
Figure 22 illustrates the execution of an RR instruction. During an RR instruction, PC+6 is loaded into the MAR and a bus request is issued for the contents of PC+6. The contents of \(PC+4\) are read into the \(Z\) register. The \(Z_{1}\) and I registers are loaded with the contents of PC+2. The instruction at PC is executed. The input to the mapping PROM is loaded with the contents of PC+2. Thus, in a stream of RR instructions, four instructions are in progress concurrently.
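The overlap described for a stream of RR instructions can be tabulated with a short sketch. It assumes every instruction in the stream is a 2-byte RR instruction that completes in one microcycle; the stage names are shorthand chosen for illustration and do not appear in the microcode.

```python
def rr_pipeline(start_pc: int, cycles: int) -> list:
    """List, per microcycle, the address handled by each pipeline stage."""
    rows = []
    for i in range(cycles):
        pc = start_pc + 2 * i            # PC advances by 2 per RR instruction
        rows.append({
            "execute": pc,               # instruction at PC is executed
            "decode": pc + 2,            # PC+2 loaded into Z1, I and the mapping PROM
            "read": pc + 4,              # contents of PC+4 read into Z
            "request": pc + 6,           # PC+6 loaded into MAR, bus request issued
        })
    return rows
```

Each address moves one stage closer to execution on every microcycle, so the instruction requested in one cycle is executed three cycles later.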
Figure 23 illustrates the execution of an RX instruction. In this figure the decode operation takes the microprogram to the microstep where the form address operation is done. Since the decode of the instruction has been completed in the previous step, the form address microinstructions are unique to each RX instruction even though the operation performed is identical.


Figure 21. Microprogram Start Up Flow Chart.

From the form address step, the microprogram jumps to FETCHOP where the operand is fetched. This step returns to where the instruction is actually executed.
Figure 24 illustrates the execution of an RSI instruction. At the first microstep, the immediate operand is already in the \(\mathrm{Z}_{0}\) register, so the instruction is executed in the first step. The microprogram then jumps to START2 to refill the pipeline.


Figure 22. RR Instruction Flow Chart.


\section*{RX INSTRUCTIONS IMPLEMENTED}
\begin{tabular}{|c|c|c|}
\hline OPCODE & R1, X2 (DISP) & \\
\hline LD & R1, X2 (D) & \(\mathrm{R} 1=(\mathrm{X} 2)+\mathrm{D}\) \\
\hline ST & R1, X2 (D) & \((\mathrm{X} 2)+\mathrm{D}=(\mathrm{R} 1)\) \\
\hline ADD & R1, X2 (D) & \(\mathrm{R} 1=(\mathrm{R} 1)+[(\mathrm{X} 2)+\mathrm{D}]\), Set CC \\
\hline SUB & R1, X2 (D) & \(R 1=(R 1)-[(X 2)+D]\), Set CC \\
\hline N & R1, X2 (D) & R1 = (R1) AND [(X2) + D], Set CC \\
\hline O & R1, X2 (D) & R1 = (R1) OR [(X2) + D], Set CC \\
\hline CMP & R1, X2 (D) & Set CC FOR (R1) - [(X2) + D] \\
\hline
\end{tabular}

Figure 23. RX Type Instruction.
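The RX operations listed above can be sketched in Python. This is an illustrative model only: it reads the bracketed form [(X2) + D] as the contents of memory at address (X2) + D, represents memory as a dictionary of 16-bit words, uses the opcode mnemonics N and O for AND and OR, and does not model the condition codes. The function name and argument order are assumptions.

```python
MASK16 = 0xFFFF

def execute_rx(op, regs, mem, r1, x2, d):
    """Apply one RX instruction; condition codes are not modeled."""
    ea = (regs[x2] + d) & MASK16          # effective address (X2) + D
    if op == "ST":
        mem[ea] = regs[r1]                # store (R1) at the effective address
        return
    operand = mem.get(ea, 0)              # [(X2) + D]; unwritten memory reads as 0 here
    if op == "LD":
        regs[r1] = operand
    elif op == "ADD":
        regs[r1] = (regs[r1] + operand) & MASK16
    elif op == "SUB":
        regs[r1] = (regs[r1] - operand) & MASK16
    elif op == "N":
        regs[r1] &= operand
    elif op == "O":
        regs[r1] |= operand
    elif op == "CMP":
        pass                              # only the condition code would change
    else:
        raise ValueError(f"unknown RX opcode {op!r}")
```

The address formation is the same for every opcode, which mirrors the shared form address step followed by the operand fetch.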


Figure 24. Immediate Instructions.

Figure 25 illustrates the execution of an unconditional branch instruction. At the first microstep the displacement is already in the \(Z_{0}\) register. The branch address is formed by adding the contents of the \(Z_{0}\) register to the contents of the index register \(X_{1}\). The MAR is loaded with the branch address and a bus request is issued for the contents of the branch address. The branch address is also loaded into the transfer register for subsequent loading of the PC. In the next step, the contents of the transfer register plus 2 are loaded into the PC and MAR. A bus request is issued for \(BA+2\). The content of \(BA\) is read. The microprogram then transfers to START2 to fill up the pipeline.
Figure 26 illustrates the Conditional Branch instruction. In step 1, unlike the Unconditional Branch instruction, the contents of memory (instruction \(\mathrm{N}+1\)) are read, in case the test condition fails and the macroprogram falls through. The condition test is enabled in this step. If the test passes, the microprogram transfers to the Unconditional Branch routine. If the test fails, the microprogram proceeds to fill the pipeline and continue.

Figure 27 illustrates the branch and link instruction. The flowchart is similar to Unconditional Branch except an extra step (STEP 2) is inserted. This step saves PC in \(\mathrm{R}_{1}\).
Figure 28 illustrates a shift or rotate instruction. In STEP 1 the opcode of the next instruction is loaded into \(Z_{1}\) registers and the shift count of the shift instruction is loaded into the loop counter of the Am2910. STEP 2 executes the shift \(\mathrm{N}+1\) times, where N is the shift count in the instruction. Note that since the Am2910 detects -1 as the stop condition, the shift count loaded should be one less than the desired count. STEP 3 is the same as the RNI (request next instruction). It is duplicated because the fail condition of RPCT in the Am2910 can only fall through.
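The off-by-one relationship between the loaded count and the number of shifts can be checked with a small model. The sketch below follows the description in the text (the loop stops when the counter reaches -1) rather than the Am2910 data sheet; the function name and the choice of a 16-bit left shift as the repeated operation are assumptions for illustration.

```python
def repeat_shift(value: int, loaded_count: int):
    """Shift left once per pass until the counter reaches -1.

    Returns the shifted 16-bit value and the number of passes made.
    """
    counter = loaded_count
    passes = 0
    while True:
        value = (value << 1) & 0xFFFF    # one pass of STEP 2
        passes += 1
        counter -= 1                     # loop counter decrements each pass
        if counter == -1:                # stop condition described in the text
            break
    return value, passes
```

Loading 3 therefore produces 4 shifts, so a desired shift of 4 places requires a loaded count of 3.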


Figure 25. Unconditional Branch.


Figure 26. Conditional Branch.

Figure 29 illustrates the input instruction. In STEP 1, the I/O Port Address is formed by adding \(\mathrm{Z}_{0}\) and \(\mathrm{X}_{2}\). A bus request is issued for the I/O Port. The desired width of the I/O read pulse is loaded into the Am2910 Loop Counter. The width of the I/O read pulse is \((N+2) \times\) cycle time, where \(N\) is the number loaded. The I/O read signal is turned on. In STEP 2, the bus is held for the I/O address and the loop counter is decremented until it becomes -1. In STEP 3, the I/O read pulse is turned off but the I/O address is held for a possible address hold time requirement of the I/O device. On the trailing edge of the I/O read pulse, the content of the I/O Port is strobed into the \(Z_{0}\) register. In STEP 4, the content of the \(Z_{0}\) register is loaded into \(R_{1}\), thus completing the I/O read. A bus request is issued for the next instruction and the microprogram jumps to START1 to refill the pipeline.
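The pulse-width relation \((N+2) \times\) cycle time can be turned into a small helper for choosing the loop counter value. This is a sketch under stated assumptions: the function names are hypothetical, and the cycle time and required pulse width used in any call are placeholders, not figures from the design.

```python
import math

def pulse_width_ns(n: int, cycle_ns: float) -> float:
    """Width of the I/O read pulse for a loaded count n: (n + 2) cycles."""
    return (n + 2) * cycle_ns

def loop_count_for_pulse(min_width_ns: float, cycle_ns: float) -> int:
    """Smallest non-negative n whose pulse is at least min_width_ns wide."""
    cycles_needed = math.ceil(min_width_ns / cycle_ns)
    return max(cycles_needed - 2, 0)     # the pulse is never shorter than 2 cycles
```

With a hypothetical 100 ns cycle, a device needing a 500 ns read pulse would call for a loaded count of 3.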

Figure 30 illustrates the output instruction. In STEP 1, a bus request is issued for the I/O Port Address. In STEP 2, the content of \(R_{1}\) is transferred to the D register for outputting to the data bus. The I/O write pulse is set and the width of the write pulse is loaded into the Am2910 Loop Counter as in the input instruction. In STEP 3, the I/O address is held until the loop counter becomes -1. In STEP 4, the content of the D register is strobed into the I/O Port by turning off the I/O Write Pulse. The microprogram jumps to START to refill the pipeline.

Figures 21-30 illustrate the major instruction types implemented. These are by no means the only possible instructions for the 16-bit computer described. Some other instructions, such as stack instructions, are shown in the microcode but not in the figures and should be easily understood with the above examples as a guide.

Figure 31 illustrates the implementation of some typical instructions. Instruction 0 is the restart instruction. It jumps to INIT, which is located at H\#180: because the mapping PROM maps only into the first 256 locations, it is desirable to preserve these locations for macroinstructions. The initialization routine does the following:
1. Turn on I/O reset signal and jump (Inst H\#0)
2. Set \(\mathrm{R}_{0}\) in ALU to 0 (Inst H\#180)
3. Set \(\mathrm{R}_{0}\) in PCU (PC) to 0 (Inst H\#181)
4. Set \(\mathrm{R}_{1}\) in PCU (SP) to H\#4000 (Inst H\#182)
5. Set \(\mathrm{R}_{4}\) in PCU to 2 (Inst H\#183)
6. Set \(\mathrm{R}_{5}\) in PCU to 4 (Inst H\#184)
7. Turn off I/O reset signal (Inst H\#185)
8. Initialize console USART (Inst H\#186-H\#190)

The microinstructions that execute macroinstructions are grouped as follows:
\begin{tabular}{lcc}
\multicolumn{1}{c}{ Type } & Figure & \begin{tabular}{c} 
Microinst \# \\
(Hex)
\end{tabular} \\
RR Instructions & 22 & \(005-00 \mathrm{~B}\) \\
RX Instructions & 23 & \(00 \mathrm{C}-01 \mathrm{~B}\) \\
RSI Instructions & 24 & \(01 \mathrm{C}-022\) \\
Branch Instructions & \(25-27\) & \(023-02 \mathrm{~A}\) \\
Shift Instructions & 28 & \(02 \mathrm{~B}-042\) \\
Input Instruction & 29 & \(043-046\) \\
Output Instruction & 30 & \(047-04 \mathrm{~A}\) \\
Stack Instructions & - & \(04 \mathrm{~B}-059\) \\
Interrupt Instructions & - & \(05 \mathrm{~A}-061\)
\end{tabular}
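When reading the listing, it can help to look up which group a given microinstruction address belongs to. The following sketch encodes the ranges from the table above; the function name and the use of `None` for addresses outside the table (such as the initialization routine at H#180) are illustrative choices.

```python
# (first address, last address, group) taken from the table above
MICROCODE_GROUPS = [
    (0x005, 0x00B, "RR Instructions"),
    (0x00C, 0x01B, "RX Instructions"),
    (0x01C, 0x022, "RSI Instructions"),
    (0x023, 0x02A, "Branch Instructions"),
    (0x02B, 0x042, "Shift Instructions"),
    (0x043, 0x046, "Input Instruction"),
    (0x047, 0x04A, "Output Instruction"),
    (0x04B, 0x059, "Stack Instructions"),
    (0x05A, 0x061, "Interrupt Instructions"),
]

def group_for(address: int):
    """Return the instruction group containing a microinstruction address."""
    for first, last, name in MICROCODE_GROUPS:
        if first <= address <= last:
            return name
    return None                           # address is outside the listed groups
```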


Figure 27. Branch and Link.


Figure 28. Shift and Rotate Instructions.


Figure 29. Input Instruction.


Figure 30. Output Instruction.

Upon an interrupt, the 16-Bit Computer finishes its current instruction and jumps to microinstruction H\#1FF. The interrupt handler works as follows:
1. Current PSW is stored in DREG and SP \(=\mathrm{SP}-2\) (Inst H\#1FF).
2. The content of PSW is written onto the stack in memory. \(\mathrm{PC}=\) PC-4 to flush out the pipeline (Inst H\#1F0).
3. \(\mathrm{SP}=\mathrm{SP}-2\) (Inst H\#1F1).
4. The content of the adjusted PC is written to the DREG (Inst H\#1F2).
5. The content of the PC is written onto the stack in memory and the vector in the Am2914 is output to the interrupt vector PROM. A vector jump is made following this instruction depending on the interrupt number (Inst H\#1F3).
6. The vector jump directs to 1 of 8 locations labelled \(\mathrm{INT}_{0}-\mathrm{INT}_{7}\). For \(\mathrm{INT}_{1}-\mathrm{INT}_{7}\), the first instruction disables interrupts in the Am2914 and forces the new PC value into the PC. \(\mathrm{INT}_{0}\) requires an extra instruction to clear the Am9519. The interrupt vector in the Am9519 is to be determined by the macro interrupt handler.
7. This next instruction is the same as the START instruction. The previous instruction cannot jump to START directly because the immediate operand uses the jump address field. The macroprogram resumes at the new PC value.
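Steps 1 through 5 of the handler amount to two pushes onto a downward-growing stack, with the PC backed up by 4 in between. The sketch below models only that bookkeeping; it treats the stored PSW as a single 16-bit word, represents memory as a dictionary, and leaves out the vector jump. The function and parameter names are assumptions for illustration.

```python
MASK16 = 0xFFFF

def interrupt_entry(psw: int, pc: int, sp: int, memory: dict):
    """Push PSW, then the adjusted PC; return the new (pc, sp)."""
    sp = (sp - 2) & MASK16      # step 1: SP = SP - 2, PSW placed in DREG
    memory[sp] = psw            # step 2: PSW written onto the stack
    pc = (pc - 4) & MASK16      # step 2: PC = PC - 4 flushes the pipeline
    sp = (sp - 2) & MASK16      # step 3: SP = SP - 2
    memory[sp] = pc             # steps 4-5: adjusted PC written onto the stack
    return pc, sp
```

Starting from the initial stack pointer of H#4000, the PSW lands at H#3FFE and the adjusted PC at H#3FFC.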

The instructions implemented cover only a small portion of all possible instructions. Only 137 of 512 microinstructions are used. The rest of the instruction space could be used to vastly enhance the instruction set with, for example, byte operations, storage-to-storage instructions, etc.


[Microprogram listing for Figure 31 not reproduced. The excerpt covers EXCLUSIVE OR REGISTERS (RR, CC: CSVZ) and the RX type instructions LOAD (58), STORE (50), ADD (5A), SUBTRACT (5B), AND and OR.]



Figure 31. Microprogram for 16-Bit Computer (Cont.)




\section*{MICROCODE TRANSLATION}

It is often convenient for the microprogrammer to assign microword fields such that they occupy positions that differ from those in the actual hardware implementation. This is often the case when the microprogrammer, for convenience, allocates bits according to the functions to be performed and then needs to translate the object code produced by AMDASM® to be consistent with the hardware microprogram memory design.

There is another instance where the ability to shift bit assignment is important to the engineer. As a given product evolves, bits may be added or deleted from the original microword format. When this occurs, a mapping function is desired to minimize hardware changes.
The program in SYSTEM/29® that performs such a mapping function is called AMSCRM. AMSCRM maps the output of AMDASM (the logical bit pattern) into the bit pattern that is consistent with the 16-bit computer hardware. A table of the logical to physical mapping is shown in Table 3.
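The kind of logical-to-physical bit mapping described here can be sketched as a simple permutation. This is not AMSCRM itself: the function name is hypothetical, and any mapping passed to it is a placeholder rather than the contents of Table 3.

```python
def scramble(word: int, mapping: dict) -> int:
    """Move each logical bit of word to its physical position.

    mapping is {logical_bit: physical_bit}; bits not listed are dropped.
    """
    out = 0
    for logical_bit, physical_bit in mapping.items():
        if (word >> logical_bit) & 1:
            out |= 1 << physical_bit
    return out
```

When bits are added to or deleted from the microword format, only the mapping table changes; the microprogram source and the hardware wiring can stay as they are.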

\section*{ENGINEERING MODEL AND MACROCODE}

With the proper tools, designing, microprogramming, prototyping, and checking out a new computer design is not overly difficult. The major tool used for the high-speed 16-bit design described in this application note was System 29®. System 29 is a software-driven hardware prototyping system which allows microprogramming, hardware design/checkout, and macroprogramming (programming in the language of the target machine) to occur simultaneously. At the point where the design is reasonably rigid, and the hardware is mostly fabricated, System 29 allows the engineer to create "instant" microprograms to check out the new computer's internal data paths. Microprogram software support features of System 29 also allow the engineer to single cycle, single instruction step, instruction trace, and trap on pre-specified events coming true. Simultaneously with this initial internal check-out, the microcode for some very simple machine instructions should be written (i.e., load register, add register, or register, etc.). The next step is to check out the main memory paths with load and store instructions. At this point, a reasonable

Table 3.

instruction sub-set should be microprogrammed (a phase 1 instruction set) that will allow a simple monitor to be written in the target machine's language. This monitor should run on the target machine and provide commands for memory display, memory store, and jump to memory location. The phase 1 instruction set and simple monitor now provide the basic foundation for completing the full computer design.
The standard System 29 configuration provides automatically for microcode and hardware development. In order to efficiently develop and implement the target machine's software, a target machine assembler and a mechanism for loading the machine's main memory must be provided. System 29 uses an Am9080A microprocessor, dual floppy disks, and a full function disk operating system to support microprogrammed hardware and firmware development. The Am9080A microprocessor can address 64k bytes of memory. The disk operating system uses only the first 32k bytes, and the remaining 32k is used to memory map (page) functions from the hardware development side. Through this mechanism, the designer has the ability to directly load and manipulate microprograms, monitor hardware functions, etc. There are extra enable lines from the page register which allow the System 29 user to map other functions into the support processor's upper 32k of memory.
The main memory of this 16-bit high-speed computer design was mapped into the support processor's upper 32k via one of the unused page register enable lines. Besides the normal 16-bit interface, a simple 8-bit interface was added to the main memory, thus making it a simple two-port memory. When the 16-bit computer is halted (via a System 29 command), location 0 of the 16-bit main memory would be addressed as location 8000 hex of System 29 support processor memory. Location 1 would be 8001, 2 would be 8002, etc. This effected a mechanical link between the 16-bit prototype design and System 29.
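The address correspondence is a fixed offset into the support processor's upper 32k. As a small sketch (the function name and the range check are assumptions for illustration):

```python
WINDOW_BASE = 0x8000   # start of the support processor's upper 32k
WINDOW_SIZE = 0x8000   # 32k bytes are visible through the window

def support_address(target_location: int) -> int:
    """Support-processor address at which a main memory location appears."""
    if not 0 <= target_location < WINDOW_SIZE:
        raise ValueError("location is outside the 32k mapped window")
    return WINDOW_BASE + target_location
```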

In order to efficiently write a reasonably complex piece of software (such as a simple monitor), an assembler for the target instruction set is needed. Since this 16-bit computer design is not exactly like any other 16-bit computer, ready-to-run software tools are not available. A macro assembler is available as an optional enhancement to the System 29 software base. Even though this macro assembler is for programming in Am9080A assembly language, there is a user-installable patch which will disable all of the Am9080A operation codes (Figure 32). With this patch installed, the user may now write a macro library defining the target machine's instruction set. It is not necessary to code the entire instruction set, as the first level of programming for the new machine (simple monitor, etc.) will be using only the phase 1 instruction set. A complete macro library of the AMD high-speed 16-bit computer phase 1 instruction set is contained in Appendix B.

Now that the tools are in place, it is relatively simple to code and implement a simple monitor for the target machine. Appendix C contains the complete simple monitor listing for the AMD high-speed 16-bit computer. Only the phase 1 instruction set was used, which does not include byte instructions, call and return instructions, stack instructions, any special instructions, etc. This simple monitor understands three commands: Display (D), Store (S), and Jump (J). Typing D followed by an address value will display 256 bytes of main memory beginning at the address given (rounded back to the nearest eight-word boundary). Typing an S followed by an address, followed by data, will store the data consecutively, on a nibble basis, beginning at the given address. Typing J followed by an address will cause the processor to begin execution at the main memory location given by the address. Commands, addresses, and data must be separated by at least one delimiter (space, comma, or period).
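The command syntax can be illustrated with a small parser. This is a sketch of the syntax only, not the monitor in Appendix C: it assumes addresses are typed in hexadecimal, that memory is byte addressed so an eight-word boundary is a 16-byte boundary, and that data tokens are passed through as text. The function name is hypothetical.

```python
import re

def parse_command(line: str):
    """Split a monitor command into (command, address, data tokens)."""
    # Space, comma and period are all accepted as delimiters.
    tokens = [t for t in re.split(r"[ ,.]+", line.strip()) if t]
    if len(tokens) < 2:
        raise ValueError("expected a command followed by an address")
    command = tokens[0].upper()
    if command not in ("D", "S", "J"):
        raise ValueError(f"unknown command {command!r}")
    address = int(tokens[1], 16)
    if command == "D":
        address &= ~0xF                   # round back to an eight-word boundary
    return command, address, tokens[2:]
```

For instance, `parse_command("D 1234")` yields a display starting at 1230 hex under these assumptions.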

The change file shown below can be integrated into MAC to produce a new program, which we will call MAC29. The MAC29 program will not recognize 8080 mnemonics, but will recognize all the MAC pseudo operators and arithmetic functions.


Figure 32. Macro Assembler Disable Opcode Patch.

After writing the monitor and putting it onto floppy disks via the System 29 editor, it must be assembled using the modified macro assembler (described earlier). The result of the assembly is a hex file which is suitable for loading into the 16-bit computer's main memory. This hex file is now loaded into support processor memory beginning at location 8000 hex. As discussed previously, this is mapped to location zero in the 16-bit computer's main memory. Assuming the microcode is loaded and a terminal is connected to the 16-bit computer, the monitor in 16-bit main memory may now be executed. The complete System 29 session, from editing and assembling the monitor to loading and executing it, is given in Appendix D.

\section*{SUMMARY}

As can be seen throughout these application notes, designing a high performance Bipolar microprocessor system is a straightforward task. The Am2900 Family is ideally suited to provide building blocks for the various elements of the computer. These include the Computer Control Unit, the Central Processing Unit, the Program Control Unit, the Interrupt Structure and the various bus controls. Together, these elements allow the designer to build computers using the current state-of-the-art architecture with LSI building blocks.
As technology improves, Advanced Micro Devices has been able to redesign these building blocks to offer increased performance. Thus, the Am2901 has evolved through an Am2901A, then an Am2901B and now an Am2901C is in the planning. In addition, the Am2903 offers additional architectural advantages and soon an Am29103 will provide additional speed and performance features. Similarly, the microprogram sequencer area began with the Am2909 and Am2911; then was followed by the larger Am2910. Soon, the Am2909A and Am2911A will provide higher speed in the microprogram sequencer area and will be followed by an Am2910A.
Thus, the future for Bipolar LSI building blocks includes not only more advanced product designs offering higher levels of integration and new functions for new architectures, but also higher performance versions of the already existing products. Advanced Micro Devices is committed to providing high performance Bipolar LSI circuits utilizing proven technology designed to operate over the full military operating range as well as the commercial operating range. As always, these products continue to meet the performance requirements of MIL-STD-883.

\section*{APPENDIX A}

\section*{Complete Description of Instructions}

LOAD

RX, RSI
\begin{tabular}{|c|l|l|l|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The second operand is loaded into the general register specified by \(\mathrm{R}_{1}\).

STORE


RX, RSI
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand specified by \(\mathrm{R}_{1}\) is stored at the location specified by the second operand.


ADD


RX, RSI
\begin{tabular}{|c|c|c|c|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand is added to the second operand and replaces the first operand.

ADD WITH CARRY


RX
\begin{tabular}{|c|c|c|c|}
\hline OP & \(R_{1}\) & \(x_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand (16 bits) with carry is added to the second operand and replaces the first operand.

SUBTRACT


RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The second operand is subtracted from the first operand and replaces the first operand.

SUBTRACT WITH CARRY
\begin{tabular}{|l|l|l|}
\hline \multicolumn{1}{l}{ OP } & \(\mathrm{R}_{1}\) & \(\mathrm{R}_{2}\) \\
\hline
\end{tabular}
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The second operand (16 bits) with carry is subtracted from the first operand and replaces the first operand.

AND


RX, RSI
\begin{tabular}{|c|c|c|c|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The AND of the first operand and the second operand replaces the first operand.

OR


RX, RSI
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The OR of the first operand and the second operand replaces the first operand.

XOR


RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The logical difference of the first operand and the second operand replaces the first operand.

TEST IMMEDIATE
RX, RSI
\begin{tabular}{|c|l|l|l|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand and the second operand are logically ANDed. The contents of \(R_{1}\) and \(X_{2}\) are unchanged.

\section*{COMPARE}


RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand is algebraically compared with the second operand. The result is indicated by the condition code.

COMPARE LOGICAL
RR, RS, SS


The first operand is compared logically to the second operand. The result is indicated by the condition code.

MULTIPLY

RX
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand \(\left(R_{1}+1\right)\) is multiplied by the second operand and the 32-bit product is contained in the \(\mathrm{R}_{1}\) and \(\mathrm{R}_{1}+1\) registers. \(\mathrm{R}_{1}\) must be an even address. The sign of the product is determined by the rules of algebra.
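The register-pair convention for the signed multiply can be sketched as follows. The text does not say which register of the pair receives which half of the product; placing the high-order half in \(R_{1}\) and the low-order half in \(R_{1}+1\) is an assumption made here, as are the function names.

```python
def to_signed16(value: int) -> int:
    """Interpret a 16-bit pattern as a two's complement number."""
    return value - 0x10000 if value & 0x8000 else value

def multiply_signed(regs: list, r1: int, operand: int) -> None:
    """Multiply regs[r1 + 1] by operand; leave the 32-bit product in the pair."""
    if r1 % 2 != 0:
        raise ValueError("R1 must be an even register")
    product = (to_signed16(regs[r1 + 1]) * to_signed16(operand)) & 0xFFFFFFFF
    regs[r1] = product >> 16          # assumed: high-order half in R1
    regs[r1 + 1] = product & 0xFFFF   # assumed: low-order half in R1 + 1
```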

MULTIPLY UNSIGNED


RX
\begin{tabular}{|c|c|c|c|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The first operand \(\left(R_{1}+1\right)\) is multiplied by the second operand and the 32-bit product is contained in \(R_{1}\) and \(R_{1}+1\). \(R_{1}\) must be even.

LOAD BYTE


RX, RSI
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The 8-bit byte stored in the low order byte of the second operand location is stored in the low order byte of \(R_{1}\). The high order byte of the \(R_{1}\) is set to zero.

INSERT CHARACTER
RR, RS


RX, RSI


The byte at the second operand location is loaded into the low order byte of \(\mathrm{R}_{1}\) without changing the contents of the high order byte of \(\mathrm{R}_{1}\).

STORE CHARACTER
STORE BYTE


The least significant byte of the first operand is stored in the location specified by the second operand. The other byte of the second location is unchanged.

EXCHANGE BYTE
RR, RS
\begin{tabular}{|l|l|l|}
\hline OP & \(R_{1}\) & \(R_{2}\) \\
\hline
\end{tabular}
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The bytes specified by the first and second operands are exchanged. When the operand specifies a register (i.e. \(\mathrm{R}_{1}, \mathrm{R}_{2}\) ) only the low order byte is exchanged.


RX
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The two bytes of the second operand are swapped and loaded into the register specified by the first operand.

COMPARE LOGICAL BYTE


RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline OP & \(R_{1}\) & \(x_{2}\) & \(d\) \\
\hline
\end{tabular}

The low order byte of the first and second operands are compared. The result is indicated in the condition code.

\section*{AND BYTE}


RX, RSI
\begin{tabular}{|c|c|c|c|}
\hline\(O P\) & \(R_{1}\) & \(x_{2}\) & \(d\) \\
\hline
\end{tabular}

The AND of the low order bytes specified by the first and second operands replaces the first operand low order byte. The high order byte of \(R_{1}\) is set to zero.

OR BYTE
RR, RS
\begin{tabular}{|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) \\
\hline
\end{tabular}

RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

The OR of the low order bytes specified by the first and second operands replace the first operand low order byte. The high order byte of \(R_{1}\) is set to zero.

\section*{XOR BYTE}


RX, RSI
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(x_{2}\) & \(d\) \\
\hline
\end{tabular}

The XOR of the low order bytes specified by the first and second operands replace the first operand low order byte. The high order byte of \(\mathrm{R}_{1}\) is set to zero.

LOAD PROGRAM STATUS WORD


A 32-bit new PSW is loaded from the memory location specified by the second operand as the current PSW.

EXCHANGE PROGRAM STATUS
\begin{tabular}{|l|l|l|}
\hline & \multicolumn{1}{r}{} & \multicolumn{1}{r}{} \\
\hline & \(R_{1}\) & \(R_{2}\) \\
\hline
\end{tabular}

PSW (0:15) \(\rightarrow\left(\mathrm{R}_{1}\right)\)
R2 \(\rightarrow\) PSW \((0: 15)\)

STORE PROGRAM STATUS WORD
\begin{tabular}{|c|c|c|}
\hline OP & \(R_{2}\) & dX \\
\hline
\end{tabular}

The 32-bit PSW is stored at the location specified by the second operand.

SUPERVISOR CALL
\begin{tabular}{|l|l|l|l|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

OLD PSW \(\rightarrow\left[\left(X_{2}\right)+d\right]\)
\(\left[\left(X_{2}\right)+d\right]+4 \rightarrow\) NEW PSW

SET, CLR, COMPLEMENT, TEST BIT PSW


The condition flags in the current PSW are set, cleared, complemented, or tested. N defines the bit(s) to be affected or tested.

\section*{CALL}

RX
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

Jump to the memory location specified by the second operand and push PSW (16:31) onto stack.

RETURN


POP STACK STACK \(\rightarrow\) PSW (16:31)

PUSH

\(\left.\begin{array}{l}\text { PSW } \\ \mathrm{R}_{0}-\mathrm{R}_{15}\end{array}\right\} \rightarrow\) STACK

POP


STACK \(\rightarrow\left\{\begin{array}{l}\text { PSW } \\ R_{0}-R_{15}\end{array}\right.\)

P/PUSH


LOAD STACK POINTER
LOAD STACK LIMIT LOWER
LOAD STACK LIMIT UPPER

RX
\begin{tabular}{|c|c|c|}
\hline\(O P\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

STORE STACK POINTER
STORE STACK LIMIT LOWER
STORE STACK LIMIT UPPER
RX
\begin{tabular}{|c|c|c|}
\hline\(O P\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The stack pointer, stack limit lower, or stack limit upper is read from or written into the address defined by the second operand.

TRANSLATE


The addresses specified by \(\mathrm{R}_{1}+1\) and \(\mathrm{R}_{2}\) define two tables. The \(\mathrm{R}_{1}+1\) address is the top location of the table to be translated; the \(R_{2}\) address is the first location of the translation table. The value (one byte) pointed to by the \(R_{1}+1\) address is indexed by (added to) the address value of \(R_{2}\) to find the translation code. This translation code replaces the value pointed to by the \(R_{1}+1\) address. After one byte is translated, the length is decremented, the address in \(\mathrm{R}_{1}+1\) is incremented, and the instruction is repeated until the length equals zero. This instruction is interruptible. If this instruction is interrupted, the PC is left pointing to this instruction so that it can be resumed after the interrupt service is complete.
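The translate loop can be sketched in a few lines. This is a behavioral model only: memory is a `bytearray`, the source address, table address and length are passed as plain arguments rather than held in registers, and interruption is not modeled. The function name is an assumption.

```python
def translate(memory: bytearray, source_addr: int, table_addr: int, length: int) -> int:
    """Replace each source byte by its entry in the translation table.

    Returns the source address after the last byte translated.
    """
    while length > 0:
        code = memory[table_addr + memory[source_addr]]  # index table by the byte value
        memory[source_addr] = code                       # translation replaces the byte
        source_addr += 1                                 # address incremented
        length -= 1                                      # length decremented
    return source_addr
```

Because the address and length are updated after every byte, the state needed to resume after an interrupt is always current, which is what makes restarting the instruction practical.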

TRANSLATE AND TEST


This instruction proceeds like translate except that the bytes of the first operand (defined by \(\mathrm{R}_{1}\)) are not changed in storage. When the byte of the translate table \(\left(\mathrm{R}_{2}\right)\) is zero, the instruction proceeds to the next byte of the first operand. If the byte of the translate table is not zero, the instruction is halted with the address pointed to last in the translate table held in register 1.

MOVE LONG


Moves bytes defined by \(\mathrm{R}_{1}\) to \(\mathrm{R}_{2}\). Both addresses are incremented after each transfer. This instruction is interruptible.


Compares the first operand against the second operand. The length is decremented and the address incremented after each compare. When the length equals zero or the bytes compared are not equal, the instruction is halted.

EXECUTE
\begin{tabular}{|c|c|c|c|}
\hline\(O P\) & \(R_{1}\) & \(x_{2}\) & \(d\) \\
\hline
\end{tabular}

The upper 16 bits of the instruction at the second operand are ORed with \(\mathrm{R}_{1}\) and executed.

DECIMAL ADD

\begin{tabular}{|l|l|l|l|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

Nibbles in operand 1 and operand 2 are added. The result is placed in operand one.
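A nibble-wise decimal addition can be sketched as below. The sketch assumes 16-bit operands holding four packed BCD digits, each a valid digit from 0 to 9, and returns the decimal carry out of the top digit separately; none of these details are stated in the instruction description, and the function name is hypothetical.

```python
def decimal_add(a: int, b: int):
    """Add two 4-digit packed BCD values; return (result, carry_out)."""
    result = 0
    carry = 0
    for shift in range(0, 16, 4):                 # low-order nibble first
        digit = ((a >> shift) & 0xF) + ((b >> shift) & 0xF) + carry
        carry = 1 if digit > 9 else 0             # decimal carry into the next nibble
        if carry:
            digit -= 10
        result |= digit << shift
    return result, carry
```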

DECIMAL SUBTRACT


Nibbles in operand 2 are subtracted from nibbles in operand 1 and the result is placed in operand 1.

DECREMENT INDEXES

\(\mathrm{R}_{1}-1 \rightarrow \mathrm{R}_{1}\)
\(R_{2}-1 \rightarrow R_{2}\)
One is subtracted from \(R_{1}\) and the result placed back into \(R_{1}\). One is subtracted from \(R_{2}\) and the result placed back into \(R_{2}\). \(R_{1}\) and \(R_{2}\) may specify the same register, which will effectively subtract two from that register.

SHIFT RIGHT ARITHMETIC
SHIFT RIGHT DOUBLE ARITHMETIC

RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The contents of \(R_{1}\) for single shifts and \(R_{1}, R_{1}+1\) for double shifts are shifted the number of places specified by the second operand. The sign bit is unchanged. Bits shifted in are set equal to the sign bit. Bits shifted out are shifted through the carry bit.
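The single-register arithmetic right shift can be modeled one bit position at a time. This sketch covers only the 16-bit single shift; the function name is an assumption, and the carry returned is the last bit shifted out.

```python
def shift_right_arithmetic(value: int, count: int):
    """Shift a 16-bit value right by count places, replicating the sign bit.

    Returns (result, carry), where carry is the last bit shifted out.
    """
    carry = 0
    for _ in range(count):
        carry = value & 1                 # bit shifted out goes to the carry
        sign = value & 0x8000             # sign bit is unchanged
        value = (value >> 1) | sign       # bit shifted in equals the sign bit
    return value, carry
```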

ROTATE RIGHT
ROTATE RIGHT DOUBLE
RX, RS
\begin{tabular}{|c|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The contents of \(R_{1}\) for single shifts and \(R_{1}, R_{1}+1\) for double shifts are rotated right the number of places specified by the second operand.
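The rotation can be expressed compactly in Python. The function name and the `width` parameter (16 for a single register, 32 for a register pair) are conventions chosen for this sketch.

```python
def rotate_right(value, count, width=16):
    """Rotate `value` right by `count` places within `width` bits."""
    mask = (1 << width) - 1
    value &= mask
    count %= width                       # a full rotation leaves the value unchanged
    return ((value >> count) | (value << (width - count))) & mask
```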

\section*{SHIFT LEFT ARITHMETIC SHIFT LEFT DOUBLE ARITHMETIC}

RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The contents of \(R_{1}\) for single shifts and \(R_{1}, R_{1}+1\) for double shifts are shifted left the number of places specified by the second operand. The high order bit (sign bit) of the register or register pair is unaffected by the shift. Low order bits are filled with zeros. If a bit unlike the sign bit is shifted out of the position adjacent to the sign bit, the overflow flag is set.
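The overflow rule is easier to see in code. The sketch below keeps the sign bit fixed, shifts only the remaining bits, and flags overflow whenever a bit that differs from the sign leaves the position next to the sign bit. The function name and return convention are assumptions of the example.

```python
def shift_left_arithmetic(value, count, width=16):
    """Shift left with the sign bit held fixed; return (result, overflow)."""
    sign_mask = 1 << (width - 1)
    magnitude_mask = sign_mask - 1
    sign = value & sign_mask
    sign_bit = 1 if sign else 0
    magnitude = value & magnitude_mask
    overflow = False
    for _ in range(count):
        shifted_out = (magnitude >> (width - 2)) & 1   # bit adjacent to the sign
        if shifted_out != sign_bit:
            overflow = True
        magnitude = (magnitude << 1) & magnitude_mask  # zero fills the low end
    return sign | magnitude, overflow
```

Shifting 0xFFFE (minus two) left one place gives 0xFFFC (minus four) with no overflow, while shifting 0x4000 left one place sets the overflow flag.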

ROTATE LEFT
RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(N\) & \(d\) \\
\hline
\end{tabular}

The contents of \(R_{1}\) for single shifts and \(R_{1}, R_{1}+1\) for double shifts are rotated left the number of places specified by the second operand.

SHIFT RIGHT LOGICAL
SHIFT RIGHT DOUBLE LOGICAL
RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The contents of \(R_{1}\) for single shifts and \(R_{1}, R_{1}+1\) for double shifts are shifted right the number of places specified by the second operand. High order bits shifted in are zeros; low order bits shifted out are shifted through the carry bit.

\section*{SHIFT LEFT LOGICAL SHIFT LEFT DOUBLE LOGICAL}

RX, RSI
\begin{tabular}{|l|l|l|l|}
\hline\(O P\) & \(R_{1}\) & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

The contents of \(R_{1}\) for single shifts and \(R_{1}, R_{1}+1\) for double shifts are shifted left the number of positions specified by the second operand. High order bits shifted out are shifted through the carry bit. Zeros are shifted in. \(\mathrm{R}_{1}\) for double shifts must be even.
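A double logical left shift over a register pair might be modeled as below. The sketch assumes that \(R_{1}\) holds the high-order half of the pair, that the carry holds the last bit shifted out, and that an odd \(R_{1}\) is rejected with an exception; the text only states that \(R_{1}\) must be even.

```python
def shift_left_double_logical(regs, r1, count):
    """Shift the 32-bit pair regs[r1], regs[r1 + 1] left; return the carry."""
    if r1 % 2 != 0:
        raise ValueError("R1 must be even for double shifts")
    pair = (regs[r1] << 16) | regs[r1 + 1]
    carry = 0
    for _ in range(count):
        carry = (pair >> 31) & 1                 # high order bit goes to carry
        pair = (pair << 1) & 0xFFFFFFFF          # zero shifted in at the low end
    regs[r1], regs[r1 + 1] = pair >> 16, pair & 0xFFFF
    return carry
```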

\section*{INPUT WORD}


One 16-bit word of data is read into the first operand from the device which is addressed by the contents of the second operand.

INPUT BYTE


RX
\begin{tabular}{|c|l|l|l|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

One byte of data is read into the low order 8 bits of the first operand from the device which is addressed by the contents of the second operand.

OUTPUT WORD

\begin{tabular}{|c|c|c|c|}
\hline OP & \(R_{1}\) & \(x_{2}\) & \(d\) \\
\hline
\end{tabular}

The 16 bits of \(R_{1}\) are sent to the device which is addressed by the contents of the second operand.

\section*{OUTPUT BYTE}


RX


The low order 8 bits of \(\mathrm{R}_{1}\) are sent to the device which is addressed by the contents of the second operand.


Unconditionally branch to the location specified by the second operand. The first operand is not used.

\section*{BRANCH ON CONDITION}

\begin{tabular}{|c|c|c|c|}
\hline OP & CC & \(R_{2}\) & \(d\) \\
\hline
\end{tabular}

Branch to the location specified by the second operand if the condition code specified in the first operand position is equal to the current PSW status bits.

Condition codes are:
\begin{tabular}{ll}
Carry & \(=\mathrm{B}\) \\
No Carry & \(=\mathrm{A}\) \\
Zero & \(=5\) \\
Not Zero & \(=4\) \\
2's Comp \(>\) & \(=0\) \\
2's Comp \(<\) & \(=3\) \\
2's Comp \(>\) & \(=2\) \\
2's Comp \(<\) & \(=1\) \\
Plus (Sign=0) & \(=\mathrm{E}\) \\
Minus (Sign=1) & \(=\mathrm{F}\) \\
1's Comp \(>\) & \(=9\) \\
1's Comp \(<\) & \(=8\) \\
1's Comp \(>\) & \(=\) \\
1's Comp \(<\) & \(=\) \\
Overflow & \(=7\) \\
Not Overflow & \(=\)
\end{tabular}

BRANCH AND LINK


The address of the next sequential instruction is saved in \(R_{1}\), and an unconditional branch to the jump address is taken.

BRANCH ON INDEX
BXH (INDEX HIGH)
\begin{tabular}{|c|c|c|c|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

BXLE (INDEX LOW OR EQUAL), RX
\begin{tabular}{|c|c|c|c|}
\hline OP & \(R_{1}\) & \(X_{2}\) & \(d\) \\
\hline
\end{tabular}

\(R_{1}\) is incremented by the value in \(R_{1}+1\), and logically compared to the index limit held in \(\mathrm{R}_{1}+2\).

\section*{INDEX HIGH}
\(\left(R_{1}\right)+\left(R_{1}+1\right) \rightarrow\left(R_{1}\right)\)
\(\left(R_{1}\right):\left(R_{1}+2\right)\)
IF \(\left(R_{1}\right)>\left(R_{1}+2\right)\) THEN \(d+\left(X_{2}\right) \rightarrow\) PSW (16:31)
IF \(\left(\mathrm{R}_{1}\right) \leqslant\left(\mathrm{R}_{1}+2\right)\) THEN PSW \((16: 31)+2 \rightarrow\) PSW (16:31)
INDEX LOW OR EQUAL
\(\left(R_{1}\right)+\left(R_{1}+1\right) \rightarrow\left(R_{1}\right)\)
\(\left(R_{1}\right):\left(R_{1}+2\right)\)
IF \(\left(\mathrm{R}_{1}\right) \leqslant\left(\mathrm{R}_{1}+2\right)\) THEN \(\mathrm{d}+\left(\mathrm{X}_{2}\right) \rightarrow\) PSW (16:31)
IF \(\left(R_{1}\right)>\left(R_{1}+2\right)\) THEN PSW (16:31) \(+2 \rightarrow\) PSW (16:31)
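The two register-transfer descriptions above can be combined into one Python sketch. Several details are assumptions of the example rather than statements from the text: the register file is a list of 16-bit values, `pc` stands for PSW (16:31), the index register contents are always added to `d`, and `r1` is at most 13 so that \(R_{1}+2\) exists.

```python
def branch_on_index(regs, r1, x2, d, pc, branch_if_high):
    """Return the new PSW (16:31) value for BXH (True) or BXLE (False)."""
    regs[r1] = (regs[r1] + regs[r1 + 1]) & 0xFFFF   # add the increment
    high = regs[r1] > regs[r1 + 2]                  # logical (unsigned) compare
    taken = high if branch_if_high else not high
    if taken:
        return (d + regs[x2]) & 0xFFFF              # branch address
    return (pc + 2) & 0xFFFF                        # fall through
```

With an increment of 2 and a limit of 4, a BXLE loop starting from zero branches twice and then falls through on the third pass.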

\section*{APPENDIX B}




\section*{APPENDIX C}

\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{5}{|l|}{} \\
\hline \multicolumn{2}{|l|}{\[
\begin{aligned}
& 619 \mathrm{~A}+95206026 \\
& 019 \mathrm{E}+473001 \mathrm{~A}
\end{aligned}
\]} & BC & LT?,R6, SETPER & ; SET PERIOD If true \\
\hline \(619 \mathrm{E}+473001 \mathrm{AA}\) & & CI & R2,007Fi & ;BELOW DEL? \\
\hline \multicolumn{5}{|l|}{} \\
\hline \(01 \mathrm{~A} 6+473 \mathrm{E} 0088\) & & BC & LT2,R14,8 & ; Ret if true (Char printable) \\
\hline \multirow[b]{2}{*}{\(61 \mathrm{~A}^{\text {a }}+4120002 \mathrm{E}\)} & \multirow[t]{2}{*}{SETPER:} & LI & R2, '.' & ; Set period as character to pri \\
\hline & & BR & R14 & ; \\
\hline \multicolumn{5}{|l|}{\(01 \mathrm{AE}+84 \mathrm{E} 6\)} \\
\hline & Store: & LI & R4, BUFOP1 & ; A (address field) \\
\hline \(0180+41400410\) & & \multirow[t]{2}{*}{bal} & \multirow[t]{2}{*}{R14,Re, CVADDR} & \multirow[t]{2}{*}{;ascil address to binary (in r6} \\
\hline \multirow[t]{2}{*}{61 B4 +45E00364} & & & & \\
\hline & & LD & R4, R6, Datad & ;Get current i/p data address \\
\hline \(0188+58400406\) & & XR & R13,R13 & ; clear nibble count reg \\
\hline \multirow[t]{2}{*}{\[
\begin{aligned}
& 01 \mathrm{BC}+17 \mathrm{DD} \\
& 01 \mathrm{BE}+45 \mathrm{E} 001 \mathrm{E} 6
\end{aligned}
\]} & \multirow[t]{12}{*}{STLP:} & \({ }_{\text {BA }}\) & R14,Re, UPSTOR & ;upper byte first \\
\hline & & LD & \multirow[b]{2}{*}{R14,R0, eSCANER
R5, 000 DH} & \multirow[t]{2}{*}{;GET RET} \\
\hline 01C2+58E063F6 & & cI & & \\
\hline 01C6+9550006D & & & \multirow[b]{2}{*}{27,R14.8} & \multirow[b]{2}{*}{; Ret if true} \\
\hline \(01 \mathrm{Ca}+475 \mathrm{E0600}\) & & BC & & \\
\hline \multirow[t]{2}{*}{91CE+45E001Fa} & & baL & R14,R6,LOSTOR & ; LOWER byte \\
\hline & & LD & R14,Re, eSCANER & ; GET RET \\
\hline \(01 \mathrm{D} 2+58 \mathrm{E} 663 \mathrm{~F} 8\) & & CI & R5,000D \({ }^{\text {a }}\) & ; END \({ }^{\text {d }}\) \\
\hline 91- \(6+9550000 \mathrm{D}\) & & BC & 22,R14,0 & ; ret if true \\
\hline 61. 1 A +475 80008 & & AI & R4,0002 & ; TO NEXT YORD \\
\hline \(91 \mathrm{DE}+9 \mathrm{~A} 400002\) & & \multirow[b]{2}{*}{\({ }^{B X}\)} & \multirow[b]{2}{*}{Re, STLP} & \\
\hline \multirow[t]{2}{*}{\(01 \mathrm{E} 2+740001 \mathrm{BE}\)} & & & & ; COntinue storing data \\
\hline & \multirow[t]{7}{*}{UPSTOR:} & ST & R14,Re, ©UPSTOR & ; Save rep \\
\hline 01E6+58E603FA & & \multirow[t]{2}{*}{LD
SRL} & R5,R4,0 & ;GEt next data \\
\hline \(01 \mathrm{EA}+58540000\) & & & R5,8 & ; GET Hi byte \\
\hline \(01 \mathrm{EE}+8857\) & & BAL & R14,R0,STDATA & ;30 STORE BYTE \\
\hline \(017 \mathrm{P}+45 \mathrm{Eb0218}\) & & 1 & R14, RG, PUPSTOR & ; R \\
\hline \(0174+58 \mathrm{E} 063 \mathrm{FA}\) & & \multirow[b]{2}{*}{BR} & \multirow[b]{2}{*}{R14} & ; Restore ret \\
\hline \multirow[t]{2}{*}{\(0178+64 \mathrm{Ex}\)} & & & & ; \\
\hline & \multirow[t]{7}{*}{LOSTOR:} & ST & R14,R8.eUPSTOR & ; Save ret \\
\hline 01Fa+56E003Fa & & LD & R5, R4, 8 & ; GEt data \\
\hline 61FE +58540088 & & ni & R5, 00FFH & ; REEP Low byte \\
\hline 0262+945800FF & & \multirow[t]{2}{*}{bal} & \multirow[t]{2}{*}{R14,R6.STDATA} & \multirow[t]{2}{*}{;Go store byte} \\
\hline \(2206+45 \mathrm{E} 08218\) & & & & \\
\hline 926a +58 E ¢03Fa & & BR & R14 & ; RESTORE RET \\
\hline \multirow[t]{2}{*}{028E +64 E 0} & & BR & R14 & ; \\
\hline & \multirow[t]{10}{*}{stdata:} & st & R14,R6,estdata & ; Save ret \\
\hline 2218+50E003FC & & BAL & R14, R0, CKDEL & ; Check for delimiter \\
\hline \(0214+45\) Ever23C & & LD & \multirow[t]{2}{*}{R14,R6, eSt data} & ; GET RET \\
\hline \(9218+58 \mathrm{Ee063FC}\) & & & & \\
\hline \(921 \mathrm{C}+47580000\) & & \multirow[b]{2}{*}{BAL} & 22,814,0 & ; RET IF RC = \\
\hline 9220+45 E0025A & & & R14, & ;ascil byte to mex nibble \\
\hline \multirow[t]{2}{*}{9224+47300232} & & BC & LT\%, R6, SETND & ; .Nz. \\
\hline & & BAL & R14,R0,NIBBLE & ; STORE this nibble \\
\hline 9228+45800296 & & LD & R14,R0,estdata & ; Restore ret \\
\hline 022C+58E003FC & & BR & R14 & ; \\
\hline 9238+84E0 & SETND: & LI & R5, 000D \({ }^{\text {d }}\) & ; fase bof \\
\hline 9232+4150800D & & \(1{ }^{\text {d }}\) & & ; RESTORE RET \\
\hline 0236+58E003FC & & 1. & R14,R6, ©STdATA & ; Restore ret \\
\hline \multirow[t]{2}{*}{\(9234+8480\)} & & BR & R14 & ; \\
\hline & CKDEL: & CI & R5, \({ }^{\text {, }}\) & ; spacer \\
\hline \({ }^{2} 23 C+95500828\) & & BC & 27,R14,0 & ; Ret if true \\
\hline 0240+47580008 & & CI & R5, \({ }^{\prime}\) & ; PERIOD \\
\hline 6244+9550002E & & BC & 27, R14,8 & ; RET if true \\
\hline \(0248+47580008\) & & & & \\
\hline 024C+9550002C & & C1 & R5, , & ; Comma? \\
\hline 8250+47580008 & & BC & 22,R14,8 & ; Ret if true \\
\hline 8254+9550006D & & CI & R5, 600DH & ; Carriage ret? \\
\hline 0258+64E0 & & BR & R14 & ; let caller decide \\
\hline 0258+64E6 & ; & & & \\
\hline \multirow[t]{2}{*}{025A +945006FF} & ASCHEX: & NI & \({ }^{\text {R5, }}\) 807FH & ; LOV bite only \\
\hline & & CI & R5, '0' & ; lower than '0' ? \\
\hline \(025 \mathrm{E}+95500838\) & & BC & LT?,R14,0 & ; ret if true \\
\hline 0262+473E0008 & & CI & R5, ':' & ; 0-9 ? \\
\hline 9266+9550003A & & BC & LT?,Rg, vNuM & ; numerical if true \\
\hline 0261+47308288 & & CI & R5, 'A' & ;Lower than 'A' ? \\
\hline 626E+95500041 & & & & \\
\hline 8272+473E0000 & & BC & LT7, R14,8 & ; Ret if true \\
\hline \multirow[t]{2}{*}{0276+95560847} & & cI & R5, 6847 \({ }^{\text {f }}\) & ; hex alphas \\
\hline & & BC & LT?, R0, VALPE & ; hex alpha if true \\
\hline \(027 \mathrm{~A}+47308284\) & & CI & R5,07FFF\% & ;SET .LT. CC \\
\hline \multirow[t]{2}{*}{\[
\begin{aligned}
& 027 \mathrm{E}+9550 \mathrm{FFFF} \\
& 0282+84 \mathrm{E} 6
\end{aligned}
\]} & & BR & R14 & ; \\
\hline & VALPE: & SI & R5, 6087 \({ }^{\text {\% }}\) & ; ASCII ADJUST \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|}
\hline 2284+9B500607 & VNUM: & NI & R5, ص00 FH & ;LOW NIBBLE ONLY \\
\hline \(0288+94500007\) & & CLR & R5, R5 & ; RC \(=2 \mathrm{ERO}\) \\
\hline \(028 \mathrm{C}+1555\) & & & & \\
\hline 028E+64E0 & & BR & R14 & ; \\
\hline & 'nibble: & LD & R7,R6,8 & ;GET OLD DATA \\
\hline 0296+58760000 & & ORR & R13,R13 & ; R13 \(=2 \mathrm{ERO}\) ? \\
\hline 0294+16DD & & BC & N2?,R6,NXNIB1 & ; TESt for one if not true \\
\hline \(0296+474002 \mathrm{AC}\) & & I & R13,0001 & ;bump nibble counter \\
\hline 629A+9AD86001 & & & & \\
\hline 02 & & SIL & R5, 12 & ; Position this nibble \\
\hline 29Etase & & NI & R7, ©FFFP & ; prepare old data for new mibil \\
\hline Ab+94700FFF & & ORR & R7,R5 & ; insert new nibble \\
\hline 92A4+1675 & & ST & R7, R6,0 & ; data back to memory \\
\hline 92A6+50760000 & & BR & R14 & ; \\
\hline ¢2as+64ED & & & & \\
\hline & Nxnib1: & CI & R13,0001 & ; NEXT nibble? \\
\hline 02AC+95D00001 & & вC & NZ2,R0,NXNIB2 & ; TO NEXT IF NOT THIS \\
\hline 02B \(0+474082 \mathrm{C6}\) & & AI & R13.0001 & ; bump nibbie counter \\
\hline 62B4+5AD90061 & & & & \\
\hline \(92 \mathrm{B8}+8957\) & & SLL & R5,8 & ; Position this nibele \\
\hline 62 Ba+9476F9FF & & NI & R7,0Fbrfi & ; Prepare old data for new nibbl \\
\hline \(92 \mathrm{BE}+1675\) & & ORR & R7, R5 & ; insert new nibble \\
\hline & & ST & R7,R6,0 & ; data back to memory \\
\hline 02cb+50760008 & & BR & 14 & ; \\
\hline \(02 \mathrm{C4}+84 \mathrm{~EB}\) & & & & \\
\hline 92C6+95D90002 & NXNIB2: & CI & R13,0002 & ; NEXT NIbble? \\
\hline & & BC & n2?,Re,NXnib3 & ; to next if not this \\
\hline 62CA 474602 E 0 & & AI & R13,0061 & ; bump nibble count \\
\hline 02CE+9AD96ø日1 & & st & & ; POSITION THIS NIBBLE \\
\hline 92d2+8953 & & SLi & R5, & ;position this nibele \\
\hline 82D4+9476FF6F & & NI & R7, øFFORE & ; prepare old data for ney nibbl \\
\hline & & ORR & R7,R5 & ; insert new nibble \\
\hline 02D8+1675 & & ST & R7,R6,0 & ; data back to memory \\
\hline 02Da +50760000 & & Br & & \\
\hline 02dE +84 E ¢ & & & & \\
\hline & NXNIB3: & xR & R13,R13 & ; Last nibble (LSN) \\
\hline \(62 \mathrm{~Eb}+17 \mathrm{DD}\) & & NI & R7, ©FFFe日 & ; prepare old data for new nibbl \\
\hline 92E2+9478PFF6 & & ORR & R7, R5 & ; insert nev nibble \\
\hline 02E6+1675 & & & & ; DATA BACK TO MEMORY \\
\hline 82E8+58760000 & & ST & R7,R6, & jdata back to memory \\
\hline -2EC+9A600062 & & AI & R6,0002 & ; bump mem pointer \\
\hline \[
02 \mathrm{~F} 0+84 \mathrm{E} 0
\] & & BR & R14 & ; \\
\hline & jump : & LI & R4, BUFOP1 & ; A (ADDRESS) \\
\hline 0222+41400410 & & bal & R14,R0, CVADDR & ; \(\triangle\) SCII ADDRESS to binary addres \\
\hline 62F6+45E09364 & & LR & R15,R6 & ; ADDRESS TO R15 \\
\hline \(0274+1896\) & & BAL &  & ; JUMP... \\
\hline -2FC+45EF0600 & & BAL & R4,R15,0 & ; Jum. \\
\hline 0366+74006008 & & BX & R6, bEGIN & ; back to monitor if callee retu \\
\hline & cVaddr: & ST & R14,R0, ©CVADDR & ; SAve ret \\
\hline -304+50200402 & & XR & R6,R6 & ; clear r6 \\
\hline \(0308+1766\) & & LD & R5,R4,0 & ;GET TWO ADDRESS BYTES \\
\hline 9304+58540060 & & & & \\
\hline 0301+8857 & & SRL & R5,8 & ; UPPER PYTE FIRST \\
\hline 0310+45E00254 & & BAL & R14, R0, ASCHEX & ;ASCII byte to hex nibble \\
\hline & & вC & Nz?, Re, CVhout & ; stop if not hex data \\
\hline 9314+47406358 & & ORR & R6,R5 & ; FIRST ADDRESS NIBBLE TO R6 \\
\hline 0318+1665 & & ID & R5, R4, 8 & ;GET address bytes again \\
\hline \(031 \mathrm{~A}+58540000\) & & & & \\
\hline 031E+45E0025A & & bal & R14,R6,ASCHEX & ;ascil byte to hex nibble \\
\hline & & BC & Nz2,Re, CVBout & ; Stop if not mex data \\
\hline 6322+4740835 & & SLL & R6,4 & ; POSITION ADDRESS FOR NEXT NIBb \\
\hline 0326+8963 & & ORR & R6, R5 & ; INSERT NEXT ADDRESS Nibble \\
\hline 0328+1665 & & AI & R4,0002 & ; bump memory ptr to next word \\
\hline 6324+94400602 & & & & \\
\hline 632E+58540000 & & LD & R5,R4,0 & ; next ascil address data \\
\hline \(0332+8857\) & & SRL & R5,8 & ; High bype first \\
\hline +45 EO 02 & & baL & R14,R0, ASCHEX - & ;ascil byte to hex nibble \\
\hline 0334+45E06254 & & BC & N27,Re, CVHot1 & ; stop if not hex data \\
\hline \(8338+47406354\) & & SLL & R6,4 & ; POSition addres for next nibe \\
\hline 933 \({ }^{\text {c }} 8963\) & & ORR & , & ; INSERT MEXT NIBBLE \\
\hline \(033 \mathrm{E}+1665\) & & -na & & \\
\hline \(8348+58540008\) & & LD & R5,R4,8 & ; Get address data again \\
\hline 0344+45E0825 & & BAL & R14,Re, ASCREX & ; ascil byte to bex nibble \\
\hline \(9348+47489358\) & & BC & n2?, Re, cviout & ; stop if not hex data \\
\hline & & SLl & R6,4 & ; position address for next nibb \\
\hline 634 \(4+8963\) & & ORR & R6, R5 & ;insert next nibble \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|}
\hline \multirow{2}{*}{0350+94400002} & CVHOUT: & AI & R4,0002 & ; TO NEXT MEMORY WORD \\
\hline & CVHOT1: & ST & R4, R6, DATAD & ; Save as data address \\
\hline \multirow[t]{2}{*}{0354+50408406} & & & & \\
\hline & & LD & R14,Re, OCVADDR & ; RESTORE RET \\
\hline \(0358+58 \mathrm{E} 00402\) & & BR & R14 & ; \\
\hline \multirow[t]{2}{*}{e \(35 \mathrm{C}+64 \mathrm{E} 0\)} & & & & \\
\hline & \[
\begin{aligned}
& \text { BINOUT: } \\
& \text { Bin }
\end{aligned}
\] & ST & R14, Re, ©BINOUT & ; SAVE RET \\
\hline 035E +50 E 003 FE & & & & \\
\hline \multirow[t]{2}{*}{\(0362+1826\)} & & LR & R2, R6 & ;0/P BYTE TO R2 \\
\hline & & SRL & R2,4 & ; UPPEK NIBBLE FIRST \\
\hline 9364+8823 & & NI & R2,0007H & ; EEEP Only good data \\
\hline \multirow[t]{2}{*}{9366+94200007} & & & & \\
\hline & & BAL & R14,R0, HEXEX & ; binary nibble to ASCII byte \\
\hline 936A+45E00386 & & bal & R14, Re, CRTOUT & ;Nibble (byte) OUT to crt \\
\hline \multirow[t]{2}{*}{036E+45E003D8} & & & & \\
\hline & & LR & R2,R6 & ; O/P Data TO R2 \\
\hline \(9372+1826\) & & NI & R2,000FR & ; \&EEP Only Low nibble \\
\hline 0374+94200007 & & BAL & R14,R0, HEXEX & ; binary nibble to ascil byte \\
\hline \multirow[t]{2}{*}{\(0378+45 \mathrm{E} 0038 \mathrm{E}\)} & & & & \\
\hline & & BAL & R14, Re, CRTOUT & ; AND OUT TO CR \\
\hline -37C+45E003D8 & & LD & R14,Ro, QBINOUT & ; Restore ret \\
\hline \(0380+58 \mathrm{E} 063 \mathrm{FE}\) & & & R14 & \\
\hline \multirow[t]{2}{*}{0384+64E6} & & er & R14 & ; \\
\hline & ; & CI & R2,000A & ; A-F ? \\
\hline \(0386+95200006\) & & & & \\
\hline \multirow[t]{2}{*}{\(038 \mathrm{~A}+47 \mathrm{P00392}\)} & & BC & MI P,R0,CON & ; br if not true \\
\hline & & AI & R2,0007 H & ; ADJUST FOR A-F \\
\hline \(038 \mathrm{E}+9 \mathrm{~A} 200007\) & COR: & AI & R2,0030日 & ; make ascil \\
\hline \multirow[t]{2}{*}{0392+9A206030} & & & & \\
\hline & & BR & R14 & ; \\
\hline 0396+04E6 & ; & ST & R14,R0, @GETCER & ; SAve ret \\
\hline \multirow[t]{2}{*}{\(8398+50208400\)} & & & & \\
\hline & RDCER: & IN & R1,R0, STATUS & ; STRIP PARITY \\
\hline 639C+A010FFFB & & NI & R1,0002 & ;1/P READY? \\
\hline \(83 \mathrm{AC}+94100002\) & & BC & Z?, R0, RDCHR & ; loop until character ready \\
\hline 23A4+4750839C & & & & \\
\hline \multirow[t]{2}{*}{\(0348+\triangle 010\) FFFA} & & IN & R1,Re, DATA & ; Read data \\
\hline & & NI & R1,007F\% & ; KEEP ONLY data byte \\
\hline 03AC+9410067\% & & LR & R2,R1 & ; data to rz \\
\hline \(83 \mathrm{~B} 0+1821\) & & & & \\
\hline 93B2+45E603D8 & & BAL & R14,Re,CRTOUT & ; ECHO I/P \\
\hline \multirow[t]{2}{*}{\(03 \mathrm{BC}+1812\)} & & LR & R1,R2 & ; data back to ri \\
\hline & & LD & R14, Re, @GETCHR & ; GET RET \\
\hline \(63 \mathrm{B8}+58 \mathrm{E} 00400\) & & LI & R15,-1 & ; SET R15 .N2. \\
\hline \(03 \mathrm{BC}+41 \mathrm{FeFFFF}\) & & LI & R2,000AH & ;lf Code in case of cr \\
\hline
\end{tabular}
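The conversion routines in the listing above (ASCHEX, the nibble packing in NIBBLE, and the routine called as HEXEX) can be paraphrased in Python. This is a reading of a partly illegible listing, so it should be taken as a sketch: the function names are hypothetical, returning `None` stands in for the condition-code return that signals a non-hex character, and the nibble order (most significant first) is inferred from the shift counts of 12, 8, 4 and 0.

```python
def ascii_to_nibble(char_code):
    """ASCII hex digit ('0'-'9', 'A'-'F') to a 4-bit value, else None."""
    c = char_code & 0x7F                 # keep the low 7 bits only
    if c < ord("0"):
        return None
    if c < ord(":"):                     # '0'..'9'
        return c & 0x0F
    if c < ord("A"):
        return None
    if c < 0x47:                         # 'A'..'F'
        return (c - 0x07) & 0x0F         # ASCII adjust, then low nibble
    return None


def nibble_to_ascii(nibble):
    """4-bit value to its ASCII hex digit."""
    value = nibble & 0x0F
    if value >= 0x0A:
        value += 0x07                    # adjust for A-F
    return value + 0x30                  # make ASCII


NIBBLE_SHIFTS = (12, 8, 4, 0)            # position for nibble count 0..3


def insert_nibble(old_word, nibble, count):
    """Replace nibble number `count` of a 16-bit word, keeping the rest."""
    shift = NIBBLE_SHIFTS[count]
    keep_mask = 0xFFFF ^ (0xF << shift)  # 0x0FFF, 0xF0FF, 0xFF0F, 0xFFF0
    return (old_word & keep_mask) | ((nibble & 0xF) << shift)
```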


\section*{APPENDIX D}

The System 29 operating system manages two floppy disk drives, A and B. The system will prompt with A> or B> depending upon which disk the operator selects as the default. Generally, most system programs (editors, debuggers, compilers, etc.) are on the A disk and most user generated programs (source programs, user libraries, special assemblers, etc.) are on the B disk. In the following session, lower case letters are what the user typed in, upper case letters are what System 29 responded, and comments (added as a tutorial) are in curly brackets.
A>ed b:amd16bit.asm \{call the editor to edit AMD16BIT.ASM from the B disk\}
\(*\) \{any program additions, changes, and/or deletions go here\}
\(*\)e \{exit the editor and save the new AMD16BIT.ASM on the B disk\}
A>b: \{switch to the B disk as default\}
B>mac29 amd16bit \$ab hb pb sb \{use the modified macro assembler (MAC29) to assemble AMD16BIT.ASM and put the HEX, PRINT and SYMBOL files back on to the B disk\}

ASM29 VER. 10

0490
03BH USE FACTOR
END OF ASSEMBLY
B>a: \{switch back to the A disk\}
A>ddt29 h e \{run DDT29, Halt the 16-bit computer's clock and Exit DDT29\}
A>set pa 3d \{set the page register bit to enable the 16-bit computer main memory as 9080 upper 32k\}
A>ddt \{load 9080 DDT\}
DDT VERS 1.4
\#ib:amd16bit.hex
\#r8000
NEXT PC END
840E 0100 577F
\#\(\uparrow\)C \{exit DDT via control-C\}
A>lbpm m29 wcs cl ul dc 1 \{load the 16-bit computer's microcode (phase-1 instruction set)\}
LOADING: M29.OBJ
TITLE: MICROPROGRAM FOR 16-BIT COMPUTER
VERIFYING: M29.OBJ
TITLE: MICROPROGRAM FOR 16-BIT COMPUTER VERIFY COMPLETE
A>ddt29 ir 0 j r \{run DDT29, set the instruction address register to zero (IR 0), jam the address on to the microprogram address bus (J), and run the 16-bit computer's clock (R)\}

At this point, the AMD 16-bit high speed computer is running the phase 1 instruction set in microcode and the simple monitor in target machine language in 16-bit main memory. A CRT terminal set to 9600 baud and connected to the console USART can now exercise the simple monitor.

\section*{APPENDIX E}

\section*{Memory Board}

The 16-Bit Computer Main Memory board was organized with an 8k by 16-bit RAM section and a 2k by 16-bit ROM section. The RAM section occupies addresses 0 through 8k, while the ROMs are assigned addresses 8k through 10k. The memory word consists of two bytes. The least significant address line specifies whether the high or low byte is selected, but it is not used in the word mode. The address value from the computer is captured in a register at the beginning of the cycle; however, the most significant address lines are routed straight from the bus to the clock decode logic to make an early decision as to whether the memory board has been selected.
In the word mode, the read and write transfers are straightforward. For the byte read mode, data is output on bus bits \(\mathrm{BD}_{0-7}\) while \(\mathrm{BD}_{8-15}\) are forced to zero. During byte write mode, bus bits \(\mathrm{BD}_{0-7}\) are duplicated internally on lines \(\mathrm{D}_{0-7}\) and lines \(\mathrm{D}_{8-15}\). The signals WRHIGH or WRLOW select which byte in the RAM memory is affected.
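The byte-lane behavior described above can be modeled in a few lines of Python. The sketch makes assumptions that the text does not state: the RAM is a list of 16-bit words, the address is a byte address whose least significant bit selects the byte, and a zero in that bit selects the high byte. The function names are hypothetical.

```python
def memory_read(ram, address, byte_mode):
    """Return the 16-bit value placed on the data bus for a read."""
    word = ram[address >> 1]                 # LSB is not part of the word address
    if not byte_mode:
        return word
    high_selected = (address & 1) == 0       # assumed polarity of the LSB
    byte = (word >> 8) if high_selected else (word & 0xFF)
    return byte & 0x00FF                     # BD8-15 forced to zero


def memory_write(ram, address, bus_data, byte_mode):
    """Write a word, or one byte taken from BD0-7, into the RAM."""
    index = address >> 1
    if not byte_mode:
        ram[index] = bus_data & 0xFFFF
        return
    low = bus_data & 0xFF
    duplicated = (low << 8) | low            # BD0-7 copied to D0-7 and D8-15
    wr_high = (address & 1) == 0             # WRHIGH versus WRLOW
    keep_mask = 0x00FF if wr_high else 0xFF00
    ram[index] = (ram[index] & keep_mask) | (duplicated & ~keep_mask & 0xFFFF)
```

Duplicating the byte on both halves of the internal data lines means that only the write strobe has to distinguish the two cases; no data multiplexing is needed on the write path.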
The control logic generates the bus control line sequencing required by the 16-Bit Computer. The memory read and write timing is shown in Figures E1 and E2. The bus controller function is simulated for the purposes of the prototype. Bus Request is clocked into a flip-flop and Bus Acknowledge is returned to the computer. The Memory Request signal from the computer initiates a memory cycle. Fifty nanoseconds later the memory board responds with Address Accept. The computer then follows this with Data Request. The memory board responds with Data Sync, and 50 nanoseconds later the data read out of the memory is clocked into the output registers and output on the data bus. Looking at the memory read timing diagram, it is seen that a read cycle is initiated with Memory Request but the data is not sent back to the computer until the beginning of the next microcycle.
The write cycle is extended one oscillator cycle. This is necessary with the Am9124 RAMs because the data are not sent to the memory board until Data Request goes active (see Figure E2), which is 100 nanoseconds into the write cycle. With the clocked, handshaked memory protocol of the 16-Bit Computer, this is easily done by delaying Data Sync one oscillator cycle. Since a computer normally performs many more reads than writes, this impacts throughput only slightly.
Additional logic was appended to allow the memory to be accessed by the System 29 microprogramming development system. The Map Page (MAPP) of System 29 was used to specify the memory. The logic interfaces the control signals required by System 29 and the 16-Bit Computer Memory board. With this logic, the System 29 user can readily read or write into the memory.



Figure E2. Memory Write Timing.


Block Diagram 16-Bit Computer.




16-Bit Computer PCU Memory Address Register.



16-Bit Computer Data Path.


M FIELD GENERATES Am2904 TEST IO-3 SIGNALS
TYPICALLY
R1 FIELD \(\rightarrow\) Am2903 - B INPUTS
R2 FIELD \(\rightarrow\) Am2903 - A INPUTS



16-Bit Computer Microprogram Memory.


APPENDIX F


16-Bit Computer Memory and Clock Control.






16-Bit Computer Memory Board.



16-Bit Machine Memory Board.




THE COMPONENTS ON THIS PAGE ARE USED TO INTERFACE TO THE WRITEABLE CONTROL
STORE OF THE PROTOTYPING SYSTEM (S/29) AND ARE NOT PART OF THE FINAL COMPUTER
DESIGN. DELETE THESE COMPONENTS FROM THE COMPONENT COUNT.


AC performance, worst case, 106
AC propagation delay paths, longest, 33
Access time of memory, read, 5
Accumulator configuration, ALU, 96-99
Accumulators, 3
Add, BCD, 157, 159
microcode, 166-167
ADD DIRECT microword, 123-125
ADD IMMEDIATE microword, 123, 124
ADD instruction:
description of, 349
steps for, 98
ADD microword, 123, 124
ADD RR1 microword, 123, 125
ADD WITH CARRY instruction description, 349
Adder:
carry lookahead, four-bit, 94-95
full (see Full adder)
Address, effective, 7
Address-based microinstruction cycle, 35
Address compare mode:
Am2940, 242
Am2942, 251
Address counter:
Am2940, 241
Am2942, 250
Address-data-based microinstruction cycle, 35, 37
Address-instruction-based microinstruction cycle, 35, 36
Address-instruction-data-based microinstruction cycle, 35-37
Address line control, 240
Address maintenance, 240
Address multiplexer, 15
Address output buffers, Am2940, 242
Address register:
Am2940, 242
Am2942, 250

Addressing, 7
direct, 7, 265
immediate, 7, 265
indexed, 264, 281
relative, 4-5
stack, 265
Addressing modes, HEX-29, 264-265
ALU (see Arithmetic \& Logic Unit)
Am25LS168 decade counter, 47
Am25LS2521, 306
Am25LS377 eight-bit registers, 39
Am26S02, 306
Am27S29 512 \(\times 8\) fuse-link PROM, 288-290
Am2900 family, 7, 61
Am2900 Learning and Evaluation Kit, 260
Am2900 system, Am2914 in, 215-220, 229-231
Am2901A arithmetic logic unit/function generator, 93, 98-101
Am2903 compared with, 106, 113
architecture, 99-101
arithmetic operation speed computations, 108
logic operation with shift speed computations, 109
logic operation speed computations, 108
magnitude only arithmetic operation with shift down speed computations, 110
microinstruction control, 101
as program control unit, 203, 204
propagation delays, combinational, 126
set-up and hold times, 126
switching characteristics of, 125, 126
timing analysis summary, 144
two's complement arithmetic operation with shift down speed computations, 109
Am2901B ALU/register sets, 290-298
input bus, 298, 300, 301
output bus, 298, 299

Am2902 carry lookahead generator, 214-215
logic diagram, 216
Am2902A carry lookahead generator, 93-96, 201
Am2903 four-bit expandable bipolar microprocessor slice, 101
ALU destination control, 180
Am2901A compared with, 106, 113
architecture, 101-104
ALU, 102
ALU destination control summary, 104
ALU functions, 103, 180
ALU shifter, 102-103
instruction decoder, 104
output buffers, 104
Q Register, 103-104
special functions, 103, 180
two-port RAM, 102
arithmetic operation with 16-bit speed computations, 111
block diagram, 102
data bus cascading, 106
example, 220, 221
general description, 101
logic operation with shift speed computations, 111
logic operation speed computations, 110
magnitude only arithmetic operation with shift down speed computations, 112
microprogram control bits, 115, 122
microroutines with, 146-159
microword ADD, 123, 124
microword ADD DIRECT, 123-125
microword ADD IMMEDIATE, 123, 124
microword ADD RR1, 123, 125
microword FETCH, 122-124
microword INIT, 122, 123
mnemonics, 180
programming, 104
propagation delays, combinational, 127
RAM address cascading, 106
registers, expanding the number of, 105-106
sample microroutines, 122-125
set-up and hold times, 127
16-bit design, 113-125
switching characteristics of, 125, 127
timing analysis summary, 144
two's complement arithmetic operation with shift down and 16-bit speed computations, 112
Am2904 status and shift control unit, 131
AMDASM Phase 1 and Phase 2 listings of microprograms, 168-179
arithmetic operation and 16-bit speed computations, 141
arithmetic operation speed computations, 138
arithmetic operation two's complement with shift down and 16-bit speed computations, 142
assumed set-up time, 135

Am2904 status and shift control unit (Cont.):
BCD add, 157, 159, 166-167
BCD hardware additions, 154-155
BCD to binary conversion, 155-156, 164-165
Binary to BCD conversion, 156-158, 164-165
bit operations, 132
block diagram, 131
carry-in control multiplexer instruction codes, 134
carry-in multiplexer, 132
condition code multiplexer, 132
condition code output instruction codes, 134
CPU hardware diagram, 182-187
criteria for comparing two numbers following "A minus B" operations, 134
double-length normalize command, 146-149, 162-163
load operations, 132
logic operation with shift speed computations, 139, 141
logic operation speed computations, 138, 140
machine register condition code output, 181
machine status register, 131
machine status register instruction codes, 132, 181
magnitude only arithmetic operation with shift down speed computations, 140, 142
microprogram structure, 144-145
microregister condition code output, 181
microstatus register, 131
microstatus register instruction codes, 132, 181
mnemonics, 181
non-restoring binary roots, 153-154, 162-163
normalization microroutine, 146-147
pipelined microprogram bits, 144-146
preliminary switching characteristics, 135
register operations, 132
sample microroutines, 146-159
shift linkage multiplexer instruction codes, 133
shift linkage multiplexers, 131-132
16-bit, 135, 144
standard device Schottky speeds, 135
status registers, 131
timing analysis, 135
two's complement arithmetic operation with shift down speed computations, 139
two's complement division, 150-153, 160-161
two's complement multiplication, 150, 160-161
unsigned multiply, 147, 149, 160-161
Y output instruction codes, 135
Am2909 microprogram control sequencer, 33, 38
computer control unit using, 42-45
Am2910 microprogram sequencer, 9, 17, 229, 231, 232, 330
architecture, 17-18
block diagram, 231
calculations on, 24-27
complete CCU using, 43, 48-49

Am2910 microprogram sequencer (Cont.):
computer control unit architecture using, 22-38
initializing, 38
instruction set, 19-22
table, 18
Am2911 microprogram sequencer, 15-17, 220, 222
in computer control unit, 39-42
propagation delay calculations for, 24, 28-31
Am2914 Vectored Priority Interrupt controller, 211-235
in Am2900 system, 215-220, 229-231
block diagram, 212-214
cascading, 214-215
logic symbol, 212
microinstruction set, 212-213
programming, 212-213
registers, 213-214
time delay using, 220, 224-229
Am2918 four-bit register, 10, 42, 47
Am2922 eight-input multiplexer, 24-32, 47
Am2930 program control unit, 201-204
block diagram, 201
parallel look-ahead expansion scheme for, 202
ripple expansion scheme for, 202
Am2940 DMA address generator, 241-249
architecture, 241-247
control modes, 242-243
example design, 247-249
general description, 241
instructions, 243
timing, 243-247
Am2942 Programmable Timer/Counter, DMA Address Generator, 249-256
architecture, 249-251
block diagram, 250
control modes, 25
example designs, 252, 254-256
function table, 253
general description, 249
instructions, 251-253
Am29705 16-word by 4-bit two-port RAM, 105-106
block diagram, 105
Am29761 256-word by 4-bit PROM, 39
Am29803A 16-way branch control unit, 43
computer control unit using, 42-45
function table, 46
Am29811A next address control unit, 15-17
in computer control unit, 39-42
instruction set, 16
instruction set difference from Am2910 instruction set, 22
propagation delay calculations for, 24, 28-31
Am74S138, 306
Am74S158 Two-Input Multiplexer, 42
Am74S175, 42
Am74S181 four-bit arithmetic logic unit/function generator, 96, 97

Am74S240, 306
Am74S373, 306
Am74S374 registers, 290, 291, 301, 305
Am9080A type microprocessor, 10, 239
CRT controller and, 62
AMD 16-Bit Computer design (see Super Sixteen)
AMD System/29 (see System/29)
AMDASM, 10
definition file, 330, 336-337
and assembly file, 79-87
Phase 1 and Phase 2 listings of Am2904 microprograms, 168-179
AND BYTE instruction description, 351
AND instruction description, 349
Applications, microprogram state machine, 33-34
Architecture:
Am2901A, 99-101
Am2903 (see Am2903, architecture)
Am2940, 241-247
Am2942, 249-251
computer, 2-10
microcomputer, block diagram of, 10
microprogrammed, 10
possible state machines, 34-37
figure, 35-37
subroutine stack, 200
Super Sixteen CPU, 321-330
logic diagrams, 365-384
Arithmetic & Logic Unit (ALU), 3-6, 93
Am2903, 102
destination control, Am2903, 180
destination control summary, Am2903, 104
functions, Am2903, 180
multi-register, 98
shifter, 102-103
Super Sixteen, 321-323, 366-367
Arithmetic & Logic Unit/accumulator configuration, 96-99
Arithmetic & Logic Unit/register sets, Am2901B, 290-298
Arithmetic & Logic Unit/three register machine, operation of, 98-99
Arithmetic operation with 16-bit speed computations:
Am2903, 111
Am2904, 141
Arithmetic operation speed computations:
Am2901A, 108
Am2904, 138
Arithmetic operation two's complement with shift down and 16-bit speed computations, Am2904, 142
Arithmetic operations, 3
HEX-29, 266-267
Arithmetic Processor Unit, 6
Asynchronous events, 207

BCD add, 157, 159
microcode, 166-167
BCD conversion, binary to, 156-158
microcode, 164-165
BCD hardware additions, Am2904, 154-155
BCD to binary conversion, 155-156
microcode, 164-165
Binary conversion, BCD to, 155-156
microcode, 164-165
Binary roots, non-restoring, 153-154
microcode, 162-163
Binary to BCD conversion, Am2904, 156-158
microcode, 164-165
Bipolar microprogram controller, 17
Bit operations, Am2904, 132
Bit slice timing, 106-113
Block diagram:
Am2903, 102
Am2904, 131
Am2910, 231
Am2914, 212-214
Am2930, 201
Am2942, 250
Am29705, 105
DMA peripheral controller, 248
HEX-29, 278, 280, 282-285
microcomputer architecture, 10
Branch and link flow chart, 339, 341
BRANCH AND LINK instruction description, 355
Branch and stack instruction, 200
Branch code, HEX-29 microprogram sequencer, 288
BRANCH instruction, 14
BRANCH instruction description, 355
BRANCH ON CONDITION instruction description, 355
BRANCH ON INDEX instruction description, 355
BRANCH operation, 4
Branches, conditional, 261-262
Bulk Memory, 6
BYTE SWAP instruction description, 351
Bytes, 8-bit, 191

Call executive instruction, 275
CALL instruction description, 352
Call interrupt service routine microprogram, 220, 223
Carry flag, 263
Carry generate, 94
Carry-in, 94
Carry-in control multiplexer instruction codes, 134
Carry-in multiplexer, Am2904, 132
Carry lookahead adder, four-bit, 94-95
Carry lookahead generator, Am2902A, 93-96, 201
Carry method usages, ripple, 95
Carry output flag, 99

Carry propagate, 94
Cascading the Am2914, 214-215
CCU (see Computer Control Unit)
Central processing unit architecture, Super Sixteen, 321-330
logic diagrams, 365-384
Central Processor Unit (CPU), 3-7
hardware diagram, Am2904, 182-187
HEX-29 (see HEX-29 CPU)
with internal high speed registers, 3, 4
read timing, Super Sixteen, 325
Centralized DMA, 240
Channels, 275-276
Clock, system:
Am2940, 242
Am2942, 251
8-phase, 8
HEX-29, 278, 281, 286
Clock control, Super Sixteen, 322, 324, 374-375
Clock pulse (CP), 262
numbered, 66
COMPARE instruction description, 350
COMPARE LOGICAL BYTE instruction description, 351
COMPARE LOGICAL instruction description, 350
COMPARE LONG instruction description, 353
Computer, stored-program, 3-7
Computer architecture, 2-10
(See also Architecture)
Computer basics, 3-5
Computer control flow diagram, 34
Computer control function flow diagram, 9
Computer Control Unit (CCU), 5-10
architecture, 15-17
using Am2909 and Am29803A, 42-45
using Am2910, 22-38, 43, 48-49
using Am2911 and Am29811A, 39
using Super Sixteen, 322
set-up for high-speed micro-level interrupt handling, 232
timing, 23-33
Computer data path, three register, 97
Computer design, AMD 16-bit (see Super Sixteen)
Condition code input, 202
Condition code multiplexer, 15
Am2904, 132
Condition code output instruction codes, Am2904, 134
Condition code register, HEX-29, 263-264
Condition select multiplexer, 261
Conditional branch flow chart, 339, 340
Conditional branches, 261-262
CONDITIONAL JUMP PIPELINE instruction, 19, 21-22
figure, 20
CONDITIONAL JUMP REGISTER/COUNTER or PIPELINE instruction, 21
figure, 20
Conditional jump speed computations, 24, 25, 28, 29
CONDITIONAL JUMP-TO-SUBROUTINE instruction, 19, 21
figure, 20
CONDITIONAL JUMP VECTOR instruction, 21
figure, 20
Conditional jumping, 13-15
Conditional operation, 4
Conditional push speed computations, 32
CONTINUE instruction, 22
figure, 20
CONTINUE (CONT) statement, 13
Control modes, Am2940, 242-243
Control register:
Am2940, 241
Am2942, 250
Control store, 260
Counter register, microprogram, 15
CP (clock pulse), 66, 262
CPU (see Central Processor Unit)
CRT controller:
AMDASM definition and assembly files, 79-87
complete wiring diagram for, 50, 52-53
design of, 47, 50-61
display formats accommodated on, 78-87
logic diagram, 47, 51
of interface circuit for, 62, 63
microprogram for, 50, 54-56
principle of operation, 47, 50
software emulation, 66-67
of System 29 universal card, 88
timing considerations, 50, 57-61
wiring diagrams, 62, 64-65
Cycle steal method, 240

D bus, 97
D input, 101
Data-Based microinstruction cycle, 35, 36
Data bus cascading, Am2903, 106
Data formats, 191
Data movement, HEX-29 mnemonics, 267
Data movement capabilities, 7
Data multiplexer:
Am2940, 242
Am2942, 251
Data path, 92-187
Super Sixteen, 324, 370-371
three register computer, 97
Data routing, 261-262
Data transfer control, 240
Dead page, 274
DECIMAL ADD instruction description, 353

DECIMAL SUBTRACT instruction description, 353
DECREMENT INDEXES instruction description, 353
Defined register class of instructions, 265
Definition file, AMDASM, 330, 336-337
Delay, time, using Am2914, 220, 224-229
Delay path, longest signal, 23-24
Delays, 6
Depth, memory, 5
Depth-over-width (d/w) ratio of memory, 10
Design, microprogrammed, 12-61
Direct addressing, 7, 265
Direct memory access (DMA), 6, 238-256
centralized, 240
control, HEX-29, 301, 305
distributed, 240
implementation, 240
I/O system, 240
peripheral controller block diagram, 248
repetitive, 240
Direct memory access (DMA) Address Generator, Am2940, 241-249
Direct memory access (DMA) Controller, 240
Disk drive management, System 29 operating system, 361
Distributed DMA, 240
Dividend in divide operations, 151
Division, two's complement, 150-153
microcode, 160-161
Divisor in divide operations, 151
DMA (see Direct memory access)
Double-length normalize command, 146-149
microcode, 162-163
Double words, 32-bit, 191
d/w (depth-over-width) ratio of memory, 10

Effective address, 7

Emulation, software, of CRT controller, 67-77
Enable control, extended, 42
Enable stack signal (FILE ENABLE), 15
Engineering model Super Sixteen, 346-348
EXCHANGE BYTE instruction description, 351
EXCHANGE PROGRAM STATUS instruction description, 351
EXECUTE instruction description, 353
Execution of microinstructions, 13
Executive interrupts, 207
Exponent, signed, 191
Extended enable control, 42

F bus, 97
FETCH, overlapping or pipelining, 14-15
FETCH instruction, 4

FETCH microword, 122-124
Fetch routine, 220, 223, 231
Fields, microinstruction, 13
FILE ENABLE (enable stack signal), 15
Fixed point numbers, 191
Floating point numbers, 191
Flow diagram:
computer control, 34
computer control function, 9
read control, 249
FORMATTING, 13
Fraction, signed, 191
Full adder:
basic, understanding, 93-99
four-bit ripple-carry, 93
truth table, 93
Full adder cells, cascaded, 93
Full stack, 19
Function logic, 262

General purpose (GP) computer, 8
General register class of instructions, 265
Generate, carry, 94
GP (general purpose) computer, 8

HAL (HEX-29 Assembly Language), 264
Half sign flag, 263
HEX-29 Assembly Language (HAL), 264
HEX-29 CPU, 258-315
addressing modes, 264-265
arithmetic operations, 266-267
block diagram, 278, 280, 282-285
carry-in control, 290
condition code control, 290, 298
condition code register, 263-264
DMA control, 301, 305
DMA/refresh control, 275-276
features, 259-260
general specifications, 263-275
instruction matrix, 273-274
instruction set, 265-274
internal CPU registers, 263
interrupt control, 301-304
interrupt structure, 275
macro instructions, 268-272
microcode, 307-315
microprogram control, 281, 287-288
microprogram sequence branch code, 288
microword memory, 288-290
operating system for timesharing (HOST), 275
shift and rotate linkage, 290
system bus, 276-277
system clock, 278, 281, 286
system design goals, 259
system timing, 277-281
HEX-64KBS static memory card, 306

ICU (interrupt control units), 215, 219, 220
Immediate addressing, 7, 265
Immediate instruction flow chart, 337, 339
Indexed addressing, 264, 281
Indirect addressing, 7
INIT microword, 122, 123
Initializing the Am2910, 38
Input, vector, 42-43
Input bus, Am2901B, 298, 300, 301
INPUT BYTE instruction description, 354
Input instruction flow chart, 340, 341
Input/output (see I/O)
INPUT WORD instruction description, 354
INSERT CHARACTER instruction description, 350
Instruction-Based microinstruction cycle, 35
Instruction control speed computations, 27, 30
Instruction-data-based microinstruction cycle, 35, 36
Instruction Decoder, 9
Am2903, 104
Am2940, 242
Am2942, 251
Instruction descriptions, Super Sixteen, 349-355
Instruction Enable pin, 202
Instruction formats, 191
HEX-29, 264
Super Sixteen, 321
Instruction matrix, HEX-29, 273-274
Instruction register, 7
Instruction set, 7
Am29811A, 16
HEX-29, 265-274
Instruction types, 191-197
memory immediate instruction, 197
memory to memory indexed instruction, 196
memory to memory instruction, 194
register immediate instruction, 196
register to indexed memory instruction, 195
register to memory immediate instruction, 195-196
register-to-memory-reference instruction, 193-194
register-to-register (RR) instructions, 191-193
register with short-immediate instruction, 194-195
Instructions, 3
Am2940, 243
Am2942, 251-253
defined register class of, 265
executed sequentially, 198
executing, 4-5
general register class of, 265
Super Sixteen, 319-321
(See also Microinstructions)
Interface circuit for CRT controller, 62, 63
Intermediate slice (IS), 102
Internal high speed registers, CPU with, 3, 4
Interprocessor interrupts, 207
Interrupt, 206-235
Super Sixteen, 324, 376-377
Interrupt acknowledge, 208
Interrupt control, HEX-29, 301-304
Interrupt control units (ICU), 215, 219, 220
Interrupt Controller, 6
Interrupt driven I/O, 239
Interrupt example, microprogram level, 229-235
Interrupt handling:
computer control unit set-up for high-speed micro-level, 232
sequence of events for, 207-208
Interrupt masking, 207-208
Interrupt microprogram, return, 220, 223, 231
Interrupt nesting, 210
Interrupt priority encoder, 210-211
Interrupt recognition, 207
Interrupt registers, HEX-29, 264
Interrupt request:
instruction flow during, 220, 223
multiple, 209
daisy chain acknowledge, 209
single:
daisy chain acknowledge, 208
multiple poll, 208
Interrupt request clearing, 210
Interrupt request handling, multiple, 210
Interrupt request masking, dynamic, 210
Interrupt request prioritization, 210
Interrupt request priority threshold, 211
Interrupt Return instruction, 208
Interrupt sequence timing, 234, 235
Interrupt service routine, 208
Interrupt service routine microprogram, call, 220, 223
Interrupt service routine nesting, 211
Interrupt structure, 208-209
general purpose, 210-211
HEX-29, 275
Interrupts:
machine versus microprogram level, 207
microprogram, 43
priority schemes in, 209-210
types of, 207
Intraprocessor interrupts, 207
Intrasystem interrupts, 207
Invalid access block, 274
Invalid instruction trap, 275
I/O (input/output), 239
devices, 3, 6
DMA, 240
Super Sixteen, 324, 376-377
write timing, 328
IS (intermediate slice), 102

JMP (JUMP) instruction, 13
JUMP, UNCONDITIONAL, 14-15
JUMP and ZERO (JZ) instruction, 19
figure, 20
JUMP (JMP) instruction, 13
JUMP MAP instruction, 19
figure, 20
Jump map speed computations, 26, 29, 30
JUMP operation, 4
JUMP-TO-ONE-OF-TWO-BRANCH-ADDRESSES instruction, 17
JUMP-TO-ONE-OF-TWO-SUBROUTINES instruction, 15, 17
JUMP-TO-SUBROUTINE instruction, 15, 200
Jumping:
conditional, 13-15
microprogram, 13
JZ (JUMP and ZERO) instruction, 19, 20

Last-in first-out (LIFO) stacking arrangement, 199
Latch bypass, 220
Latency times, 6
Least significant slice (LSS), 102
LIFO (last-in first-out) stacking arrangement, 199
LOAD BYTE instruction description, 350
LOAD COUNTER AND CONTINUE instruction, 22
figure, 20
LOAD instruction description, 349
Load operations, Am2904, 132
LOAD PROGRAM STATUS WORD instruction description, 351
Load select control function, 14
LOAD STACK instructions, description of, 352
Logic diagram:
Am2902, 216
Super Sixteen CPU, 365-384
Logic operation with shift speed computations:
Am2901A, 109
Am2903, 111
Am2904, 139, 141
Logic operation speed computations:
Am2901A, 108
Am2903, 110
Am2904, 138, 140
Logic symbol, Am2914, 212
Logical address, 273-274
Logical data, 191

Logical operations, 3
HEX-29 mnemonics, 267
Lookahead adder, four-bit carry, 94-95
LSS (least significant slice), 102

Machine interrupts versus microprogram level interrupts, 207
Machine level instructions, microprogram instructions versus, 9-10
Machine register condition code output, Am2904, 181
Machine status register, Am2904, 131
Machine status register instruction codes, Am2904, 132, 181
Machines, microprogrammed, 260-262
versus non-microprogrammed, 8-9
Macro assembler disable opcode patch, 347
Macro instructions, HEX-29, 268-272
(See also Instructions)
Macro library, 347, 356-357
Magnitude only arithmetic operation with shift down speed computations:
Am2903, 110, 112
Am2904, 140, 142
Mapping PROM, 15, 17
MAR (Memory Address Register), 3-6
Mask bus, 220
MDR (Memory Data Register), 3-6
Memory:
Bulk, 6
depth-over-width ratio of, 10
microprogram, 8, 13
program to write into, 88-89
random access (see Random Access Memory)
read access time of, 5
Memory access:
direct (see Direct memory access)
random (see Random Access Memory)
Memory Address Register (MAR), 3-6
Memory addressing scheme:
with PC in ALU, 193
with PC outside ALU, 197
Memory board, Super Sixteen, 362, 378-381, 384
Memory control, Super Sixteen, 322, 324, 374-375
Memory Data Register (MDR), 3-6
Memory depth, 5
Memory immediate instruction, 197
Memory management, HEX-29, 273-274
Memory management registers, HEX-29, 264
Memory mapped I/O, 239
Memory mapping program address, 273-274
Memory read timing, Super Sixteen, 363
Memory to memory indexed instruction, 196
Memory to memory instruction, 194
Memory width, 5

Memory write timing, Super Sixteen, 364
Microcode:
branch and stack instruction, 200
memory immediate instruction, 197
memory to memory indexed instruction, 196
memory to memory instruction, 194
register immediate instruction, 196
register short-immediate instruction, 195
register to indexed memory instruction, 195
register to memory immediate instruction, 195
register to memory immediate instruction improved, 198
register-to-memory-reference instruction, 194
register-to-register instruction, 193
register-to-register instruction with overlap of execute and PC control, 198
return-from-subroutine instruction, 200
Super Sixteen, 337-345
Microcode translation, Super Sixteen, 345, 346
Microcomputer:
HEX-29 (see HEX-29 CPU)
16-bit (see Super Sixteen)
Microinstruction control, Am2901A, 101
Microinstruction cycle, 35-37
Microinstruction fields, 13
Microinstruction format, Super Sixteen, 330-337
Microinstruction set, Am2914, 212-213
Microinstructions, 13
execution of, 13
(See also Instructions)
Micromachine, 7, 8-10
Microprogram:
Am2904 AMDASM Phase 1 and Phase 2 listing of, 168-179
AMDASM definition and assembly files, 79-87
CRT controller, 50, 54-56
Super Sixteen, 340, 342-345
Microprogram control, 260-261
HEX-29, 281, 287-288
Microprogram control bits, Am2903, 115, 122
Microprogram controller, bipolar, 17
Microprogram counter (mPC), 18
Microprogram counter register, 15
Microprogram execution, timing diagram of, 37
Microprogram instructions, machine level instructions versus, 9-10
Microprogram interrupt, 43
Microprogram jumping, 13
Microprogram level interrupt:
example of, 229-235
versus machine interrupts, 207
Microprogram memory, 8, 13
Microprogram sequencer, 260-261
Microprogram start-up flow chart, 337
Microprogram state machine applications, 33-34
Microprogram structure, Am2904, 144-145
Microprogrammed architecture, 10

Microprogrammed design, 12-61
key features of, 13
Microprogrammed machines, 260-262
Microprogramming, 259
Microprogramming control, subroutining in, 15
Microregister condition code output, Am2904, 181
Microroutines, sample:
Am2903, 122-125
Am2904, 146-159
Microstatus register, Am2904, 131
Microstatus register instruction codes, Am2904, 132, 181
Microword ADD, 123, 124
Microword ADD DIRECT, 123-125
Microword ADD IMMEDIATE, 123, 124
Microword ADD RR1, 123, 125
Microword FETCH, 122-124
Microword INIT, 122, 123
Microword memory, HEX-29, 288-290
Microword register, 14
Mode control, 240
Monitor listing, Super Sixteen, 347, 358-360
Most significant slice (MSS), 102
MOVE LONG instruction description, 353
mPC (microprogram counter), 18
MSS (most significant slice), 102
Multiplexer (MUX):
address, 15
condition code, 15
data, 242, 251
three-input, 99
Multiplication, two's complement, 150
microcode, 160-161
MULTIPLY instruction description, 350
Multiply unsigned instruction, 147, 149
microcode, 160-161
MULTIPLY UNSIGNED instruction description, 350
Multiprocessor, interrupts in, 207
MUX (see Multiplexer)

N output, 102
Negative numbers, 191
Negative single-length number, normalized and unnormalized, 147
Nested subroutine example, 199
Nesting:
of interrupt service routines, 211
of interrupts, 210
Non-polling versus polling systems, 207
Non-restoring binary roots, 153-154
microcode, 162-163
Normalization microroutine, 146-147
Normalized negative single-length number, 147
Normalized positive number, 146
Numerical value of zero, 191

OP CODE (operation code), 7, 9
Operands, 3
for operations, 7
Operating system, System 29, disk drive management, 361
Operation code (OP CODE), 7, 9
Operations, operands for, 7
OR BYTE instruction description, 351
OR instruction description, 349
Output buffers, Am2903, 104
Output bus, Am2901B, 298, 299
OUTPUT BYTE instruction description, 354
Output flags, 99
Output instruction flow chart, 340, 341
OUTPUT WORD instruction description, 354
Overflow, 95
Overflow detect output flag, 99
Overflow detection signal (OVR), 93, 101, 102
Overlap of execute and PC control, 198
Overlapping:
FETCH, 14-15
Super Sixteen, 322
OVR (overflow detection signal), 93, 101, 102

Parallel cascade mode, 214-217
Parallel look-ahead expansion scheme for Am2930, 202
PCU (see Program Control Unit)
Phase 1 and phase 2 periods, 277-278
Physical page zero, 273
Pin functions, 17
Pipeline registers, 14-15, 260-261, 290, 291
Pipelined microprogram bits, Am2904, 144-146
Pipelined operations, Super Sixteen, 329-330
Pipelining:
FETCH, 14-15
Super Sixteen, 322
Polling, 6
versus non-polling systems, 207
POP instruction description, 352
POP operation, 18
Positive numbers, 191
normalized and unnormalized, 146
P/POP instruction description, 352
P/PUSH instruction description, 352
Priority schemes in interrupt, 209-210
Program, 3
to write into character memory, 88-89
Program control, HEX-29 mnemonics, 267
Program Control Unit (PCU), 4-6, 190-204
Am2901A as, 203, 204
counter-type, 4
Super Sixteen, 321, 368-369
Program control unit performance, improving, 197-203

Program steps, 4
Programmed I/O, 239
Programming, 7
Am2914, 212-213
PROM, 8
mapping, 15, 17
Propagate, carry, 94
Propagation delay calculations:
on Am2910 microprogram sequencer, 24-27
for Am2911 and Am29811A design, 24, 28-31
Propagation delays, combinational:
Am2901A, 126
Am2903, 127
PUP (push/pop control), 15
PUSH/CONDITIONAL LOAD COUNTER instruction, 19-21
figure, 20
PUSH instruction description, 352
PUSH operation, 18
Push/pop control (PUP), 15

Q input, 101
Q register, 99
Am2903, 103-104

R input field, 99
Random Access Memory (RAM), 8, 99
address cascading, Am2903, 106
shift network, 99
two-port, 102
write enable (RAM EN), 99
Read access time of memory, 5
Read control flow chart, 249
Register immediate instruction, 196
Register operations, Am2904, 132
Register with short-immediate instruction, 194-195
Register to indexed memory (RX) instructions, 195
addressing, 7
flow chart of, 337, 338
sequence of, Super Sixteen, 328
Register to memory immediate instruction, 195-196
Register-to-memory-reference instruction, 193-194
Register-to-register (RR) instructions, 191-193
addressing, 7
flow chart of, 337, 338
sequence of, Super Sixteen, 328
Registers:
Am2914, 213-214
pipeline, 14-15, 260-261, 290, 291
Working, 3
Relative addressing, 4-5
Reliability, system, 277
Remainder, true value of, 151
REPEAT LOOP, COUNTER ≠ ZERO instruction, 21
figure, 20
REPEAT PIPELINE REGISTER, COUNTER ≠ ZERO, 21
figure, 20
Repetitive DMA, 240
RESET instruction, 18, 19
figure, 20
Restore after interrupt service routine, 208
Return-from-interrupt sequence timing, 234, 235
Return-from-subroutine command, 198
RETURN-FROM-SUBROUTINE instruction, 15, 21
figure, 20
Return-from-subroutine instruction microcode, 200
RETURN instruction description, 352
Return interrupt microprogram, 220, 223, 231
Return register, 198
Ripple carry method usages, 95
Ripple cascade mode, 214-217
Ripple expansion scheme for Am2930, 202
Ripple propagation time, 93
ROTATE LEFT instruction description, 354
ROTATE RIGHT instructions, description of, 354
Rotating structure interrupt scheme, 209-210
RR instructions [see Register-to-register (RR) instructions]
RX instructions [see Register to indexed memory (RX) instructions]

S (sum output), 93
S input field, 99
S/29 (see System/29)
Save status, 207
Schottky speeds, standard device, 107
Am2904, 135
SET, CLR, COMPLEMENT, TEST BIT PSW instruction description, 352
Set-up and hold times:
Am2901A, 126
Am2903, 127
Shift and rotate instruction flow chart, 339, 341
Shift-down operations, 115
SHIFT LEFT instructions, description of, 354
Shift linkage multiplexer instruction codes, Am2904, 133
Shift linkage multiplexers, Am2904, 131-132
Shift network, RAM, 99
SHIFT RIGHT instructions, description of, 353
Shift-up operations, 115
Shifter, 3, 4
Sign bit output flag, 99
Signal delay path, longest, 23-24
Single-length normalize command, 146-148
microcode, 162-163

Slice:
intermediate (IS), 102
least significant (LSS), 102
most significant (MSS), 102
Software, 7
Software emulation of CRT controller, 66-77
SP (stack pointer), 4, 15, 18
SSI/MSI, 135, 143
Stack, full, 19
Stack addressing, 265
Stack and link, 198
Stack pointer (SP), 4, 15
built-in, 18
Standard device Schottky speeds, 107
Am2904, 135
Static structure interrupt scheme, 209
Status bus, 220
Status registers, Am2904, 131
STORE BYTE instruction description, 350
STORE CHARACTER instruction description, 350
STORE instruction description, 349
STORE PROGRAM STATUS WORD instruction description, 351
STORE STACK instructions, description of, 352
Stored-program computer, 3-7
Subroutine example, nested, 199
Subroutine stack architecture, 200
Subroutining, 198-202
in microprogramming control, 15
SUBTRACT instruction description, 349
SUBTRACT WITH CARRY instruction description, 349
Sum output (S), 93
Super Sixteen, 318-384
ALU, 321-323, 366-367
central processing unit architecture, 321-330
logic diagrams, 365-384
central processing unit read timing, 325
clock and memory control, 322, 324, 374-375
computer control unit, 322
data path, 324, 370-371
engineering model, 346-348
instruction descriptions, 349-355
instruction format, 321
instructions, 319-321
interrupt and I/O, 324, 376-377
I/O write timing, 328
macro library, 347, 356-357
memory board, 362, 378-381, 384
memory read timing, 363
memory write timing, 364
microcode, 337-345
microcode translation, 345, 346
microinstruction format, 330-337
microprogram, 340, 342-345
monitor listing, 347, 358-360
pipelined operations, 329-330
RR instruction sequence, 328
RX instruction sequence, 328
S/29 WCS interface, 382-384
system organization, 319
SUPERVISOR CALL instruction description, 352
Switching characteristics:
Am2901A, 125, 126
Am2903, 125, 127
preliminary Am2904, 135
Sync control logic, 232, 233
System reliability, 277
System/29, 10, 345, 347-348
operating system disk drive management, 361
universal card, 88
WCS interface, Super Sixteen, 382-384

TEST END-OF-LOOP instruction, 22
figure, 20
TEST IMMEDIATE instruction description, 350
THREE-WAY BRANCH instruction, 22
figure, 20
Time delay using Am2914, 220, 224-229
Timing:
Am2940, 243-247
bit slice, 106-113
CCU, 22-33
HEX-29 system, 277-281
interrupt sequence, 234, 235
return-from-interrupt sequence, 234, 235
Timing analysis, Am2904, 135
Timing analysis summary, 144
Timing considerations, CRT controller, 50, 57-61
Timing diagram of microprogram execution, 37
Transfer complete circuitry:
Am2940, 242
Am2942, 251
TRANSLATE AND TEST instruction description, 353
TRANSLATE instruction description, 352
Trap, 275
Truth table, full adder, 93
Two-port RAM, 102
Two's complement arithmetic operation with shift down and 16-bit speed computations, Am2903, 112
Two's complement arithmetic operation with shift down speed computations:
Am2901A, 109
Am2904, 139
Two's complement division, 150-153
microcode, 160-161
Two's complement multiplication, 150
microcode, 160-161

Unconditional branch flow chart, 339
UNCONDITIONAL JUMP, 14-15
Unconditional operation, 4
Unnormalized negative single-length number, 147
Unnormalized positive number, 146
Unsigned multiply, 147, 149
microcode, 160-161

Vector bus, 220
Vector generator, 232, 233
Vector input, 42-43

Width, memory, 5
Word count compare mode:
Am2940, 242
Am2942, 251
Word count equals zero mode, Am2942, 251
Word count maintenance, 240
Word count register:
Am2940, 242
Am2942, 251
Word counter:
Am2940, 242
Am2942, 251
Word counter carry out mode:
Am2940, 243
Am2942, 251
Word lengths, 191
Words, 5
16-bit, 191
Working Registers, 3
Write-protect bit (WP), 273

Y output instruction codes, Am2904, 135
Y outputs, 101
three-state, 18

Zero, numerical value of, 191
Zero detect output flag, 99
ZERO instruction, 43



[^0]:    L = LOW, H = HIGH, X = Don't care

[^1]:    *The actual set-up times were not available at the time this was written. See current data sheets for correct timing on these signals.

