# SN74ACT8800 Family 

32-Bit CMOS Processor
Building Blocks

# SN74ACT8800 Family 32-Bit CMOS Processor Building Blocks 

## Data Manual

## IMPORTANT NOTICE

Texas Instruments (TI) reserves the right to make changes to or to discontinue any semiconductor product or service identified in this publication without notice. TI advises its customers to obtain the latest version of the relevant information to verify, before placing orders, that the information being relied upon is current.

TI warrants performance of its semiconductor products to current specifications in accordance with TI's standard warranty. Testing and other quality control techniques are utilized to the extent TI deems necessary to support this warranty. Unless mandated by government requirements, specific testing of all parameters of each device is not necessarily performed.

TI assumes no liability for TI applications assistance, customer product design, software performance, or infringement of patents or services described herein. Nor does TI warrant or represent that any license, either express or implied, is granted under any patent right, copyright, mask work right, or other intellectual property right of TI covering or relating to any combination, machine, or process in which such semiconductor products or services might be or are used.

Copyright © 1988, Texas Instruments Incorporated

First edition: March 1988
First revision: June 1988
Second revision: June 1989

## INTRODUCTION

In this manual, Texas Instruments presents technical information on the TI SN74ACT8800 family of 32-bit processor "building block" circuits. The SN74ACT8800 family is composed of single-chip VLSI processor functions, all of which are designed for high-complexity processing applications.

This manual includes specifications and operational information on the following high-performance advanced-CMOS devices:

- SN74ACT8818 16-bit microsequencer
- SN74ACT8832 32-bit registered ALU
- SN74ACT8836 32 × 32-bit parallel multiplier
- SN74ACT8837 64-bit floating point processor
- SN74ACT8841 Digital crossbar switch
- SN74ACT8847 64-bit floating point/integer processor

These high-speed devices operate at or above 20 MHz while providing the low power consumption of TI's advanced one-micron EPIC™ CMOS technology. The EPIC™ CMOS process combines twin-well structures for increased density with one-micron gate lengths for increased speed.

The SN74ACT8800 Family Data Manual contains design and specification data for all six devices previously listed and includes additional programming and operational information for the '8818, '8832, and '8837/'8847. Two application notes, "Chebyshev Routines for the SN74ACT8847" and "High-speed Vector Math and 3D Graphics Using the SN74ACT8837/8847 Floating Point Unit," are also included.

Introductory sections of the manual include an overview of the '8800 family and a summary of the software tools and design support TI offers for the chip set. The general information section includes an explanation of the function tables, parameter measurement information, and typical characteristics related to the products listed in this volume.

Package dimensions are given in the Mechanical Data section of the book in metric measurement (and parenthetically in inches).

Complete technical data for any Texas Instruments semiconductor product is available from your nearest TI field sales office, local authorized TI distributor, or by calling Texas Instruments at 1-800-232-3200.
- Section 2: SN74ACT8818 16-Bit Microsequencer
- Section 3: SN74ACT8832 32-Bit Registered ALU
- Section 4: SN74ACT8836 32 × 32-Bit Parallel Multiplier
- Section 5: SN74ACT8837 64-Bit Floating Point Processor
- Section 6: SN74ACT8841 Digital Crossbar Switch
- Section 7: SN74ACT8847 64-Bit Floating Point/Integer Processor
- Section 8: Support
- Mechanical Data

Overview

## Introduction

Texas Instruments' SN74ACT8800 family of 32-bit processor building blocks has been developed to allow the easy, custom design of functionally sophisticated, high-performance processor systems. The '8800 family is composed of single-chip VLSI devices, each of which represents an element of a CPU.

Geared for computationally intensive applications, SN74ACT8800 devices include high-performance ALUs, multipliers, microsequencers, and floating point processors.

The '8800 chip set provides the performance, functionality, and flexibility to fill the most demanding processing needs and is structured to reduce system design cost and effort. Most of these high-speed processor functions operate at 20 MHz and above and, at the same time, provide the power savings of TI's advanced 1-µm EPIC™ CMOS technology.

The family's building block approach allows the easy, "pick-and-choose" creation of customized processor systems, while the devices' high level of integration provides cost-effectiveness.

Designed especially for high-complexity processing, the devices in the '8800 family offer a range of functional options. Device features include three-port architecture, double-precision accuracy, optional pipelined operation, and built-in fault tolerance.

Array, digital signal, image, and graphics processing can be optimized with '8800 devices. Other applications are found in supermini and fault-tolerant computers, and I/O and network controllers.

In addition to the high-performance CMOS processor functions featured in this data manual, the family includes several high-speed, low-power bipolar support chips. To reduce power dissipation and ensure reliability, these bipolar devices use TI's proprietary Schottky Transistor Logic (STL) internal circuitry.

At present, TI's '8800 32-bit processor building block family comprises the following functions:

- SN74ACT8818 16-bit microsequencer
- SN74ACT8832 32-bit registered ALU
- SN74ACT8836 32 × 32-bit parallel multiplier
- SN74ACT8837 64-bit floating point processor
- SN74ACT8841 digital crossbar switch
- SN74ACT8847 64-bit floating point and integer processor
- Bipolar support chips
  - SN74AS8838 32-bit barrel shifter
  - SN74AS8839 32-bit shuffle/exchange network
  - SN74AS8840 16 × 4 crossbar switch


## 20 MIPS and Low CMOS Power Consumption

With instruction cycle times of 50 ns or less and the low power consumption of EPIC™ CMOS, the '8800 chip set offers an unrivaled speed/power combination. Unlike traditional microprocessors, which require multiple cycles to perform an operation, the 'ACT8800 processors typically can complete instructions in a single cycle.

The 'ACT8832 registered ALU and 'ACT8818 microsequencer together create a powerful 20-MHz CPU. Because instructions can be performed in a single cycle, the '8832/'8818 combination is capable of executing over 20 million instructions per second (MIPS).
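The relationship between clock rate, cycle time, and instruction rate quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the four-cycle comparison figure is an assumption chosen to contrast with single-cycle execution, not a number from this manual.

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Clock period in nanoseconds for a clock frequency given in MHz."""
    return 1_000.0 / clock_mhz


def mips(cycle_ns: float, cycles_per_instruction: float = 1.0) -> float:
    """Millions of instructions per second for a cycle time and a CPI figure."""
    return 1_000.0 / (cycle_ns * cycles_per_instruction)


if __name__ == "__main__":
    print(cycle_time_ns(20.0))   # 20 MHz clock: 50.0 ns period
    print(mips(50.0))            # one instruction per 50-ns cycle: 20.0 MIPS
    print(mips(50.0, 4.0))       # assumed 4 cycles per instruction: 5.0 MIPS
```

A cycle time shorter than 50 ns pushes the single-cycle figure above 20 MIPS, which is consistent with the "over 20 million" wording.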

For math-intensive applications, the 'ACT8836 fixed-point multiplier/accumulator (MAC), 'ACT8837 64-bit floating point processor, and 'ACT8847 64-bit floating point and integer processor offer unprecedented computational power.

The exceptional performance of the 'ACT8800 family is made possible by TI's EPIC™ CMOS technology. The EPIC™ CMOS process combines twin-well structures for increased density with one-micron gate lengths for increased speed.

## Customized Solution

The '8800 family is designed with a variety of architectural and functional options to provide maximum design flexibility. These device features allow the creation of "customized" solutions with the '8800 chip set.

A building block approach to processing allows designers to match specialized hardware to their specific design needs. The '8818/8832 combination forms the basis of the system, a high-speed CPU. For applications requiring high-speed integer multiplication, the 'ACT8836 can be added. To provide the high precision and large dynamic range of floating point numbers, the 'ACT8837 or 'ACT8847 can be employed.

To ensure speed and flexibility, each component of the '8800 family has three data ports. Each data port accommodates 32 bits of data, plus four parity bits. This architecture eliminates many of the I/O bottlenecks associated with traditional single-I/O microprocessors.

The three-port architecture and functional partitioning of the '8800 chip set open the door to a variety of parallel processing applications. Placing the math and shifting functions in parallel with the ALU permits concurrent processing of data. Additional processors can be added when performance needs dictate.

The 'ACT8800 building block processors are microprogrammable, so that their instruction sets can be tailored to a specific application. This high degree of programmability offers greater speed and flexibility than a typical microprocessor and ensures the most efficient use of hardware.

A separate control bus eliminates the need for multiplexing instructions and data, further reducing processing bottlenecks. The microcode bus width is determined by the designer and the application.

Another source of design flexibility is provided by the pipelined/flowthrough operation option. Pipelining can dramatically reduce the time required to perform iterative, or sequential, calculations. On the other hand, random or nonsequential algorithms require fast flowthrough operations. The '8800 chip set allows the designer to select the mode (fully pipelined, partially pipelined, or nonpipelined) most suited to each design.

## Scientific Accuracy

The '8800 family is designed to support applications which require double-precision accuracy. Many scientific applications, such as those in the areas of high-end graphics, digital signal processing, and array processing, require such accuracy to maintain data integrity. In general-purpose computing applications, floating point processors must often support double-precision data formats to maintain compatibility with existing software.

To ensure data integrity, '8800 devices (excluding the barrel shifter and microsequencer) support parity checking and generation, as well as master/slave error detection. Byte parity checking is performed on the input ports, and a parity generator and a master/slave comparator are provided at the output. Fault tolerance is built into the processors, ensuring correct device operation without extra logic or costly software.
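As an illustration of byte parity on a 32-bit port with four parity bits, the following sketch generates one parity bit per byte and checks a received word against them. It is a software model only. The parity polarity (even by default here), the bit ordering, and the function names are assumptions for illustration; the device data sheets define the actual behavior.

```python
def byte_parity_bits(word: int, even: bool = True) -> int:
    """Return 4 parity bits for a 32-bit word; bit i covers byte i (bits 8i+7..8i)."""
    word &= 0xFFFFFFFF
    parity = 0
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        bit = bin(byte).count("1") & 1   # 1 when the byte holds an odd number of ones
        if not even:
            bit ^= 1                     # odd parity inverts the generated bit
        parity |= bit << i
    return parity


def parity_ok(word: int, parity: int, even: bool = True) -> bool:
    """True when the supplied parity bits match the data word."""
    return byte_parity_bits(word, even) == (parity & 0xF)


if __name__ == "__main__":
    data = 0x01000003
    p = byte_parity_bits(data)
    print(f"{p:04b}")                     # 1000: only the top byte has odd weight
    print(parity_ok(data, p))             # True
    print(parity_ok(data ^ 0x10, p))      # False: a single flipped bit is detected
```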

# The SN74ACT8800 Building Block Processor System 

Some of the high-performance '8800 devices are described in the following paragraphs.

## SN74ACT8818 16-Bit Microsequencer

In a high-performance microcoded system, a fast microcode controller is required to control the flow of instructions. The SN74ACT8818 is a high-speed, versatile 16-bit microsequencer capable of addressing 64 K words of microcode memory. The 'ACT8818 can address the next instruction fast enough to support a 50-ns system cycle time.

The 'ACT8818's 65-word-deep by 16-bit-wide stack is useful for storing subroutine return addresses, top-of-loop addresses, and loop counts. Addresses can be selected from eight different sources: the three I/O ports, the two register/counters, the microprogram counter, the stack, and the 16-way branch input.

## SN74ACT8832 Registered ALU

The SN74ACT8832 is a 32-bit registered ALU that operates at approximately 20 MHz. Because instructions can be performed in a single cycle, the 'ACT8832 is capable of executing 20 million microinstructions per second. An on-board 64-word register file is 36 bits wide to permit the storage of parity bits. The 3-operand register file increases performance by enabling the creation of an instruction and the storage of the previous result in a single cycle. To facilitate data transfer, operands stored in the register file can be accessed externally while the ALU is executing. To support the parallel processing of data, the 'ACT8832 can be configured to operate as four 8-bit ALUs, two 16-bit ALUs, or a single 32-bit ALU. The 'ACT8832 incorporates 32-bit shifters for double-precision shift operations.
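The effect of configuring one 32-bit datapath as four 8-bit, two 16-bit, or one 32-bit ALU can be sketched in software as a partitioned add. This is a conceptual model, not the 'ACT8832 instruction set: the function name is hypothetical, and the sketch assumes each slice discards its own carry-out, which may differ from how the device reports or chains carries.

```python
def partitioned_add(a: int, b: int, slice_bits: int = 32) -> int:
    """Add two 32-bit words as independent lanes of slice_bits bits each."""
    if slice_bits not in (8, 16, 32):
        raise ValueError("slice_bits must be 8, 16, or 32")
    mask = (1 << slice_bits) - 1
    result = 0
    for shift in range(0, 32, slice_bits):
        lane = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane & mask) << shift   # carry out of each lane is dropped
    return result


if __name__ == "__main__":
    a, b = 0x0000FFFF, 0x00000001
    print(f"{partitioned_add(a, b, 32):08X}")  # 00010000: carry ripples across all 32 bits
    print(f"{partitioned_add(a, b, 16):08X}")  # 00000000: carry stops at the 16-bit boundary
    print(f"{partitioned_add(a, b, 8):08X}")   # 0000FF00: carry stops at the 8-bit boundary
```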

## SN74ACT8836 32 × 32-Bit Integer MAC

The SN74ACT8836 is a 32-bit integer multiplier/accumulator (MAC) that accepts two 32-bit inputs and computes a 64-bit product. The device can also operate as a 64-bit by 64-bit multiplier. An onboard adder is provided to add or subtract the product or the complement of the product from the accumulator.

When pipelined internally, the 1-µm CMOS parallel MAC performs a full 32 × 32-bit multiply/accumulate in a single 36-ns clock cycle. In flowthrough mode (without any pipelining), the 'ACT8836 takes 60 ns to multiply two 32-bit numbers. The 'ACT8836 performs a 64 × 64-bit multiply/accumulate, outputting a 64-bit result, in 225 ns.

The 'ACT8836 can handle a wide variety of data types, including two's complement, signed, and mixed. Division is supported via the Newton-Raphson algorithm.
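Newton-Raphson division replaces a divide with repeated multiplies and subtracts, which is why it suits a multiplier/accumulator. The sketch below shows the standard reciprocal iteration x ← x(2 − d·x) in floating point. It is a generic illustration, not the 'ACT8836 microcode: the initial estimate, the iteration count, the use of Python floats rather than fixed-point integers, and the function names are all assumptions.

```python
import math


def nr_reciprocal(d: float, iterations: int = 5) -> float:
    """Approximate 1/d for d > 0 using only multiplies and subtracts per iteration."""
    if d <= 0.0:
        raise ValueError("this sketch handles positive divisors only")
    m, e = math.frexp(d)                 # d = m * 2**e with 0.5 <= m < 1
    x = 48.0 / 17.0 - (32.0 / 17.0) * m  # linear first guess for 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)            # error is roughly squared on every pass
    return math.ldexp(x, -e)             # undo the scaling: 1/d = (1/m) * 2**-e


def nr_divide(n: float, d: float) -> float:
    """Quotient n/d formed as n times the Newton-Raphson reciprocal of d."""
    return n * nr_reciprocal(d)


if __name__ == "__main__":
    print(nr_divide(10.0, 4.0))
    print(nr_reciprocal(3.0))
```

Because the error shrinks quadratically, a handful of multiply steps is enough to reach the working precision, so the cost of a divide is a small multiple of the multiplier's cycle time.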

## SN74ACT8837 64-Bit Floating Point Unit

The SN74ACT8837 is a high-speed floating point processor. This single-chip device performs 32- or 64-bit floating point operations.

More than just a coprocessor, the 'ACT8837 integrates on one chip a double-precision floating point ALU and multiplier. Integrating these functions on a single chip reduces data routing problems and processing overhead. In addition, three data ports and a 64-bit internal bus architecture allow for single-cycle operations.

The 'ACT8837 can be pipelined for iterative calculations or can operate with input registers disabled for low latency.

## SN74ACT8841 Digital Crossbar Switch

The SN74ACT8841 is a single-chip digital crossbar switch. This high-performance device cost-effectively eliminates bottlenecks to speed data through complex bus architectures.

The 'ACT8841 is ideal for multiprocessor applications, where memory bottlenecks tend to occur. The device has 64 bidirectional I/O ports that can be configured as sixteen 4-bit ports, eight 8-bit ports, or four 16-bit ports. Each bidirectional port can be connected in any conceivable combination. Any single input port can be broadcast to any combination of output ports. The total time for data transfer is 20 ns.
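The port configurations and broadcast capability described above can be modeled as a routing table that names, for each output port, the input port it listens to. The sketch below is a behavioral illustration only; the `route` function, its arguments, and the dictionary-based control format are hypothetical and do not reflect the 'ACT8841 control-register layout.

```python
from typing import Dict, List

TOTAL_IO_BITS = 64  # total I/O width stated in the text


def route(inputs: List[int], select: Dict[int, int], port_bits: int = 4) -> Dict[int, int]:
    """Drive each output port in `select` from the input port it names."""
    if port_bits not in (4, 8, 16):
        raise ValueError("port_bits must be 4, 8, or 16")
    n_ports = TOTAL_IO_BITS // port_bits       # 16, 8, or 4 ports
    if len(inputs) != n_ports:
        raise ValueError(f"expected {n_ports} input values")
    mask = (1 << port_bits) - 1
    outputs = {}
    for out_port, in_port in select.items():
        if not (0 <= out_port < n_ports and 0 <= in_port < n_ports):
            raise ValueError("port index out of range")
        outputs[out_port] = inputs[in_port] & mask
    return outputs


if __name__ == "__main__":
    nibbles = list(range(16))                  # input port i carries the value i
    # Broadcast input port 5 to outputs 0-2 and send input port 0 to output 15.
    print(route(nibbles, {0: 5, 1: 5, 2: 5, 15: 0}))
```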

The control sources for ten separate switching configurations are on-chip, including eight banks of programmable control flip-flops and two hard-wired control circuits.

The EPIC™ CMOS SN74ACT8841 and its predecessor, SN74AS8840, are based on the same architecture, differing in power consumption, number of control registers, and pin-out. Microcode written for the 'AS8840 can be run on the 'ACT8841.

## SN74ACT8847 64-Bit Floating Point Unit

The SN74ACT8847 is a high-speed 64-bit floating point processor. The device is fully compatible with IEEE standard 754-1985 for addition, subtraction, multiplication, division, square root, and comparison. Division and square root operations are implemented via hardwired control.

The SN74ACT8847 FPU also performs integer arithmetic, logical operations, and logical shifts. Registers are provided at the inputs, outputs, and inside the ALU and multiplier to support multilevel pipelining. These registers can be bypassed for nonpipelined operations.

When fully pipelined, the 'ACT8847 can perform a double-precision floating point or 32-bit integer operation in under 40 ns. When in flowthrough mode, the 'ACT8847 takes less than 100 ns to perform an operation.
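The trade-off between pipelined and flowthrough operation can be made concrete with a simple timing model: a pipeline needs a few cycles to fill but then delivers one result per cycle, while flowthrough delivers each result after a single, longer delay. The sketch below uses the 40-ns and 100-ns figures above as upper bounds. The three-stage pipeline depth and the function name are assumptions for illustration, and the model assumes the operations are independent of one another.

```python
def total_time_ns(n_ops: int, cycle_ns: float, pipeline_stages: int = 1) -> float:
    """Time to complete n independent operations: fill the pipeline, then one result per cycle."""
    if n_ops <= 0:
        return 0.0
    return (pipeline_stages + n_ops - 1) * cycle_ns


if __name__ == "__main__":
    for n in (1, 1000):
        piped = total_time_ns(n, 40.0, pipeline_stages=3)   # assumed 3 register levels
        flow = total_time_ns(n, 100.0, pipeline_stages=1)   # flowthrough: no pipeline fill
        print(n, piped, flow)
    # 1 operation:     120.0 ns pipelined vs 100.0 ns flowthrough
    # 1000 operations: 40080.0 ns pipelined vs 100000.0 ns flowthrough
```

Under these assumptions a single isolated operation finishes sooner in flowthrough mode, while a long stream of operations finishes sooner when pipelined.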

## Bipolar Support Chips

The SN74AS8838 high-speed, 32-bit barrel shifter can shift up to 32 bits in a single instruction cycle of under 25 ns. Five basic shifts can be programmed: circular left, circular right, logical left, logical right, and arithmetic right. The 'AS8838 offloads the responsibility for shifting operations from the ALU, which increases shifter functionality and system throughput.
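The five basic shifts named above have standard definitions on a 32-bit word, sketched here in software. This models the arithmetic only; the function names are hypothetical and nothing here reflects the 'AS8838 control encoding. Rotate counts are taken modulo 32, and the other shifts accept counts from 0 to 32.

```python
MASK32 = 0xFFFFFFFF


def circular_left(x: int, n: int) -> int:
    n %= 32
    x &= MASK32
    return ((x << n) | (x >> (32 - n))) & MASK32


def circular_right(x: int, n: int) -> int:
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def logical_left(x: int, n: int) -> int:
    return ((x & MASK32) << n) & MASK32          # zeros enter from the right


def logical_right(x: int, n: int) -> int:
    return (x & MASK32) >> n                     # zeros enter from the left


def arithmetic_right(x: int, n: int) -> int:
    x &= MASK32
    shifted = x >> n
    if x & 0x80000000:                           # negative: replicate the sign bit
        shifted |= (MASK32 << (32 - n)) & MASK32
    return shifted


if __name__ == "__main__":
    v = 0x80000001
    print(f"{circular_left(v, 1):08X}")     # 00000003
    print(f"{circular_right(v, 1):08X}")    # C0000000
    print(f"{logical_left(v, 4):08X}")      # 00000010
    print(f"{logical_right(v, 4):08X}")     # 08000000
    print(f"{arithmetic_right(v, 4):08X}")  # F8000000
```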

The SN74AS8839 is a 32-bit shuffle/exchange network. The high-speed device can perform data permutations on one 32-bit, two 16-bit, four 8-bit, or eight 4-bit data words in a single instruction cycle of under 25 ns. The shuffle/exchange network is designed primarily for use in digital signal processing applications.

## SN74ACT8818 16-Bit Microsequencer

- Addresses Up to 64K Locations of Microprogram Memory
- CLK-to-Y = 30 ns ($\mathrm{t}_{\mathrm{pd}}$)
- Low-Power EPIC™ CMOS
- Addresses Selected from Eight Different Sources
- Performs Multiway Branching, Conditional Subroutine Calls, and Nested Loops
- Large 65-Word by 16-bit Stack
- Cascadable

Because they are microprogrammable, the 'ACT8800 building block processors provide greater speed and flexibility than a typical microprocessor. In such a high-performance microcoded system, a fast microsequencer is required to control the flow of microinstructions.

The SN74ACT8818 is a high-speed, versatile 16-bit microsequencer capable of addressing 64K words of microcode memory. The 'ACT8818 can address the next instruction fast enough to support a 50-ns system cycle time.

The 'ACT8818's 65-word-deep by 16-bit-wide stack is useful for storing subroutine return addresses, top-of-loop addresses, and loop counts. For added flexibility, addresses can be selected from eight different sources: the three I/O ports, the two register/counters, the microprogram counter, the stack, and the 16-way branch input.

EPIC is a trademark of Texas Instruments Incorporated.

## Contents

- Introduction
- Understanding the 'ACT8818 Microsequencer
- Microprogramming the 'ACT8818
- Design Support
- Systems Expertise
- 'ACT8818 Pin Grid Allocation
- 'ACT8818 Specification Tables
- Architecture
- Y Output Multiplexer
- Microprogram Counter
- Register/Counters
- Stack
- Stack Pointer
- Read Pointer
- Stack Warning/Read Error Pin
- Interrupt Return Register
- Microprogramming the 'ACT8818
- Address Selection
- Stack Controls
- Register Controls
- Continue/Repeat Instructions
- Branch Instructions
- Conditional Branch Instructions
- Loop Instructions
- Subroutine Calls
- Subroutine Returns
- Reset
- Clear Pointers
- Read Stack
- Interrupts
- Sample Microinstructions for the 'ACT8818
- Continue
- Continue and Pop
- Continue and Push
- Branch (Example 1)
- Branch (Example 2)
- Sixteen-Way Branch
- Conditional Branch
- Three-Way Branch
- Thirty-Two-Way Branch
- Repeat
- Repeat on Stack
- Repeat Until $\overline{\mathrm{CC}}$ = H
- Loop Until Zero
- Conditional Loop Until Zero
- Jump to Subroutine
- Conditional Jump to Subroutine
- Two-Way Jump to Subroutine
- Return from Subroutine
- Conditional Return from Subroutine
- Clear Pointers
- Reset

## List of Illustrations

- Figure 1. 'ACT8818 GC Package
- Figure 2. 'ACT8818 FN Package
- Figure 3. 'ACT8818 Logic Symbol
- Figure 4. 'ACT8818 Functional Block Diagram
- Figure 5. Continue
- Figure 6. Continue and Pop
- Figure 7. Continue and Push
- Figure 8. Branch Example 1
- Figure 9. Branch Example 2
- Figure 10. Sixteen-Way Branch
- Figure 11. Conditional Branch
- Figure 12. Three-Way Branch
- Figure 13. Thirty-Two-Way Branch
- Figure 14. Repeat
- Figure 15. Repeat on Stack
- Figure 16. Repeat Until $\overline{\mathrm{CC}}$ = H
- Figure 17. Loop Until Zero
- Figure 18. Conditional Loop Until Zero (Example 2)
- Figure 19. Jump to Subroutine
- Figure 20. Conditional Jump to Subroutine
- Figure 21. Two-Way Jump to Subroutine
- Figure 22. Return from Subroutine
- Figure 23. Conditional Return from Subroutine
- Figure 24. Clear Pointers

## List of Tables

- Table 1. 'ACT8818 Pin Grid Allocation
- Table 2. 'ACT8818 Pin Functional Description
- Table 3. Response to Control Inputs
- Table 4. Y Output Controls (MUX2-MUX0)
- Table 5. Stack Controls (S2-S0)
- Table 6. Register Controls (RC2-RC0)
- Table 7. Decrement and Branch on Nonzero Encodings
- Table 8. Call Encodings without Register Decrements
- Table 9. Call Encodings with Register Decrements
- Table 10. Return Encodings without Register Decrements
- Table 11. Return Encodings with Register Decrements

## Introduction

The SN74ACT8818 is a low-power, high-performance microsequencer implemented in TI's EPIC™ Advanced CMOS technology. The 16-bit device addresses up to 64K locations of microprogram memory and is compatible with the SN74AS890 microsequencer.

The 'ACT8818 performs a range of sequencing operations in support of TI's family of building block devices and special-purpose processors such as the SN74ACT8847 Floating Point Unit (FPU).

## Understanding the 'ACT8818 Microsequencer

The 'ACT8818 microsequencer is designed to control execution of microcode in a microprogrammed system. Basic architecture of such a system usually incorporates at least the microsequencer, one or more processing elements such as the 'ACT8847 FPU or the SN74ACT8832 Registered ALU, microprogram memory, microinstruction register, and status logic to monitor system states and provide status inputs to the microsequencer.

The 'ACT8818 combines flexibility and high speed in a microsequencer that performs multiway branching, conditional subroutine calls, nested loops, and a variety of other microprogrammable operations. The 'ACT8818 can also be cascaded for providing additional register/counters or addressing capability for more complex microcoded control functions.

In this microsequencer, several sources are available for microprogram address selection. The primary source is the 16-bit microprogram counter (MPC), although branch addresses may be input on the two 16-bit address buses, DRA and DRB. An address input on the DRA bus can be pushed on the stack for later selection. Register/counters RCA and RCB can store either branch addresses or loop counts as needed, either for branch operations or for looping on the stack.

The selection of address source can be based on external status from the device being controlled, so that three-way or multiway branching is supported. Once selected, the address which is output on the Y bus passes to the microprogram memory, and the microinstruction from the selected location is clocked into the pipeline register at the beginning of the next cycle.

It is also possible to interrupt the 'ACT8818 by placing the Y output bus in a high-impedance state and forcing an interrupt vector on the Y bus. External logic is required to place the bus in high impedance and load the interrupt vector. The first microinstruction of the interrupt handler subroutine can push the address from the Interrupt Return register on the stack so that proper linkage is preserved for the return from subroutine.

## Microprogramming the 'ACT8818

Microinstructions for the 'ACT8818 select the specific operations performed by the Y output multiplexer, the register/counters RCA and RCB, the stack, and the bidirectional DRA and DRB buses. Each set of inputs is represented as a separate field in the microinstructions, which control not only the microsequencer but also the ALU or other devices in the system.

The 3-port architecture of the 'ACT8818 facilitates both branch addressing and register/counter operations. Both register/counters can be used to hold either loop counts or branch addresses loaded from the DRA and DRB buses. Register/counter operations are selected by control inputs RC2-RC0.

Similarly, the 65-word by 16-bit stack can save addresses from the DRA bus, the microprogram counter (MPC), or the Interrupt Return register, depending on the settings of stack controls S2-S0 and related control inputs. Flexible instructions such as Branch DRA else Branch to Stack else Continue can be coded to take advantage of the conditional branching capability of the 'ACT8818.

Multiway branching (16- or 32-way) uses the B3-B0 inputs to set up a 16-way branch address on DRA or DRB by concatenating B3-B0 with the upper 12 bits of the DRA or DRB bus. The resulting branch addresses DRA' (DRA15-DRA4::B3-B0) and DRB' (DRB15-DRB4::B3-B0) are selected by the Y output multiplexer controls MUX2-MUX0. A Branch DRB' else Branch DRA' instruction can select up to 32 branch addresses, as determined by the settings of B3-B0.
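The address concatenation described above can be written out as bit operations. The sketch below forms DRA' or DRB' from the upper 12 bits of a bus value and the four B inputs, then shows how a two-target instruction reaches 32 addresses. The function names are hypothetical, and the `take_drb` flag stands in for whatever condition test the microinstruction applies; the polarity of that test is an assumption, not taken from the control tables.

```python
def sixteen_way_address(bus: int, b: int) -> int:
    """Form DRA' or DRB': bus bits 15-4 concatenated with B3-B0."""
    return (bus & 0xFFF0) | (b & 0xF)


def thirty_two_way_address(dra: int, drb: int, b: int, take_drb: bool) -> int:
    """Branch DRB' else Branch DRA': two bases times sixteen B values = 32 targets."""
    return sixteen_way_address(drb if take_drb else dra, b)


if __name__ == "__main__":
    print(f"{sixteen_way_address(0x1230, 0xA):04X}")                   # 123A
    print(f"{thirty_two_way_address(0x1230, 0x4560, 0x3, True):04X}")  # 4563
    print(f"{thirty_two_way_address(0x1230, 0x4560, 0x3, False):04X}") # 1233
    # The low four bits of the bus value are ignored in favor of B3-B0.
    print(f"{sixteen_way_address(0x123F, 0x0):04X}")                   # 1230
```

Because only the low four bits vary, the sixteen targets of each base address sit in consecutive microprogram locations.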

## Design Support

TI's '8818 16-bit microsequencer is supported by a variety of tools developed to aid in design evaluation and verification. These tools will streamline all stages of the design process, from assessing the operation and performance of the '8818 to evaluating a total system application. The tools include a functional model, behavioral model, and microcode development software and hardware. Section 8 of this manual provides specific information on the design tools supporting TI's SN74ACT8800 Family.

## Systems Expertise

Texas Instruments' VLSI Logic applications group is available to help designers analyze TI's high-performance VLSI products, such as the '8818 16-bit microsequencer. The group works directly with designers to provide ready answers to device-related questions and also prepares a variety of applications documentation.

The group may be reached in Dallas, at (214) 997-3970.

## 'ACT8818 Pin Grid Allocation



Figure 1. 'ACT8818 GC Package

Table 1. 'ACT8818 Pin Grid Allocation

| PIN NO. | PIN NAME | PIN NO. | PIN NAME | PIN NO. | PIN NAME | PIN NO. | PIN NAME |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| A2 | RC2 | C2 | RC0 | F3 | $\overline{\mathrm{RBOE}}$ | J10 | S1 |
| A3 | Y1 | C3 | GND | F9 | B0 | J11 | STKWRN/RER |
| A4 | Y3 | C5 | GND | F10 | B1 | K1 | DRB0 |
| A5 | Y5 | C6 | Y7 | F11 | MUX2 | K2 | SELDR |
| A6 | Y6 | C7 | Y10 | G1 | DRB6 | K3 | DRA14 |
| A7 | Y8 | C9 | GND | G2 | DRB5 | K4 | DRA12 |
| A8 | Y11 | C10 | $\mathrm{V}_{\mathrm{CC}}$ | G3 | GND | K5 | DRA10 |
| A9 | Y13 | C11 | $\overline{\mathrm{RE}}$ | G9 | CLK | K6 | DRA7 |
| A10 | NC | D1 | DRB12 | G10 | MUX0 | K7 | DRA5 |
| B1 | DRB15 | D2 | DRB13 | G11 | MUX1 | K8 | DRA3 |
| B2 | RC1 | D9 | GND | H1 | DRB4 | K9 | DRA0 |
| B3 | Y0 | D10 | COUT | H2 | DRB3 | K10 | S0 |
| B4 | Y2 | D11 | INC | H10 | $\overline{\mathrm{CC}}$ | K11 | S2 |
| B5 | Y4 | E1 | DRB9 | H11 | ZEROUT | L2 | DRA15 |
| B6 | $\overline{\mathrm{YOE}}$ | E2 | DRB10 | J1 | DRB2 | L3 | DRA13 |
| B7 | Y9 | E3 | DRB11 | J2 | DRB1 | L4 | DRA11 |
| B8 | Y12 | E9 | $\overline{\mathrm{INT}}$ | J3 | $\mathrm{V}_{\mathrm{CC}}$ | L5 | DRA9 |
| B9 | Y14 | E10 | B3 | J5 | GND | L6 | DRA8 |
| B10 | Y15 | E11 | B2 | J6 | $\overline{\mathrm{RAOE}}$ | L7 | DRA6 |
| B11 | ZEROIN | F1 | DRB7 | J8 | DRA1 | L8 | DRA4 |
| C1 | DRB14 | F2 | DRB8 | J9 | GND | L9 | DRA2 |
|  |  |  |  |  |  | L10 | OSEL |

Figure 2. 'ACT8818 FN Package (top view)

Figure 3. 'ACT8818 Logic Symbol

Table 2. 'ACT8818 Pin Functional Description

| PIN NAME | GC NO. | FN NO. | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: | :--- |
| B0 | F9 | 22 | I | Input bits for branch addressing (see Table 3) |
| B1 | F10 | 23 | I |  |
| B2 | E11 | 24 | I |  |
| B3 | E10 | 25 | I |  |
| CLK | G9 | 18 | I | System clock |
| COUT | D10 | 28 | O | Incrementer carry-out. Goes high when an attempt is made to increment microprogram counter beyond addressable micromemory. |
| $\overline{\mathrm{CC}}$ | H10 | 15 | I | Condition code |
| DRA0 | K9 | 9 | I/O | Bidirectional DRA data port. Outputs data from stack or register/counter A ($\overline{\mathrm{RAOE}} = 0$) or inputs external data ($\overline{\mathrm{RAOE}} = 1$). |
| DRA1 | J8 | 8 | I/O |  |
| DRA2 | L9 | 7 | I/O |  |
| DRA3 | K8 | 6 | I/O |  |
| DRA4 | L8 | 5 | I/O |  |
| DRA5 | K7 | 4 | I/O |  |
| DRA6 | L7 | 3 | I/O |  |
| DRA7 | K6 | 2 | I/O |  |
| DRA8 | L6 | 84 | I/O |  |
| DRA9 | L5 | 83 | I/O |  |
| DRA10 | K5 | 82 | I/O |  |
| DRA11 | L4 | 80 | I/O |  |
| DRA12 | K4 | 79 | I/O |  |
| DRA13 | L3 | 78 | I/O |  |
| DRA14 | K3 | 77 | I/O |  |
| DRA15 | L2 | 76 | I/O |  |
| DRB0 | K1 | 73 | I/O | Bidirectional DRB data port. Outputs data from register/counter B ($\overline{\mathrm{RBOE}} = 0$) or inputs external data ($\overline{\mathrm{RBOE}} = 1$). |
| DRB1 | J2 | 72 | I/O |  |
| DRB2 | J1 | 71 | I/O |  |
| DRB3 | H2 | 70 | I/O |  |
| DRB4 | H1 | 69 | I/O |  |
| DRB5 | G2 | 67 | I/O |  |
| DRB6 | G1 | 66 | I/O |  |
| DRB7 | F1 | 65 | I/O |  |
| DRB8 | F2 | 63 | I/O |  |
| DRB9 | E1 | 62 | I/O |  |
| DRB10 | E2 | 61 | I/O |  |
| DRB11 | E3 | 60 | I/O |  |
| DRB12 | D1 | 59 | I/O |  |
| DRB13 | D2 | 58 | I/O |  |
| DRB14 | C1 | 57 | I/O |  |
| DRB15 | B1 | 56 | I/O |  |
| GND | C3 | 10 |  | Ground pins. All pins must be used. |
| GND | C5 | 30 |  |  |
| GND | C9 | 33 |  |  |
| GND | D9 | 46 |  |  |
| GND | G3 | 52 |  |  |
| GND | J5 | 68 |  |  |
| GND | J9 | 81 |  |  |
| INC | D11 | 27 | I | Incrementer control pin |
| $\overline{\mathrm{INT}}$ | E9 | 26 | I | Selects INT RT register to stack, active low (see Table 3) |
| MUX0 | G10 | 19 | I | MUX control for Y output bus (see Table 4) |
| MUX1 | G11 | 20 | I |  |
| MUX2 | F11 | 21 | I |  |
| OSEL | L10 | 11 | I | DRA output MUX select. Low selects RCA, high selects stack. |
| $\overline{\mathrm{RAOE}}$ | J6 | 1 | I | DRA output enable, active low |
| $\overline{\mathrm{RBOE}}$ | F3 | 64 | I | DRB output enable, active low |
| RC0 | C2 | 55 | I | Controls for register/counters A and B |
| RC1 | B2 | 54 | I |  |
| RC2 | A2 | 53 | I |  |
| $\overline{\mathrm{RE}}$ | C11 | 29 | I | INT RT register enable, active low. A high input holds INT RT register while a low input passes Y to INT RT register (see Table 3). |
| S0 | K10 | 12 | I | Stack controls |
| S1 | J10 | 13 | I |  |
| S2 | K11 | 14 | I |  |
| SELDR | K2 | 75 | I | Selects data source to DRA bus and DRB bus (see Table 3) |
| STKWRN/RER | J11 | 16 | O | Stack warning signal flag |
| $\mathrm{V}_{\mathrm{CC}}$ | C10 | 31 |  | Supply voltage (5 V) |
| $\mathrm{V}_{\mathrm{CC}}$ | J3 | 74 |  | Supply voltage (5 V) |
| Y0 | B3 | 51 | I/O | Bidirectional Y data port |
| Y1 | A3 | 50 | I/O |  |
| Y2 | B4 | 49 | I/O |  |
| Y3 | A4 | 48 | I/O |  |
| Y4 | B5 | 47 | I/O |  |
| Y5 | A5 | 45 | I/O |  |
| Y6 | A6 | 44 | I/O |  |
| Y7 | C6 | 43 | I/O |  |
| Y8 | A7 | 41 | I/O |  |
| Y9 | B7 | 40 | I/O |  |
| Y10 | C7 | 39 | I/O |  |
| Y11 | A8 | 38 | I/O |  |
| Y12 | B8 | 37 | I/O |  |
| Y13 | A9 | 36 | I/O |  |
| Y14 | B9 | 35 | I/O |  |
| Y15 | B10 | 34 | I/O |  |
| $\overline{\mathrm{YOE}}$ | B6 | 42 | I | Y output enable, active low |
| ZEROIN | B11 | 32 | I | Forces internal zero detect high |
| ZEROUT | H11 | 17 | O | Outputs register/counter zero detect signal |

## 'ACT8818 Specification Tables

## absolute maximum ratings over operating free air temperature range (unless otherwise noted) ${ }^{\dagger}$

| RATING | VALUE |
| :--- | :--- |
| Supply voltage, $\mathrm{V}_{\mathrm{CC}}$ | $-0.5$ V to 6 V |
| Input clamp current, $\mathrm{I}_{\mathrm{IK}}$ ($\mathrm{V}_{\mathrm{I}}<0$ or $\mathrm{V}_{\mathrm{I}}>\mathrm{V}_{\mathrm{CC}}$) | $\pm 20$ mA |
| Output clamp current, $\mathrm{I}_{\mathrm{OK}}$ ($\mathrm{V}_{\mathrm{O}}<0$ or $\mathrm{V}_{\mathrm{O}}>\mathrm{V}_{\mathrm{CC}}$) | $\pm 50$ mA |
| Continuous output current, $\mathrm{I}_{\mathrm{O}}$ ($\mathrm{V}_{\mathrm{O}}=0$ to $\mathrm{V}_{\mathrm{CC}}$) | $\pm 50$ mA |
| Continuous current through $\mathrm{V}_{\mathrm{CC}}$ or GND pins | $\pm 100$ mA |
| Operating free-air temperature range | $0^{\circ}\mathrm{C}$ to $70^{\circ}\mathrm{C}$ |
| Storage temperature range | $-65^{\circ}\mathrm{C}$ to $150^{\circ}\mathrm{C}$ |

${ }^{\dagger}$ Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute maximum rated conditions for extended periods may affect device reliability.

recommended operating conditions

|  | PARAMETER | MIN | NOM | MAX | UNIT |
| :--- | :--- | :---: | :---: | :---: | :---: |
| $\mathrm{V}_{\mathrm{CC}}$ | Supply voltage | 4.5 | 5 | 5.5 | V |
| $\mathrm{V}_{\mathrm{IH}}$ | High-level input voltage | 2 |  | $\mathrm{V}_{\mathrm{CC}}$ | V |
| $\mathrm{V}_{\mathrm{IL}}$ | Low-level input voltage | 0 |  | 0.8 | V |
| $\mathrm{I}_{\mathrm{OH}}$ | High-level output current |  |  | $-8$ | mA |
| $\mathrm{I}_{\mathrm{OL}}$ | Low-level output current |  |  | 8 | mA |
| $\mathrm{V}_{\mathrm{I}}$ | Input voltage | 0 |  | $\mathrm{V}_{\mathrm{CC}}$ | V |
| $\mathrm{V}_{\mathrm{O}}$ | Output voltage | 0 |  | $\mathrm{V}_{\mathrm{CC}}$ | V |
| $\mathrm{dt} / \mathrm{dv}$ | Input transition rise or fall rate | 0 |  | 15 | ns/V |
| $\mathrm{T}_{\mathrm{A}}$ | Operating free-air temperature | 0 |  | 70 | ${ }^{\circ}\mathrm{C}$ |

electrical characteristics over recommended operating free-air temperature range (unless otherwise noted)


| PARAMETER | TEST CONDITIONS | $\mathrm{V}_{\mathrm{CC}}$ | $\mathrm{T}_{\mathrm{A}}=20^{\circ} \mathrm{C}$ |  |  | MIN | TYP | MAX | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | MIN | TYP | MAX |  |  |  |  |
| $\mathrm{V}_{\mathrm{OH}}$ | $\mathrm{I}_{\mathrm{OH}}=-20 \mu \mathrm{A}$ | 4.5 V | 4.48 |  |  |  |  |  | V |
|  |  | 5.5 V | 5.46 |  |  |  |  |  |  |
|  | $\mathrm{I}_{\mathrm{OH}}=-8 \mathrm{~mA}$ | 4.5 V | 4.15 |  |  | 3.76 |  |  |  |
|  |  | 5.5 V | 4.97 |  |  | 4.76 |  |  |  |
| $\mathrm{V}_{\mathrm{OL}}$ | $\mathrm{I}_{\mathrm{OL}}=20 \mu \mathrm{A}$ | 4.5 V |  |  | 0.014 |  |  |  | V |
|  |  | 5.5 V |  |  | 0.014 |  |  |  |  |
|  | $\mathrm{I}_{\mathrm{OL}}=8 \mathrm{~mA}$ | 4.5 V |  |  | 0.15 |  |  | 0.45 |  |
|  |  | 5.5 V |  |  | 0.13 |  |  | 0.45 |  |
| $\mathrm{I}_{\mathrm{I}}$ | $\mathrm{V}_{\mathrm{I}}=\mathrm{V}_{\mathrm{CC}}$ or 0 | 5.5 V |  |  |  |  |  | $\pm 1$ | $\mu \mathrm{A}$ |
| $\mathrm{I}_{\mathrm{CC}}$ | $\mathrm{V}_{\mathrm{I}}=\mathrm{V}_{\mathrm{CC}}$ or 0 | 5.5 V |  |  | 98 |  |  | 200 | $\mu \mathrm{A}$ |
| $\mathrm{C}_{\mathrm{i}}$ | $\mathrm{V}_{\mathrm{I}}=\mathrm{V}_{\mathrm{CC}}$ or 0 | 5 V |  | 3 |  |  |  |  | pF |
| $\Delta \mathrm{I}_{\mathrm{CC}}{ }^{\dagger}$ | One input at 3.4 V, other inputs at 0 or $\mathrm{V}_{\mathrm{CC}}$ | 5.5 V |  |  |  |  |  | 1 | mA |

[^2]

maximum switching characteristics

| PARAMETER | FROM (INPUT) | TO (OUTPUT) |  |  |  |  |  | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | Y | ZEROUT | DRA | DRB | STKWRN | COUT |  |
| $t_{\mathrm{pd}}$ | $\overline{\mathrm{CC}}$ | 23 |  |  |  |  |  |  |
|  | CLK | $\begin{gathered} 27 \\ 30^{\dagger} \end{gathered}$ | $23^{\dagger}$ | 24 | 16 | 25 |  |  |
|  | DRA15-DRA0 | 23 |  |  |  |  |  | ns |
|  | DRB15-DRB0 | 22 |  |  |  |  |  |  |
|  | MUX2-MUX0 | 22 |  |  |  |  |  |  |
|  | RC2-RC0 | 26 | 18 |  |  |  |  |  |
|  | S2-S0 | 25 |  | 19 |  |  |  |  |
|  | B3-B0 | 19 |  |  |  |  |  |  |
|  | OSEL | 25 |  | 20 |  |  |  |  |
|  | ZEROIN | 25 |  |  |  |  |  |  |
|  | SELDR | 23 |  |  |  |  |  |  |
|  | INC |  |  |  |  |  | 20 |  |
|  | Y |  |  |  |  |  | 16 |  |
| $t_{\mathrm{en}}$ | $\overline{\text { YOE }}$ | 16 |  |  |  |  |  | ns |
|  | $\overline{\text { RAOE }}$ |  |  | 18 |  |  |  |  |
|  | $\overline{\text { RBOE }}$ |  |  |  | 17 |  |  |  |
| $t_{\mathrm{dis}}$ | $\overline{\text { YOE }}$ | 14 |  |  |  |  |  | ns |
|  | $\overline{\text { RAOE }}$ |  |  | 13 |  |  |  |  |
|  | $\overline{\text { RBOE }}$ |  |  |  | 14 |  |  |  |

[^3]

## setup and hold times

| PARAMETER | FROM (INPUT) | TO (OUTPUT) | MIN | MAX | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: |
| $t_{\mathrm{su}}$ | $\overline{\mathrm{CC}}$ | Stack | 15 |  | ns |
|  | DRA15-DRA0 | Stack | 9 |  |  |
|  |  | RCA | 6 |  |  |
|  |  | INT RT | 9 |  |  |
|  | DRB15-DRB0 | RCB | 7 |  |  |
|  |  | INT RT | 11 |  |  |
|  | INC | MPC | 7 |  |  |
|  | $\overline{\text { INT }}$ | Stack | 7 |  |  |
|  | RC2-RC0 | Stack | 15 |  |  |
|  |  | RCA, RCB | 6 |  |  |
|  |  | INT RT | 16 |  |  |
|  | S2-S0 | Stack | 13 |  |  |
|  |  | INT RT | 13 |  |  |
|  | OSEL | Stack | 12 |  |  |
|  |  | INT RT | 13 |  |  |
|  | B3-B0 | Stack | 8 |  |  |
|  |  | INT RT | 14 |  |  |
|  | SELDR | Stack | 10 |  |  |
|  |  | INT RT | 10 |  |  |
|  | ZEROIN | Stack | 14 |  |  |
|  |  | INT RT | 13 |  |  |
|  | Y | MPC | 6 |  |  |
|  | $\overline{\mathrm{RE}}$ | INT RT (CLK) | 7 |  |  |
|  | MUX2-MUX0 | INT RT | 12 |  |  |
| $t_{\mathrm{h}}$ | Any <br> Input | Any <br> Destination | 0 |  | ns |

## clock requirements

|  | PARAMETER | MIN | MAX | UNIT |
| :--- | :--- | :---: | :---: | :---: |
| $t_{\mathrm{w} 1}$ | Pulse duration, clock low | 7 |  | ns |
| $t_{\mathrm{w} 2}$ | Pulse duration, clock high | 9 |  | ns |
| $t_{\mathrm{c}}$ | Clock cycle time | 33 |  | ns |

## Architecture

The 'ACT8818 microsequencer is designed with a 3-port architecture similar to the bipolar SN74AS890 microsequencer. Figure 4 shows the architecture of the 'ACT8818. The device consists of the following principal functional groups:

1. A 16-bit microprogram counter (MPC) consisting of a register and incrementer which generates the next sequential microprogram address
2. Two register/counters (RCA and RCB) for counting loops and iterations, storing branch addresses, or driving external devices
3. A 65-word by 16-bit LIFO stack which allows subroutine calls and interrupts at the microprogram level and is expandable and readable by external hardware
4. An interrupt return register and Y output enable for interrupt processing at the microinstruction level
5. A Y output multiplexer by which the next address can be selected from MPC, RCA, RCB, external buses DRA and DRB, or the stack.

'ACT8818 control signals are summarized in Table 3. The signals that typically originate from the instruction register are the Y output multiplexer controls, MUX2-MUX0, which select the source of the next address; the stack operation controls, S2-S0; the register/counter operation controls, RC2-RC0; OSEL, which allows the stack to be read for diagnostics; the input MUX select, SELDR; the DRA and DRB output enables, $\overline{\text { RAOE }}$ and $\overline{\mathrm{RBOE}}$; and $\overline{\mathrm{INT}}$, used during the first cycle of interrupt service routines to push the address in the interrupt return register onto the stack.

Control and data signals that commonly originate from the microinstruction and from other hardware sources include INC, which determines whether to increment the MPC; DRA and DRB, used to load or read loop counters and/or next addresses; and $\overline{\mathrm{CC}}$, the condition code input. The address being loaded into the MPC is not incremented if INC is low, so programming INC low allows wait states to be implemented. If INC originates from status, repeat-until-flag instructions are possible.

The condition code input $\overline{\mathrm{CC}}$ typically originates from ALU status to permit test and branch instructions. However, it must also be asserted under microprogram control to implement other instructions such as continue or loop. Therefore, $\overline{C C}$ will generally be controlled by the output of a status multiplexer. In this case, whether $\overline{\mathrm{CC}}$ is to be forced high, forced low or taken from ALU status will be determined by a status MUX select field in the microinstruction.
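The status multiplexer arrangement described above can be sketched in Python. This is an illustrative model only, not part of the device: the names `CCSelect` and `condition_code`, and the encoding of the select field, are assumptions made for the example.

```python
from enum import Enum


class CCSelect(Enum):
    """Hypothetical status MUX select field in the microinstruction."""
    FORCE_LOW = 0    # CC-bar driven low under microprogram control
    FORCE_HIGH = 1   # CC-bar driven high under microprogram control
    ALU_STATUS = 2   # CC-bar taken from ALU status


def condition_code(select: CCSelect, alu_status_level: bool) -> bool:
    """Return the logic level presented on CC-bar (True = high)."""
    if select is CCSelect.FORCE_LOW:
        return False
    if select is CCSelect.FORCE_HIGH:
        return True
    # Test-and-branch case: pass the external status level through.
    return alu_status_level
```

With this arrangement a single microinstruction field decides whether an instruction behaves unconditionally (forced level) or as a test and branch (status level).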

Table 3. Response to Control Inputs

| SIGNAL NAME | LOGIC LEVEL |  |
| :---: | :---: | :---: |
|  | HIGH | LOW |
| $\mathrm{B0}^{\dagger}$ | Load stack pointer from 7 least significant bits of DRA | No effect |
| $\mathrm{B1}^{\dagger}$ | Selects DRA contents as stack input (takes priority over $\overline{\mathrm{INT}}$) | No effect |
| $\overline{\mathrm{CC}}$ | Condition code input. May be microcoded or selected from external status results. | Condition code input. For branch operations, low active. |
| INC | Increment address from Y bus and load into MPC | Pass address from Y bus to MPC unincremented. |
| $\overline{\mathrm{INT}}{ }^{\ddagger}$ | Selects MPC as input to stack | Selects interrupt return register as input to stack |
| OSEL | Selects stack as output from DRA output MUX | Selects RCA as output from DRA output MUX |
| MUX2-MUX0 | See Table 4 | See Table 4 |
| $\overline{\text { RAOE }}$ | DRA output disabled (high-Z) | DRA output enabled |
| $\overline{\text { RBOE }}$ | DRB output disabled (high-Z) | DRB output enabled |
| RC2-RC0 | See Table 6 | See Table 6 |
| $\overline{\mathrm{RE}}$ | Hold interrupt return register contents | Load address on Y bus to interrupt return register |
| S2-S0 | See Table 5 | See Table 5 |
| SELDR | Selects DRA/DRB external data as inputs to DRA/DRB buses | Selects RCA (OSEL low) or stack (OSEL high) to DRA bus, RCB to DRB bus |
| $\overline{\text { YOE }}$ | Y output disabled (high-Z) | Y output enabled |
| ZEROIN | Sets ZEROUT to a high externally to set up conditional branch | No effect |

[^4] ${ }^{\ddagger}$ When B1 is low or B1 is not in control mode.

Control signals which may also originate from hardware are B3-B0, which can be used as a 4-bit status input to support 16-way and 32-way branches, and $\overline{\mathrm{YOE}}$, which allows interrupt hardware to force an interrupt vector on the microaddress bus.


Figure 4. 'ACT8818 Functional Block Diagram

Status from the 'ACT8818 is provided by ZEROUT, which is set at the beginning of a cycle in which either of the register/counters will decrement to zero, and STKWRN/RER, set at the beginning of the cycle in which the bottom of stack is read or in which the next to last location is written. In the latter case, STKWRN/RER remains high until the stack pointer is decremented from 64 to 63.

## Y Output Multiplexer

Address selection is controlled by the Y output multiplexer and the $\overline{\mathrm{RAOE}}$ and $\overline{\mathrm{RBOE}}$ enables. Addresses can be selected from eight sources:

1. The microprogram counter register, used for repeat (INC off) and continue (INC on) instructions
2. The stack, which supports subroutine calls and returns as well as iterative loops and returns from interrupts
3. The DRA and DRB ports, which provide two additional paths from external hardware by which microprogram addresses can be generated
4. Register counters RCA and RCB, which can be used for additional address storage
5. B3-B0, whose contents can replace the four least significant bits of the DRA and DRB buses to support 16-way and 32-way branches
6. An external input onto the bidirectional $Y$ port to support external interrupts.

Use of controls MUX2-MUX0 is explained further in the later section on microprogramming the 'ACT8818.

## Microprogram Counter

Based on system status and the current instruction, the microsequencer outputs the next execution address in the microprogram. Usually the incrementer adds one to the address on the Y bus to compute next address plus one. Next address plus one is stored in the microprogram register at the beginning of the subsequent instruction cycle. During the next instruction, this 'continue' address will be ready at the Y output MUX for possible selection as the source of the subsequent instruction. The incrementer thus looks two addresses ahead of the address in the instruction register to set up a continue (increment by one) or repeat (no increment) address.

Selecting INC from status is a convenient means of implementing instructions that must repeat until some condition is satisfied; for example, Shift ALU Until MSB $=1$, or Decrement ALU Until Zero. The MPC is also the standard path to the stack. The next address is pushed onto the stack during a subroutine call, so that the subroutine will return to the instruction following that from which it was called.
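A minimal behavioral sketch of the incrementer follows. It assumes the MPC is selected at the Y output MUX on every cycle and that addresses wrap at 16 bits; the wraparound and the function names are assumptions for illustration, not statements from this data manual.

```python
ADDRESS_MASK = 0xFFFF  # 16-bit microaddress; wraparound is an assumption


def next_mpc(y_address, inc):
    """Value loaded into the MPC register at the next clock edge.

    inc=True models INC high (continue), inc=False models INC low (repeat).
    """
    return (y_address + (1 if inc else 0)) & ADDRESS_MASK


def trace_mpc_selected(start, inc_levels):
    """Y addresses seen on successive cycles when MPC always drives Y."""
    y = start
    trace = []
    for inc in inc_levels:
        y = next_mpc(y, inc)  # MPC register feeds Y on the following cycle
        trace.append(y)
    return trace
```

For example, `trace_mpc_selected(0x0010, [True, True, False, True])` steps through two continues, one repeat, and one further continue.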

## Register/Counters

Addresses or loop counts may be loaded directly into register/counters RCA and RCB through the direct data ports DRA15-DRA0 and DRB15-DRB0. The values stored in these registers may either be held, decremented, or read. Independent control of both the registers during a single cycle is supported with the exception of a simultaneous decrement of both registers.

## Stack

The positive edge clocked 16-bit address stack allows multiple levels of nested calls or interrupts and can be used to support branching and looping. Seven stack operations are possible:

1. Reset, which pulls all $Y$ outputs low and clears the stack pointer and read pointer
2. Clear, which sets the stack pointer and read pointer to zero
3. Pop, which causes the stack pointer to be decremented
4. Push, which puts the contents of the MPC, interrupt return register, or DRA bus onto the stack and increments the stack pointer
5. Read, which makes the address indicated by the read pointer available at the DRA port
6. Hold, which causes the address of the stack and read pointers to remain unchanged
7. Load stack pointer, which inputs the seven least significant bits of DRA to the stack pointer.

## Stack Pointer

The stack pointer (SP) operates as an up/down counter; it increments whenever a push occurs and decrements whenever a pop occurs. Although push and pop are two event operations (store then increment SP, or decrement SP then read), the 'ACT8818 performs both events within a single cycle.

## Read Pointer

The read pointer (RP) is provided as a tool for debugging microcoded systems. It permits a nondestructive, sequential read of the stack contents from the DRA port. This capability provides the user with a method of backtracking through the address sequence to determine the cause of overflow without affecting program flow, the status of the stack pointer, or the internal data of the stack.

## Stack Warning/Read Error Pin

A high signal on the STKWRN/RER pin indicates a potential stack overflow or underflow condition. STKWRN/RER becomes active under two conditions. If 62 of the 65 stack locations (0-64) are full (the stack pointer is at 62) and a push occurs, the STKWRN/RER pin outputs a high signal to warn that the stack is approaching its capacity and will be full after two more pushes.

The STKWRN/RER signal will remain high if hold, push or pop instructions occur, until the stack pointer is decremented to 62. If a push instruction is attempted when the stack is full, the new address will be ignored and the old address in stack location 64 will be retained.

The STKWRN/RER pin will go high when the stack pointer is less than or equal to one and a pop or read from stack is coded on the S2-S0 pins. The pin will go high after reading the next to the bottom stack address (1). When the S2-S0 pins are set to pop or read the last address (0) or to pop or read an empty stack, the STKWRN/RER pin will go high. The pin depends only on the setting of the S2-S0 pins and the stack pointer, not on the clock.
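The warning behavior described in this section can be approximated with a small Python model. It is a sketch under stated assumptions: the stack pointer is taken to equal the number of occupied locations, a pop on an empty stack is assumed to leave the pointer at zero, and the class and method names are hypothetical.

```python
STACK_DEPTH = 65      # locations 0-64
WARN_THRESHOLD = 62   # warning asserted once SP rises above this value


class StackModel:
    def __init__(self):
        self.cells = [0] * STACK_DEPTH
        self.sp = 0  # assumed: number of occupied locations

    def push(self, address):
        if self.sp >= STACK_DEPTH:
            return  # stack full: new address ignored, location 64 retained
        self.cells[self.sp] = address & 0xFFFF  # store, then increment SP
        self.sp += 1

    def pop(self):
        if self.sp == 0:
            return None  # assumed behavior for an empty stack
        self.sp -= 1  # decrement SP, then read
        return self.cells[self.sp]

    def overflow_warning(self):
        """High after a push from SP = 62, until SP returns to 62."""
        return self.sp > WARN_THRESHOLD

    def read_error(self, pop_or_read_coded):
        """High when pop/read is coded on S2-S0 and SP <= 1 (unclocked)."""
        return pop_or_read_coded and self.sp <= 1
```

In this model the overflow side depends only on the stack pointer value, while the underflow side also depends on the operation coded on S2-S0, mirroring the two conditions given in the text.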

Unlike the MPC register, which normally gets next address plus one, the interrupt return register simply gets next address. This permits interrupts to be serviced with zero latency, since the interrupt vector replaces the pending address.

The interrupting hardware disables the Y output and forces the vector onto the microaddress bus. This event must be synchronized with the system clock. The first address of the service routine must program $\overline{\mathrm{INT}}$ low and perform a push to put the contents of the interrupt return register on the stack.

## Microprogramming the 'ACT8818

Microprogramming is unlike programming monolithic processors for several reasons. First, the width of the microinstruction word is only partially constrained by the basic signals required to control the sequencer. Since the main advantage of a microprogrammed processor is speed, many operations are often supported by or carried out in special purpose hardware. Lookup tables, extra registers, address generators, elastic memories, and data acquisition circuits may also be controlled by the microinstruction.

The number of slices in a bit-slice ALU is user-defined, which makes the microinstruction width even more application dependent. Types of instructions resulting from manipulation of the sequencer controls are discussed below. Examples of some commonly used instructions can be found in the later section on microinstructions and flow diagrams. The following abbreviations are used in the tables in this section:

| ABBREVIATION | DEFINITION |
| :---: | :--- |
| BR A | Y ← DRA |
| BR A' | Y ← DRA' |
| BR B | Y ← DRB |
| BR B' | Y ← DRB' |
| BR S | Y ← STK |
| CALL A | Y ← DRA; STK ← MPC; SP ← SP + 1; RP ← RP + 1 |
| CALL B | Y ← DRB; STK ← MPC; SP ← SP + 1; RP ← RP + 1 |
| CALL A' | Y ← DRA'; STK ← MPC; SP ← SP + 1; RP ← RP + 1 |
| CALL B' | Y ← DRB'; STK ← MPC; SP ← SP + 1; RP ← RP + 1 |
| CALL S | Y ← STK; STK ← MPC; SP ← SP + 1; RP ← RP + 1 |
| CLR SP, RP | SP ← 0; RP points to TOS register |
| CONT/RPT | Y ← MPC + 1 if INC = H; Y ← MPC if INC = L |
| DRA | Bidirectional data port (can be loaded externally or from RCA) |
| DRA' | DRA15-DRA4::B3-B0 |
| DRB | Bidirectional data port (can be loaded externally or from RCB) |
| DRB' | DRB15-DRB4::B3-B0 |
| MPC | Microprogram counter |
| POP | SP ← SP − 1; RP ← RP − 1 |
| PUSH | STK ← operand; SP ← SP + 1; RP ← RP + 1 |
| RCA | Register/counter A |
| RCB | Register/counter B |
| READ | DRA ← STK; RP ← RP − 1; SP ← SP − 1 |
| RESET | Y ← 0; SP ← 0; RP points to TOS register |
| RP | Read pointer |
| SP | Stack pointer |
| STK | Stack |

## Address Selection

Y-output multiplexer controls MUX2-MUX0 select one of eight 3-source branches as shown in Table 4. The states of $\overline{\mathrm{CC}}$ and ZERO determine which of the three sources is selected as the next address. ZERO is set at the beginning of any cycle in which a register/counter will decrement to zero. This applies to both internal ZERO and external ZEROUT signals.

Table 4. Output Controls (MUX2-MUX0)

| MUX2- <br> MUX0 | RESET | Y OUTPUT SOURCE |  |  |
| :---: | :---: | :---: | :---: | :---: |
|  |  | $\overline{\mathrm{CC}}=\mathrm{L}$ |  | $\overline{\mathrm{CC}}=\mathrm{H}$ |
|  |  | ZERO = L | ZERO = H |  |
| XXX | Yes | All Low | All Low | All Low |
| LLL | No | STK | MPC | DRA |
| LLH | No | STK | MPC | DRB |
| LHL | No | STK | DRA | MPC |
| LHH | No | STK | DRB | MPC |
| HLL | No | DRA | MPC | DRB |
| HLH | No | DRA'${ }^{\dagger}$ | MPC | DRB'${ }^{\ddagger}$ |
| HHL | No | DRA | STK | MPC |
| HHH | No | DRB | STK | MPC |

${ }^{\dagger}$ DRA15-DRA4::B3-B0
${ }^{\ddagger}$ DRB15-DRB4::B3-B0

By programming $\overline{\mathrm{CC}}$ high or low without decrementing registers, only one outcome is possible; thus, unconditional branches or continues can be implemented by forcing the condition code. Alternatively, $\overline{\mathrm{CC}}$ can be selected from status, in which case Branch A on Condition Code Else Branch B instructions are possible, where A and B are the address sources determined by MUX2-MUX0.

Decrement and Branch on Nonzero instructions, creating loops that repeat until a terminal count is reached, can be implemented by programming $\overline{\mathrm{CC}}$ low and decrementing a register/counter. If $\overline{\mathrm{CC}}$ is selected from status and registers are decremented, more complex instructions such as Exit on Condition Code or End of Loop are possible.

When MUX2-MUX0 = HLH, the B3-B0 inputs can replace the four least significant bits of DRA or DRB to create 16-way branches or, when $\overline{\mathrm{CC}}$ is based on status, to create 32-way branches.
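The selection rules of Table 4 can be expressed as a lookup. The following Python sketch is illustrative only; the function names and the string labels for sources are assumptions, and the reset row of Table 4 is not modeled.

```python
# Table 4 entries as (CC-bar low and ZERO low, CC-bar low and ZERO high, CC-bar high)
Y_SOURCE = {
    "LLL": ("STK", "MPC", "DRA"),
    "LLH": ("STK", "MPC", "DRB"),
    "LHL": ("STK", "DRA", "MPC"),
    "LHH": ("STK", "DRB", "MPC"),
    "HLL": ("DRA", "MPC", "DRB"),
    "HLH": ("DRA'", "MPC", "DRB'"),
    "HHL": ("DRA", "STK", "MPC"),
    "HHH": ("DRB", "STK", "MPC"),
}


def y_output_source(mux, cc_bar_high, zero_high):
    """Return the Y output source for a MUX2-MUX0 code such as 'HLL'."""
    when_zero_low, when_zero_high, when_cc_high = Y_SOURCE[mux]
    if cc_bar_high:
        return when_cc_high  # ZERO is not examined when CC-bar is high
    return when_zero_high if zero_high else when_zero_low


def sixteen_way_address(dr_bus, b_inputs):
    """Form DRA' or DRB': bits 15-4 from the bus, bits 3-0 from B3-B0."""
    return (dr_bus & 0xFFF0) | (b_inputs & 0x000F)
```

For instance, with MUX2-MUX0 = HLH and $\overline{\mathrm{CC}}$ taken from status, the two possible bases (DRA and DRB) combined with the sixteen B3-B0 values give the 32 branch targets mentioned above.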

## Stack Controls

As in the case of the MUX controls, each stack-control coding is a three-way choice based on $\overline{\mathrm{CC}}$ and ZERO (see Table 5). This allows push, pop, or hold stack operations to occur in parallel with the aforementioned branches. A subroutine call is accomplished by combining a branch and push, while returns result from coding a branch to stack with a pop.

Table 5. Stack Controls (S2-S0)

| S2-S0 | OSEL | STACK OPERATION |  |  |
| :---: | :---: | :---: | :---: | :---: |
|  |  | $\overline{\mathrm{CC}}=\mathrm{L}$ |  | $\overline{\mathrm{CC}}=\mathrm{H}$ |
|  |  | ZERO = L | ZERO = H |  |
| LLL | X | Reset/Clear | Reset/Clear | Reset/Clear |
| LLH | X | Clear SP/RP | Hold | Hold |
| LHL | X | Hold | Pop | Pop |
| LHH | X | Pop | Hold | Hold |
| HLL | X | Hold | Push | Push |
| HLH | X | Push | Hold | Hold |
| HHL | X | Push | Hold | Push |
| HHH | H | Read | Read | Read |
| HHH | L | Hold | Hold | Hold |

Combining stack and MUX controls with status results and register decrements permits even greater complexity. For example: Return on Condition Code or End of Loop; Call A on Condition Code Else Branch to B; Decrement and Return on Nonzero; Call 16-Way.

Diagnostic stack dumps are possible using Read (S2-S0 = HHH) when OSEL is set high.
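Table 5 can be captured in the same lookup style. The sketch below is a hypothetical helper, not a description of the device internals; the operation labels and the function name are chosen for illustration.

```python
# Table 5 entries as (CC-bar low and ZERO low, CC-bar low and ZERO high, CC-bar high)
STACK_OP = {
    "LLL": ("RESET/CLEAR", "RESET/CLEAR", "RESET/CLEAR"),
    "LLH": ("CLEAR SP/RP", "HOLD", "HOLD"),
    "LHL": ("HOLD", "POP", "POP"),
    "LHH": ("POP", "HOLD", "HOLD"),
    "HLL": ("HOLD", "PUSH", "PUSH"),
    "HLH": ("PUSH", "HOLD", "HOLD"),
    "HHL": ("PUSH", "HOLD", "PUSH"),
}


def stack_operation(s_code, osel_high, cc_bar_high, zero_high):
    """Return the stack operation for an S2-S0 code such as 'HLH'."""
    if s_code == "HHH":
        # Only this code depends on OSEL: read when high, hold when low.
        return "READ" if osel_high else "HOLD"
    when_zero_low, when_zero_high, when_cc_high = STACK_OP[s_code]
    if cc_bar_high:
        return when_cc_high
    return when_zero_high if zero_high else when_zero_low
```

Evaluating this function alongside the Y output selection for the same $\overline{\mathrm{CC}}$ and ZERO states shows which branch and stack operation pair a given microinstruction produces.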

## Register Controls

Unlike stack and MUX controls, register control is not dependent upon $\overline{\mathrm{CC}}$ and ZERO. Registers can be independently loaded, decremented, or held using register control inputs RC2-RC0 (see Table 6). All combinations are supported with the exception of simultaneous register decrements. The register control inputs can be set to store branch addresses and loop counts or to decrement loop counts, facilitating the complex branching instructions described above.

Table 6. Register Controls (RC2-RC0)

| RC2-RC0 | REGISTER OPERATIONS |  |
| :---: | :---: | :---: |
|  | REG A | REG B |
| LLL | Hold | Hold |
| LLH | Decrement | Hold |
| LHL | Load | Hold |
| LHH | Decrement | Load |
| HLL | Load | Load |
| HLH | Hold | Decrement |
| HHL | Hold | Load |
| HHH | Load | Decrement |

The contents of RCA are accessible to the DRA port when OSEL is low and the output bus is enabled by $\overline{\operatorname{RAOE}}$ being low. Data from RCB is available when DRB is enabled by $\overline{\mathrm{RBOE}}$ being low.
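As a cross-check of Table 6, the following Python sketch searches the table for the RC2-RC0 code that yields a requested pair of operations. The function name and the operation labels are assumptions made for the example.

```python
# Table 6 entries as (operation on register A, operation on register B)
REGISTER_OPS = {
    "LLL": ("HOLD", "HOLD"),
    "LLH": ("DECREMENT", "HOLD"),
    "LHL": ("LOAD", "HOLD"),
    "LHH": ("DECREMENT", "LOAD"),
    "HLL": ("LOAD", "LOAD"),
    "HLH": ("HOLD", "DECREMENT"),
    "HHL": ("HOLD", "LOAD"),
    "HHH": ("LOAD", "DECREMENT"),
}


def encode_register_controls(reg_a_op, reg_b_op):
    """Return the RC2-RC0 code for the requested pair, or None if unsupported."""
    for code, ops in REGISTER_OPS.items():
        if ops == (reg_a_op, reg_b_op):
            return code
    return None
```

Of the nine possible pairs of hold, load, and decrement, only the simultaneous decrement of both registers has no encoding, which matches the exception stated above.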

## Continue/Repeat Instructions

The most commonly used instruction is a continue, implemented by selecting MPC at the Y output MUX and setting INC high. If MPC is selected and INC is off, the current instruction will simply be repeated.

A repeat instruction can be implemented in two ways. A programmed repeat (INC forced low) may be useful in generating wait states, for example, wait for interrupt. A conditional repeat (INC originates from status) may be useful in implementing Do While operations. Several bit patterns in the MUX control field of the microinstruction will place MPC on the microaddress bus.

## Branch Instructions

A branch or jump to a given microaddress can also be coded several ways. RCA, DRA, RCB, DRB, and STK are possible sources for branch addresses (see Table 4). Branches to register or stack are useful whenever the branch address could be stored to reduce overhead.

The simplest branches are to DRA and DRB, since they require only one cycle and the branch address is supplied in the microinstruction. Use of registers or stack requires an initial load cycle (which may be combined with a preceding instruction), but may be more practical when an entry point is referenced over and over throughout the microprogram, for example, in error-handling routines. Branches to stack or register also enhance sequencing techniques in which a branch address is dynamically computed or multiple branches to a common entry point are used, but the entry point varies according to the system state. In this case, the state change might require reloading the stack or register.

In order to force a branch to DRA or DRB, $\overline{\mathrm{CC}}$ must be programmed high or low. A branch to stack is only possible when $\overline{C C}$ is forced low (see Table 4).

When $\overline{\mathrm{CC}}$ is low, the ZERO flag is tested, and if a register decrements to zero the branch will be transformed into a Decrement and Branch on Nonzero instruction. Therefore, registers should not be decremented during branch instructions using $\overline{\mathrm{CC}}=0$ unless it is certain the register will not reach terminal count. Call (Branch and Push MPC) instructions and Return (Branch to Stack and Pop) instructions are discussed in later sections.

## Conditional Branch Instructions

Perhaps the most useful of all branches is the conditional branch. The 'ACT8818 permits three modes of conditional branching: Branch on Condition Code; Branch 16-Way from DRA or DRB; and Branch on Condition Code 16-Way from DRA Else Branch 16-Way from DRB. This increases the versatility of the system and the speed of processing status tests because both single-bit and 4-bit status are allowed.

Testing single bit status is preferred when the status can be set up and selected through a status MUX prior to the conditional branch. Four-bit status allows the 'ACT8818 to process instructions based on Boolean status expressions, such as Branch if Overflow and Not Carry if Zero or if Negative. It also permits true n-way branches, such as If Negative then Branch to X, Else if Overflow, and Not Carry then Branch to Y. The tradeoff is speed versus program size. Since multiway branching occurs relatively infrequently in most programs, users will enjoy increased speed at a negligible cost. Call (Branch and Push MPC) instructions and Return (Branch to Stack and Pop) instructions are discussed in later sections.

## Loop Instructions

Up to two levels of nested loops are possible when both counters are used simultaneously. Loop count and levels of nesting can be increased by adding external counters if desired. The simplest and most widely used of the loop instructions is Decrement and Branch on Nonzero, in which $\overline{\mathrm{CC}}$ is forced low while a register is decremented. As before, many forms are possible, since the top-of-loop address can originate from RCA, DRA, RCB, DRB, or the stack (see Table 4). Upon terminal count, instruction flow can either drop out of the bottom of the loop or branch elsewhere.

When loops are used in conjunction with $\overline{\mathrm{CC}}$ as status, B3-BO as status and/or stack manipulation, many useful instructions are possible, including Decrement and Branch on Nonzero else Return, Decrement and Call on Nonzero, and Decrement and Branch 16-Way on Nonzero. Possible variations are summarized in Table 7. Call (Branch and Push MPC) instructions and Return (Branch to Stack and Pop) instructions are discussed in later sections.

Another level of complexity is possible if $\overline{C C}$ is selected from status while looping. This type of loop will exit either because $\overline{\mathrm{CC}}$ is true or because a terminal count has been reached. This makes it possible, for example, to search the ALU for a bit string. If the string is found, the match forces $\overline{\mathrm{CC}}$ high. However, if no match is found, it is necessary to terminate the process when the entire word has been scanned. This complex process can then be implemented in a simple compact loop using Conditional Decrement and Branch on Nonzero.
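The two exit paths of such a loop can be illustrated with a short Python sketch. It assumes a one-instruction loop body, a counter loaded with a value of at least one, and ZERO going high on the pass in which the counter holds one (that is, the pass in which it would decrement to zero). These modeling choices and the function name are assumptions, not device specifications.

```python
def conditional_loop_exit(initial_count, match_on_pass=None):
    """Return (passes_executed, exit_reason) for a conditional loop.

    match_on_pass is the pass number (1-based) on which status forces
    CC-bar high, or None if no match ever occurs.
    """
    if initial_count < 1:
        raise ValueError("model assumes a count of at least one")
    count = initial_count
    passes = 0
    while True:
        passes += 1
        if match_on_pass == passes:
            return passes, "condition"       # CC-bar high: ZERO not examined
        if count == 1:
            return passes, "terminal count"  # ZERO high: fall out of the loop
        count -= 1                           # ZERO low: branch to top of loop
```

In the bit-string search example, `match_on_pass` would correspond to the pass on which the match is detected, and the count bounds the search to the length of the word.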

Table 7. Decrement and Branch on Nonzero Encodings

| MUX2- <br> MUX0 | S2-S0 | OSEL | $\overline{\mathbf{C C}}=\mathbf{L}$ |  | $\overline{\mathbf{C C}}=\mathrm{H}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | ZERO $=$ L | ZERO $=\mathrm{H}$ |  |
| LLL | LLH | X | BR S: CLR SP/RP | CONT/RPT | BR A |
| LLL | LHL | X | BR S | CONT/RPT: POP | BR A: POP |
| LLL | HLL | $x$ | BR S | CONT/RPT: PUSH | CALL A |
| LLL | HHH | 0 | BR S | CONT/RPT | BR A |
| LLL | HHH | 1 | BR S: READ | CONT/RPT: READ | BR A: READ |
| LLH | LLH | $x$ | BR S: CLR SP/RP | CONT/RPT | BR B |
| LLH | LHL | x | BR S | CONT/RPT: POP | BR B: POP |
| LLH | HLL | X | BR S | CONT/RPT: PUSH | CALL B |
| LLH | HHH | 0 | BR S | CONT/RPT | BR B |
| LLH | HHH | 1 | BR S: READ | CONT/RPT: READ | BR B: READ |
| LHL | LLH | x | BR S: CLR SP/RP | BR A | CONT/RPT |
| LHL | LHL | X | BR S | BR A: POP | CONT/RPT: POP |
| LHL | HLL | x | BR S | CALL A | CONT/RPT: PUSH |
| LHL | HHH | 0 | BR S | BR A | CONT/RPT |
| LHL | HHH | 1 | BR S: READ | BR A: READ | CONT/RPT: READ |
| LHH | LLH | x | BR S: CLR SP/RP | BR B | CONT/RPT |
| LHH | LHL | X | BR S | BR B: POP | CONT/RPT: POP |
| LHH | HLL | x | BR S | CALL B | CONT/RPT: PUSH |
| LHH | HHH | 0 | BR S | BR B | CONT/RPT |
| LHH | HHH | 1 | BR S: READ | BR B: READ | CONT/RPT: READ |
| HLL | LLH | X | BR A: CLR SP/RP | CONT/RPT | BR B |
| HLL | LHL | x | BR A | CONT/RPT: POP | BR B: POP |
| HLL | LHH | x | BR A: POP | CONT/RPT | BR B |
| HLL | HLL | $x$ | BR A | CONT/RPT: PUSH | CALL B |
| HLL | HHH | 0 | BR A | CONT/RPT | BR B |
| HLL | HHH | 1 | BR A: READ | CONT/RPT: READ | BR B: READ |
| HLH | LLH | X | BR A' (16-way): CLR SP/RP | CONT/RPT | BR B' (16-way) |
| HLH | LHL | X | BR A' (16-way) | CONT/RPT: POP | BR B' (16-way): POP |
| HLH | LHH | X | BR A' (16-way): POP | CONT/RPT | BR B' (16-way) |
| HLH | HLL | X | BR A' (16-way) | CONT/RPT: PUSH | CALL B' (16-way) |
| HLH | HHH | 0 | BR A' (16-way) | CONT/RPT | BR B' (16-way) |
| HLH | HHH | 1 | BR A' (16-way): READ | CONT/RPT: READ | BR B' (16-way): READ |
| HHL | LLH | X | BR A: CLR SP/RP | BR S | CONT/RPT |
| HHL | LHL | X | BR A | RET (BR S: POP) | CONT/RPT: POP |
| HHL | LHH | $x$ | BR A: POP | BR S | CONT/RPT |
| HHL | HLL | x | BR A | CALL S | CONT/RPT: PUSH |
| HHL | HHH | 0 | BR A | BR S | CONT/RPT |
| HHL | HHH | 1 | BR A: READ | BR S: READ | CONT/RPT: READ |

Table 7. Decrement and Branch on Nonzero Encodings (Continued)

| MUX2- <br> MUX0 | S2-S0 | OSEL | $\overline{\mathbf{C C}}=\mathrm{L}$ |  | $\overline{\mathrm{CC}}=\mathrm{H}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | ZERO $=\mathrm{L}$ | ZERO $=\mathrm{H}$ |  |
| HHH | LLH | $x$ | BR B: CLR SP/RP | BR S | CONT/RPT |
| HHH | LHL | $x$ | BR B | RET | CONT/RPT: POP |
| HHH | LHH | $x$ | BR B: POP | BR S | CONT/RPT |
| HHH | HLL | $x$ | BR B | CALL S | CONT/RPT: PUSH |
| HHH | HHH | 0 | BR B | BR S | CONT/RPT |
| HHH | HHH | 1 | BR B: READ | BR S: READ | CONT/RPT: READ |

## Subroutine Calls

The various branch instructions described above can be merged with a push instruction to implement subroutine calls in a single cycle. Calls, conditional calls, and Decrement and Call on Nonzero are the most obvious.

Since a push is conditional on $\overline{\mathrm{CC}}$ and ZERO, many hybrid instructions are also possible, such as Call X on Condition Code Else Branch, or Decrement and Return on Nonzero Else Branch. Codes that cause subroutine calls are summarized in Tables 8 and 9.

Table 8. Call Encodings without Register Decrements

| MUX2-MUX0 | S2-S0 | OSEL | $\overline{\mathrm{CC}}=\mathrm{L}$ (ZERO $=\mathrm{L}$) | $\overline{\mathrm{CC}}=\mathrm{H}$ |
| :---: | :---: | :---: | :---: | :---: |
| LLL | HLH | X | CALL S | BR A |
| LLL | HHL | X | CALL S | CALL A |
| LLH | HLH | X | CALL S | BR B |
| LLH | HHL | x | CALL S | CALL B |
| LHL | HLH | X | CALL S | CONT/RPT |
| LHL | HHL | X | CALL S | CONT/RPT: PUSH |
| LHH | HLH | X | CALL S | CONT/RPT |
| LHH | HHL | X | CALL S | CONT/RPT: PUSH |
| HLL | HLH | X | CALL A | BR B |
| HLL | HHL | X | CALL A | CALL B |
| HLH | HLH | X | CALL A' (16-way) | BR B' (16-way) |
| HLH | HHL | X | CALL A' (16-way) | CALL B' (16-way) |
| HHL | HLH | x | CALL A | CONT/RPT |
| HHL | HHL | X | CALL A | CONT/RPT: PUSH |
| HHH | HLH | X | CALL B | CONT/RPT |
| HHH | HHL | X | CALL B | CONT/RPT: PUSH |

Table 9. Call Encodings with Register Decrements

| MUX2-MUX0 | S2-S0 | OSEL | $\overline{\mathrm{CC}}=\mathrm{L}$ |  | $\overline{\mathrm{CC}}=\mathrm{H}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | ZERO $=\mathrm{L}$ | ZERO $=\mathrm{H}$ |  |
| LLL | HLH | X | CALL S | CONT/RPT | BR A |
| LLL | HHL | X | CALL S | CONT/RPT | CALL A |
| LLH | HLH | $x$ | CALL S | CONT/RPT | BR B |
| LLH | HHL | $x$ | CALL S | CONT/RPT | CALL B |
| LHL | HLH | X | CALL S | BR A | CONT/RPT |
| LHL | HHL | X | CALL S | BR A | CONT/RPT: PUSH |
| LHH | HLH | $x$ | CALL S | BR B | CONT/RPT |
| LHH | HHL | X | CALL S | BR B | CONT/RPT: PUSH |
| HLL | HLH | x | CALL A | CONT/RPT | BR B |
| HLL | HHL | X | CALL A | CONT/RPT | CALL B |
| HLH | HLH | X | CALL A' (16-way) | CONT/RPT | BR B' (16-way) |
| HLH | HHL | X | CALL A' (16-way) | CONT/RPT | CALL B' (16-way) |
| HHL | HLH | X | CALL A | BR S | CONT/RPT |
| HHL | HHL | X | CALL A | BR S | CONT/RPT: PUSH |
| HHH | HLH | X | CALL B | BR S | CONT/RPT |
| HHH | HHL | X | CALL B | BR S | CONT/RPT: PUSH |
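The column selection used in Table 9 can be sketched as a small lookup. This is only an illustration, not TI code: the dictionary holds four rows copied from Table 9, and the names `TABLE9_ROWS` and `select_action` are hypothetical.

```python
# Hypothetical sketch: pick the action column of a Table 9 row from the
# levels of CC-bar and ZERO. Only four rows of Table 9 are copied here;
# keys are (MUX2-MUX0, S2-S0) in H/L notation.
TABLE9_ROWS = {
    ("LLL", "HHL"): ("CALL S", "CONT/RPT", "CALL A"),
    ("LHL", "HLH"): ("CALL S", "BR A", "CONT/RPT"),
    ("HLL", "HHL"): ("CALL A", "CONT/RPT", "CALL B"),
    ("HHH", "HHL"): ("CALL B", "BR S", "CONT/RPT: PUSH"),
}


def select_action(mux: str, s: str, cc_bar_high: bool, zero_high: bool) -> str:
    """Return the action of one table row for the given CC-bar and ZERO levels."""
    cc_low_zero_low, cc_low_zero_high, cc_high = TABLE9_ROWS[(mux, s)]
    if cc_bar_high:
        # The table has a single CC-bar = H column, so ZERO is not consulted.
        return cc_high
    return cc_low_zero_high if zero_high else cc_low_zero_low


print(select_action("HLL", "HHL", cc_bar_high=False, zero_high=False))
```

With the HLL/HHL row, the three outcomes read as Call A when the counter is nonzero, Continue when it reaches zero, and Call B when $\overline{\mathrm{CC}}$ is high.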

## Subroutine Returns

A return from subroutine can be implemented by coding a branch to stack with a pop. Since pop is also conditional on $\overline{C C}$ and ZERO, the complex forms discussed previously also apply to return instructions: Decrement and Return on Nonzero; Return on Condition Code; Branch on Condition Code Else Return. Return encodings are summarized in Tables 10 and 11.

Table 10. Return Encodings without Register Decrements

| MUX2-MUX0 | S2-S0 | OSEL | $\overline{\text { CC }}=\mathbf{L}$ | $\overline{\mathbf{C C}}=\mathbf{H}$ |
| :---: | :---: | :---: | :---: | :--- |
| LLL | LHH | X | RET | BR A |
| LLH | LHH | X | RET | BR B |
| LHL | LHH | X | RET | CONT/RPT |
| LHH | LHH | $X$ | RET | CONT/RPT |

Table 11. Return Encodings with Register Decrements

| MUX2-MUX0 | S2-S0 | OSEL | $\overline{\mathrm{CC}}=\mathrm{L}$ |  | $\overline{\mathrm{CC}}=\mathrm{H}$ |
| :---: | :---: | :---: | :--- | :--- | :--- |
|  |  |  | ZERO $=\mathrm{L}$ | ZERO $=\mathrm{H}$ |  |
| LLL | LHH | X | RET | CONT/RPT | BR A |
| LLH | LHH | X | RET | CONT/RPT | BR B |
| LHL | LHH | X | RET | BR A | CONT/RPT |
| LHH | LHH | X | RET | BR B | CONT/RPT |
| HHL | LHL | X | BR A | RET | CONT/RPT: POP |
| HHH | LHL | X | BR B | RET | CONT/RPT: POP |

## Reset

Pulling the S2-S0 pins low clears the stack and read pointers and zeroes the Y output multiplexer (see Table 5).

## Clear Pointers

The stack and read pointers may be cleared without affecting the Y output multiplexer by setting S2-S0 to LLH and forcing $\overline{\mathrm{CC}}$ low (see Table 5).

## Read Stack

Driving all of the stack inputs (S2-S0) and OSEL high places the 'ACT8818 into the read mode. At each low-to-high clock transition, the address pointed to by the read pointer is available at the DRA port and the read pointer is decremented. The bottom of the stack is detected by monitoring the stack warning/read error pin (STKWRN/RER). A high appears on the STKWRN/RER output when the stack contains one word and a read instruction is applied to the S2-S0 pins. This signifies that the last address has been read.

The stack pointer and stack contents are unaffected by the read operation. Under normal push and pop operations, the read pointer is updated with the stack pointer and contains identical information.
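The read behavior described above can be pictured with a toy software model. It is a sketch under stated assumptions, not a cycle-accurate description of the device: the class name `StackReadModel` and its methods are hypothetical, and the warning flag is assumed to be raised on the read that returns the bottom word.

```python
class StackReadModel:
    """Toy model of the stack read mode (illustrative only)."""

    def __init__(self):
        self.words = []        # stack contents, bottom word first
        self.read_pointer = 0  # number of words not yet read out

    def push(self, address):
        self.words.append(address)
        # Push and pop keep the read pointer equal to the stack pointer.
        self.read_pointer = len(self.words)

    def pop(self):
        address = self.words.pop()
        self.read_pointer = len(self.words)
        return address

    def read(self):
        """Return (address, warning) and step the read pointer down by one."""
        if self.read_pointer == 0:
            raise IndexError("nothing left to read")
        warning = self.read_pointer == 1  # assumed: flag raised on the bottom word
        address = self.words[self.read_pointer - 1]
        self.read_pointer -= 1  # stack pointer and contents stay untouched
        return address, warning


model = StackReadModel()
for addr in (0x10, 0x20, 0x30):
    model.push(addr)
print([model.read() for _ in range(3)])
```

After the three reads the stack still holds all three addresses; only the read pointer has moved.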

## Interrupts

Real-time vectored interrupt routines are supported for those applications where polling would impede system throughput. Any instruction, including pushes and pops, may be interrupted. To process an interrupt, the following procedure should be followed:

1. Place the bidirectional $Y$ bus into a high-impedance state by forcing $\overline{Y O E}$ high.
2. Force the interrupt entry point vector onto the $Y$ bus. INC should be high.
3. Push the current value in the Interrupt Return register on the stack as the execution address to return to when interrupt handling is complete.

The first instruction of the interrupt routine must push the address stored in the interrupt return register onto the stack so that proper return linkage is maintained. This is accomplished by setting $\overline{\mathrm{INT}}$ and B1 low and coding a push on the stack.

## Sample Microinstructions for the 'ACT8818

Representative examples of instructions using the 'ACT8818 are given below. The examples assume a one-level pipeline system, in which the address and contents of the next instruction are being fetched while the current instruction is being executed, and an ALU status register contains the status results of the previous instruction.

Since the incrementer looks two addresses ahead of the address in the instruction register to set up some instructions such as continue or repeat, a set-up instruction has been included with each example. This shows the required state of both INC and $\overline{\mathrm{CC}}$. $\overline{\mathrm{CC}}$ must be set up early because the status register on which Y-output selection is typically based contains the results of the previous instruction.

Flow diagrams and suggested code for the sample microinstructions are also given below. Numbers inside the circles are microword address locations expressed as hexadecimal numbers. Fields in microinstructions are binary numbers except for inputs on DRA or DRB, which are also in hexadecimal. For a discussion of sequencing instructions, see the preceding section on microprogramming.

## Continue

To Continue (Instruction 10), INC and $\overline{\mathrm{CC}}$ must be programmed high one cycle ahead of instruction 10 for pipelining.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue | 110 | 111 | XXX | 0 | X | X | XXXX | XXXX |
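The sample tables give field values as binary digits, whereas the encoding tables use H and L. A minimal sketch of the conversion, assuming 1 corresponds to H and 0 to L (the helper name `to_levels` is hypothetical), makes it easier to match a sample row against an encoding table:

```python
def to_levels(bits: str) -> str:
    """Map a binary field such as '110' to level notation such as 'HHL'."""
    mapping = {"1": "H", "0": "L", "X": "X"}  # X stays a don't-care
    return "".join(mapping[b] for b in bits.upper())


# Continue example, Instruction 10: MUX2-MUX0 = 110, S2-S0 = 111
print(to_levels("110"), to_levels("111"))  # prints "HHL HHH"
```

For the Continue example this gives HHL and HHH with OSEL = 0, a row of Table 7 whose $\overline{\mathrm{CC}}=\mathrm{H}$ column reads CONT/RPT.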

## Continue and Pop

To Continue and decrement the stack pointer (Pop), INC and $\overline{\mathrm{CC}}$ are forced high in the previous instruction.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Pop | 110 | 010 | XXX | X | X | X | XXXX | XXXX |

## Continue and Push

To Continue and push the microprogram counter onto the stack (Push), INC and $\overline{C C}$ are forced high one cycle ahead of Instruction 10 for pipelining.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :--- | :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Push | 110 | 100 | XXX | 0 | X | X | XXXX | XXXX |



Figure 5. Continue


Figure 6. Continue and Pop


Figure 7. Continue and Push

## Branch (Example 1)

To Branch from address 10 to address 20, $\overline{\mathrm{CC}}$ must be programmed high one cycle ahead of Instruction 10 for pipelining.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | X | XXXX | XXXX |
| 10 | BR A | 000 | 111 | XXX | 0 | X | X | 0020 | XXXX |

## Branch (Example 2)

To Branch from address 10 to address 20, $\overline{\mathrm{CC}}$ is programmed low in the previous instruction; as a result, a ZERO test follows the condition code test in Instruction 10. To ensure that a ZERO $=\mathrm{H}$ condition will not occur, registers should not be decremented during this instruction.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 0 | X | XXXX | XXXX |
| 10 | BR A | 110 | 111 | 000 | 0 | X | X | 0020 | XXXX |

## Sixteen-Way Branch

To Branch 16-Way, $\overline{\mathrm{CC}}$ is programmed high in the previous instruction. The branch address is derived from the concatenation DRB15-DRB4::B3-B0.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | X | XXXX | XXXX |
| 10 | BR B' | 101 | 111 | XXX | 0 | X | X | XXXX | 0040 |
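The concatenation DRB15-DRB4::B3-B0 amounts to replacing the low four bits of the DRB value with the B3-B0 inputs. A minimal sketch, with the hypothetical function name `sixteen_way_target` and the example's DRB value of 0040 (hex):

```python
def sixteen_way_target(drb: int, b: int) -> int:
    """Form DRB15-DRB4::B3-B0 from a 16-bit DRB value and a 4-bit B3-B0 value."""
    if not 0 <= drb <= 0xFFFF:
        raise ValueError("DRB must fit in 16 bits")
    if not 0 <= b <= 0xF:
        raise ValueError("B3-B0 must fit in 4 bits")
    # Keep the 12 most significant bits of DRB, take the low nibble from B3-B0.
    return (drb & 0xFFF0) | b


# DRB = 0040 (hex) as in the table above; B3-B0 = 0101 is an arbitrary example.
print(f"{sixteen_way_target(0x0040, 0b0101):04X}")  # prints "0045"
```

Stepping B3-B0 through its 16 values reaches addresses 0040 through 004F, which is where the 16-way fan-out comes from.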



Figure 8. Branch Example 1


Figure 10. Sixteen-Way Branch

## Conditional Branch

To Branch to address 20 Else Continue to address 11, INC is set high in the preceding instruction to set up the Continue.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | X | 1 | XXXX | XXXX |
| 10 | BR A else Continue | 110 | 111 | 000 | 0 | X | X | 0020 | XXXX |

## Three－Way Branch

To Branch 3-Way, this example uses an instruction from Table 7 with BR A in the ZERO $=\mathrm{L}$ column, CONT/RPT in the ZERO $=\mathrm{H}$ column, and BR B in the $\overline{\mathrm{CC}}=\mathrm{H}$ column. To enable the ZERO $=\mathrm{H}$ path, register A must decrement to zero during this instruction (see Table 6 for possible register operations). INC is programmed high in Instruction 10 to set up the Continue.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue and Load Reg A | 110 | 111 | 010 | 0 | $\dagger$ | 1 | XXXX | XXXX |
| 11 | Decrement Reg A; Branch 3-Way | 100 | 111 | 001 | 0 | X | X | 0020 | 0030 |

${ }^{\dagger}$ Selected from external status

## Thirty-Two-Way Branch

To Branch 32-Way, the four least significant bits of the DRA' and DRB' addresses must be input at the B3-B0 port; these are concatenated with the 12 most significant bits of DRA and DRB to provide new addresses DRA' (DRA15-DRA4::B3-B0) and DRB' (DRB15-DRB4::B3-B0).

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | X | 1 | XXXX | XXXX |
| 10 | 32-way Branch | 101 | 111 | 000 | 0 | X | X | 0040 | 0030 |
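The 32 destinations of this example can be enumerated with a short sketch. It assumes the HLH/HHH row of Table 7 (BR A' when $\overline{\mathrm{CC}}$ is low and no register is decremented, BR B' when $\overline{\mathrm{CC}}$ is high); the function name `thirty_two_way_target` is hypothetical.

```python
def thirty_two_way_target(dra: int, drb: int, b: int, cc_bar_high: bool) -> int:
    """Select DRA' or DRB' by CC-bar, then splice B3-B0 into the low nibble."""
    base = drb if cc_bar_high else dra  # CC-bar high -> BR B', low -> BR A'
    return (base & 0xFFF0) | (b & 0xF)


# DRA = 0040 and DRB = 0030 (hex), as in the table above.
targets = sorted(
    thirty_two_way_target(0x0040, 0x0030, b, cc)
    for cc in (False, True)
    for b in range(16)
)
print(len(targets), f"{targets[0]:04X}", f"{targets[-1]:04X}")  # prints "32 0030 004F"
```

The two 16-address blocks (0030 through 003F and 0040 through 004F) together give the 32 branch targets.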


*no register decrement
Figure 11. Conditional Branch


Figure 12. Three-Way Branch

* no register decrement

Figure 13. Thirty-Two-Way Branch

## Repeat

To Repeat (Instruction 10), INC must be programmed low and $\overline{\mathrm{CC}}$ high one cycle ahead of Instruction 10 for pipelining.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 0 | XXXX | XXXX |

## Repeat on Stack

To Continue and push the microprogram counter onto the stack (Push), INC and $\overline{\mathrm{CC}}$ must be forced high one cycle ahead for pipelining.

To Repeat (Instruction 12), a BR S instruction with ZERO $=\mathrm{L}$ is used. To avoid a ZERO $=\mathrm{H}$ condition, registers are not decremented during this instruction (see Table 6 for possible register operations). $\overline{\mathrm{CC}}$ and INC are programmed high in Instruction 12 to set up the Continue in Instruction 11.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Push | 110 | 100 | XXX | X | 1 | 1 | XXXX | XXXX |
| 11 | Continue | 110 | 111 | XXX | 0 | 0 | X | XXXX | XXXX |
| 12 | BR Stack | 010 | 111 | 000 | 0 | 1 | 1 | XXXX | XXXX |



Figure 14. Repeat


## Repeat Until $\overline{\mathbf{C C}}=\mathbf{H}$

To Continue and push the microprogram counter onto the stack (Push), INC and $\overline{\mathrm{CC}}$ must be forced high one cycle ahead for pipelining.

To Repeat Until $\overline{\mathrm{CC}}=\mathrm{H}$ (Instruction 12), use a BR S instruction with $\overline{\mathrm{CC}}=\mathrm{L}$ and a CONT/RPT: POP instruction with $\overline{\mathrm{CC}}=\mathrm{H}$. To avoid a ZERO $=\mathrm{H}$ condition, registers are not decremented (see Table 6 for possible register operations). $\overline{\mathrm{CC}}$ and INC are programmed high in Instruction 12 to set up the Continue in Instruction 11. A consequence of this is that the instruction following 13 cannot be conditional.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Push | 110 | 100 | XXX | X | 1 | 1 | XXXX | XXXX |
| 11 | Continue | 110 | 111 | XXX | 0 | $\dagger$ | 1 | XXXX | XXXX |
| 12 | BR Stack else Continue | 010 | 010 | 000 | X | 1 | 1 | XXXX | XXXX |

## Loop Until Zero

To Continue and push the microprogram counter onto the stack (Push), INC and $\overline{\mathrm{CC}}$ are forced high one cycle ahead for pipelining. Register A is loaded with the loop counter using a Load A instruction from Table 6.

To decrement the loop count, a Decrement Register A and Hold Register B instruction from Table 6 is used. To Repeat Else Continue and Pop (decrement the stack pointer), an instruction from Table 7 with BR S in the ZERO $=\mathrm{L}$ column and CONT/RPT: POP in the ZERO $=\mathrm{H}$ column is used. $\overline{\mathrm{CC}}$ is programmed low in Instruction 11 to force the ZERO test in Instruction 12; it is programmed high in Instruction 12 to set up the Continue in Instruction 11.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{\mathrm{CC}}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Push | 110 | 100 | XXX | 0 | 1 | 1 | XXXX | XXXX |
| 11 | Continue/Load Reg A | 110 | 111 | 010 | 0 | 0 | 1 | XXXX | XXXX |
| 12 | Decrement Reg A; BR S else Continue: Pop | 000 | 010 | 001 | 1 | 1 | 1 | XXXX | XXXX |



Figure 17. Loop Until Zero

## Conditional Loop Until Zero

Two examples of a Conditional Loop on Stack with Exit are presented below. Both use the microcode shown below to branch to the stack on nonzero, continue and pop on zero, and branch to DRA with a pop if $\overline{\mathrm{CC}}=\mathrm{H}$. In the first example, the value on the DRA bus is the same as the value in the microprogram counter, making the exit destinations on the $\overline{\mathrm{CC}}$ and ZERO tests the same. In the second, the values are different, generating a two-way exit.

To Continue and push the microprogram counter onto the stack (Push), INC must be high. $\overline{\mathrm{CC}}$ is forced high in the preceding instruction for pipelining.

To Continue (Instruction 11), INC must be high. $\overline{\mathrm{CC}}$ must be programmed high in the previous instruction. INC is programmed high to set up the Continue in Instruction 12.

To Decrement and Branch else Exit (Instruction 12), an instruction from Table 7 with BR S in the ZERO $=\mathrm{L}$ column, CONT/RPT: POP in the ZERO $=\mathrm{H}$ column, and BR A: POP in the $\overline{\mathrm{CC}}=\mathrm{H}$ column is used.

Example 1:

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Push; Load Reg A | 110 | 100 | 010 | X | 1 | 1 | XXXX | XXXX |
| 11 | Continue | 110 | 111 | XXX | 0 | $\dagger$ | 1 | XXXX | XXXX |
| 12 | Decrement Reg A; BR S else Continue: Pop else BR A: Pop | 000 | 010 | 001 | X | 1 | 1 | 0013 | XXXX |

Example 2:

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue/Push; Load Reg A | 110 | 100 | 010 | X | 1 | 1 | XXXX | XXXX |
| 11 | Continue | 110 | 111 | XXX | 0 | $\dagger$ | 1 | XXXX | XXXX |
| 12 | Decrement Reg A; BR S else Continue: Pop else BR A: Pop | 000 | 010 | 001 | X | 1 | 1 | 0025 | XXXX |



## Jump to Subroutine

To Call a Subroutine at address 30, this example uses the instruction from Table 8 with CALL A in the $\overline{\mathrm{CC}}=\mathrm{H}$ column. $\overline{\mathrm{CC}}$ is programmed high in the previous instruction.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Call A | 000 | 110 | XXX | X | X | X | 0030 | XXXX |

## Conditional Jump to Subroutine

To conditionally Call a Subroutine at address 20, this example uses an instruction from Table 8 with CALL A in the $\overline{\mathrm{CC}}=\mathrm{L}$ column and CONT/RPT in the $\overline{\mathrm{CC}}=\mathrm{H}$ column. $\overline{\mathrm{CC}}$ is generated by external status during the preceding instruction. INC is programmed high in the preceding instruction to set up the Continue. To avoid a ZERO $=\mathrm{H}$ condition, registers should not be decremented during Instruction 10.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | $\dagger$ | 1 | XXXX | XXXX |
| 10 | Call A else Continue | 110 | 101 | 000 | X | X | X | 0020 | XXXX |

${ }^{\dagger}$ Selected from external status

## Two－Way Jump to Subroutine

To perform a Two-Way Call to Subroutine at address 20 or address 30, this example uses an instruction from Table 8 with CALL A in the $\overline{\mathrm{CC}}=\mathrm{L}$ column and CALL B in the $\overline{\mathrm{CC}}=\mathrm{H}$ column. In this example, $\overline{\mathrm{CC}}$ is generated by external status during the preceding (set-up) instruction. INC is programmed high in the preceding instruction to set up the Push. To avoid a ZERO $=\mathrm{H}$ condition, registers should not be decremented during Instruction 10.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | $\dagger$ | 1 | XXXX | XXXX |
| 23 | Call A else Call B | 100 | 110 | 000 | X | X | X | 0020 | 0030 |



Figure 19. Jump to Subroutine

*no register decrement
Figure 20. Conditional Jump to Subroutine


Figure 21. Two-Way Jump to Subroutine

## Return from Subroutine

To Return from a subroutine, this example uses an instruction from Table 10 with RET in the $\overline{\mathrm{CC}}=\mathrm{L}$ column. $\overline{\mathrm{CC}}$ is programmed low in the previous instruction. To avoid a ZERO $=\mathrm{H}$ condition, registers are not decremented during Instruction 23.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 0 | X | XXXX | XXXX |
| 23 | Return | 010 | 011 | 000 | X | X | X | XXXX | XXXX |

## Conditional Return from Subroutine

To conditionally Return from a Subroutine, this example uses an instruction from Table 10 with RET in the $\overline{\mathrm{CC}}=\mathrm{L}$ column and CONT/RPT in the $\overline{\mathrm{CC}}=\mathrm{H}$ column. $\overline{\mathrm{CC}}$ is selected from external status in the previous instruction. To avoid a ZERO $=\mathrm{H}$ condition, registers are not decremented during Instruction 23.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | $\dagger$ | 1 | XXXX | XXXX |
| 23 | Return else Continue | 010 | 011 | 000 | X | X | X | XXXX | XXXX |

${ }^{\dagger}$ Selected from external status

## Clear Pointers

To Continue (Instruction 10), INC must be high; $\overline{\mathrm{CC}}$ must be programmed high in the previous instruction. To Clear the Stack and Read Pointers and Branch to address 20 (Instruction 11), $\overline{\mathrm{CC}}$ is programmed low in Instruction 10 to set up the Branch. To avoid a ZERO $=\mathrm{H}$ condition, registers are not decremented during Instruction 11.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| (Set-up) |  | XXX | XXX | XXX | X | 1 | 1 | XXXX | XXXX |
| 10 | Continue | 110 | 111 | XXX | 0 | 0 | X | 0020 | XXXX |
| 11 | BR A and Clear SP/RP | 110 | 001 | 000 | X | X | X | XXXX | XXXX |

## Reset

To Reset the 'ACT8818, pull the S2-S0 pins low. This clears the stack and read pointers and places the Y bus into a low state.

| Address | Instruction | MUX2-MUX0 | S2-S0 | R2-R0 | OSEL | $\overline{CC}$ | INC | DRA | DRB |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10 | Reset | XXX | 000 | XXX | X | X | X | XXXX | XXXX |


*no register decrement
Figure 22. Return from Subroutine


*no register decrement
Figure 23. Conditional Return from Subroutine

*no register decrement
Figure 24. Clear Pointers


# SN74ACT8832 CMOS 32-Bit Registered ALU 

- 50-ns Cycle Time
- Low-Power EPIC™ CMOS
- Three-Port I/O Architecture
- 64-Word by 36-Bit Register File
- Simultaneous ALU and Register Operations
- Configurable as Quad 8-Bit or Dual 16-Bit Single Instruction, Multiple Data Machine
- Parity Generation/Checking

The SN74ACT8832 is a 32-bit registered ALU that can operate at 20 MHz and 20 MIPS (million instructions per second). Most instructions can be performed in a single cycle. The 'ACT8832 was designed for applications that require high-speed logical, arithmetic, and shift operations and bit/byte manipulations.
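As a quick consistency check on these figures, and assuming one instruction completes in each 50-ns cycle, the clock rate and peak instruction rate follow directly:

$$
f_{\text{clk}} = \frac{1}{t_{\text{cycle}}} = \frac{1}{50\ \text{ns}} = 20\ \text{MHz}, \qquad 20 \times 10^{6}\ \text{instructions/s} = 20\ \text{MIPS}
$$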

The 'ACT8832 can act as host CPU or can accelerate a host microprocessor. In high-performance graphics systems, the 'ACT8832 generates display-list memory addresses and controls the display buffer. In I/O controller applications, the 'ACT8832 performs high-speed comparisons to initialize and end data transfers.

A three-operand, 64-word by 36-bit register file allows the 'ACT8832 to create an instruction and store the previous result in a single cycle.

EPIC is a trademark of Texas Instruments Incorporated.

## Contents

Page
Introduction ..... 3-13
Understanding Microprogrammed Architecture ..... 3-13
'ACT8832 Registered ALU ..... 3-13
Support Tools ..... 3-14
Design Support ..... 3-15
Systems Expertise ..... 3-15
'ACT8832 Pin Descriptions ..... 3-16
'ACT8832 Specification Tables ..... 3-25
'ACT8832 Registered ALU ..... 3-28
Architecture ..... 3-28
Data Flow ..... 3-29
Architectural Elements ..... 3-31
Three-Port Register File ..... 3-31
R and S Multiplexers ..... 3-32
Data Input and Output Ports ..... 3-34
ALU ..... 3-34
ALU and MQ Shifters ..... 3-36
Bidirectional Serial I/O Pins ..... 3-36
MQ Register ..... 3-37
Conditional Shift Pin ..... 3-37
Master/Slave Comparator ..... 3-37
Divide/BCD Flip-Flops ..... 3-37
Status ..... 3-38
Input Data Parity Check ..... 3-38
Test Pins ..... 3-38
Instruction Set Overview ..... 3-39
Arithmetic/Logic Instructions with Shifts ..... 3-43
Other Arithmetic Instructions ..... 3-46
Data Conversion Instructions ..... 3-48
Bit and Byte Instructions ..... 3-49
Other Instructions ..... 3-49
Configuration Options ..... 3-50
Masked 32-Bit Operation ..... 3-50
Shift Instructions ..... 3-50
Bit and Byte Instructions ..... 3-51
Status Selection ..... 3-51

## Contents (Continued)

Page
Instruction Set ..... 3-52
ABS ..... 3-53
ADD ..... 3-55
ADDI ..... 3-57
AND ..... 3-59
ANDNR ..... 3-61
BADD ..... 3-63
BAND ..... 3-65
BCDBIN ..... 3-67
BINCNS ..... 3-70
BINCS ..... 3-72
BINEX3 ..... 3-74
BOR ..... 3-76
BSUBR ..... 3-78
BSUBS ..... 3-80
BXOR ..... 3-82
CLR ..... 3-84
CRC ..... 3-85
DIVRF ..... 3-88
DNORM ..... 3-90
DUMPFF ..... 3-92
EX3BC ..... 3-94
EX3C ..... 3-96
INCNR ..... 3-99
INCNS ..... 3-101
INCR ..... 3-103
INCS ..... 3-105
LOADFF ..... 3-107
LOADMQ ..... 3-109
MQSLC ..... 3-111
MQSLL ..... 3-113
MQSRA ..... 3-115
MQSRL ..... 3-117
NAND ..... 3-119
NOP ..... 3-121
NOR ..... 3-123
OR ..... 3-125
PASS ..... 3-127

## Contents (Concluded)

Page
SDIVI ..... 3-129
SDIVIN ..... 3-131
SDIVIS ..... 3-133
SDIVIT ..... 3-135
SDIVO ..... 3-137
SDIVQF ..... 3-139
SEL ..... 3-141
SET0 ..... 3-143
SET1 ..... 3-145
SLA ..... 3-147
SLAD ..... 3-149
SLC ..... 3-151
SLCD ..... 3-153
SMTC ..... 3-155
SMULI ..... 3-157
SMULT ..... 3-159
SNORM ..... 3-161
SRA ..... 3-163
SRAD ..... 3-165
SRC ..... 3-167
SRCD ..... 3-169
SRL ..... 3-171
SRLD ..... 3-173
SUBI ..... 3-175
SUBR ..... 3-177
SUBS ..... 3-179
TB0 ..... 3-181
TB1 ..... 3-183
UDIVI ..... 3-185
UDIVIS ..... 3-187
UDIVIT ..... 3-189
UMULI ..... 3-191
XOR ..... 3-193


## List of Illustrations

Figure Title Page
1 Microprogrammed System Block Diagram ..... 3-14
2 SN74ACT8832 GB Package ..... 3-16
3 SN74ACT8832 Logic Symbol ..... 3-17
4 'ACT8832 32-Bit Registered ALU ..... 3-30
5 Data I/O ..... 3-31
6 16-Bit Configuration ..... 3-34
7 8-Bit Configuration ..... 3-35
8 Shift Examples, 32-Bit Configuration ..... 3-44
9 Shift Examples, 16-Bit Configuration ..... 3-51
10 Shift Examples, 8-Bit Configuration ..... 3-52


## List of Tables

Table Title Page
1 SN74ACT8832 Pin Grid Allocation ..... 3-18
2 SN74ACT8832 Pin Description ..... 3-19
3 Recommended Operating Conditions ..... 3-25
4 Electrical Characteristics ..... 3-26
5 Register File Write Setup ..... 3-26
6 Maximum Switching Characteristics ..... 3-27
7 'ACT8832 Response to Control Inputs ..... 3-29
8 RF MUX Select Inputs ..... 3-32
9 ALU Source Operand Selects ..... 3-32
10 Destination Operand Select/Enables ..... 3-33
11 Configuration Mode Selects ..... 3-36
12 Data Determining SIO Input ..... 3-36
13 Data Determining BYOF Outputs ..... 3-38
14 Test Pin Inputs ..... 3-39
15 'ACT8832 Instruction Set ..... 3-39
16 Shift Definitions ..... 3-44
17 Bidirectional SIO Pin Functions ..... 3-45
18 Signed Multiplication Algorithm ..... 3-46
19 Unsigned Multiplication Algorithm ..... 3-46
20 Mixed Multiplication Algorithm ..... 3-46
21 Signed Division Algorithm ..... 3-47
22 Unsigned Division Algorithm ..... 3-47
23 BCD to Binary Algorithm ..... 3-48
24 Binary to Excess-3 Algorithm ..... 3-49
25 CRC Algorithm ..... 3-50

## Introduction

The SN74ACT8832 Registered Arithmetic/Logic Unit (ALU) holds a primary position in the Texas Instruments family of innovative 32-bit LSI devices. Compatible with the SN74AS888 architecture and instruction set, the 'ACT8832 performs as a high-speed microprogrammable 32-bit registered ALU which can also be configured to operate as two 16-bit ALUs or four 8-bit ALUs in single-instruction, multiple-data (SIMD) mode.

Besides introducing the 'ACT8832, this section discusses basic concepts of microprogrammed architecture and the support tools available for system development. Details of the 'ACT8832 architecture and instruction set are presented. Pin descriptions and assignments for the 'ACT8832 are also presented.
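The SIMD configurations mentioned above can be illustrated with a lane-wise addition. This is a behavioral sketch only, assuming that carries do not propagate across lane boundaries in the 16-bit and 8-bit configurations; the function name `simd_add` is hypothetical and does not correspond to a device instruction.

```python
def simd_add(a: int, b: int, lane_bits: int = 32) -> int:
    """Add two 32-bit words lane by lane; carries stop at each lane boundary."""
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane width must be 8, 16, or 32 bits")
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, 32, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift  # drop the carry out of the lane
    return result


for width in (32, 16, 8):
    print(width, f"{simd_add(0x0000FFFF, 0x00000001, width):08X}")
```

The same operands give 00010000 as one 32-bit add, 00000000 as two 16-bit adds, and 0000FF00 as four 8-bit adds, because the carry out of each lane is discarded.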

## Understanding Microprogrammed Architecture

Figure 1 shows a simple microprogrammed system. The three basic components are an arithmetic/logic unit, a microsequencer, and a memory. The program that resides in this memory is commonly called the microprogram, while the memory itself is referred to as a micromemory or control store. The ALU performs all the required operations on data brought in from the external environment (main memory or peripherals, for example). The sequencer is dedicated to generating the next micromemory address from which a microinstruction is to be fetched. The sequencer and the ALU operate in parallel so that data processing and next-address generation are carried out concurrently.

The microprogram instruction, or microinstruction, consists of control information to the ALU and the sequencer. The microinstruction consists of a number of fields of code that directly access and control the ALU, registers, bus transceivers, multiplexers, and other system components. This high degree of programmability in a parallel architecture offers greater speed and flexibility than a typical microprocessor, although the microinstruction serves the same purpose as a microprocessor opcode: it specifies control information by which the user is able to implement desired data processing operations in a specified sequence. The microinstruction cycle is synchronized to a system clock by latching the instruction in the microinstruction, or pipeline, register once for each clock cycle. Status results are collected in a status register which the sequencer samples to produce conditional branches within the microprogram.
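The interplay of control store, sequencer, ALU, and status register can be sketched as a toy simulator. Everything in it is an assumption made for illustration: the field names, the operations `LOAD`, `DEC`, `NOP`, `CONT`, `BRNZ`, and `HALT`, and the single zero flag are hypothetical and are not the 'ACT8832 or 'ACT8818 instruction sets.

```python
from dataclasses import dataclass


@dataclass
class MicroInstruction:
    alu_op: str            # hypothetical ALU field: 'LOAD', 'DEC', or 'NOP'
    operand: int = 0
    seq_op: str = "CONT"   # hypothetical sequencer field: 'CONT', 'BRNZ', 'HALT'
    target: int = 0


def run(control_store, max_cycles=100):
    """Execute microinstructions; return the accumulator and the address trace."""
    address, acc, zero_flag = 0, 0, False
    trace = []
    for _ in range(max_cycles):
        mi = control_store[address]
        trace.append(address)
        if mi.seq_op == "HALT":
            break
        # Sequencer: next address from status latched by the previous instruction.
        if mi.seq_op == "BRNZ" and not zero_flag:
            next_address = mi.target
        else:
            next_address = address + 1
        # ALU: processes data in the same cycle, in parallel with the sequencer.
        if mi.alu_op == "LOAD":
            acc = mi.operand
        elif mi.alu_op == "DEC":
            acc -= 1
        if mi.alu_op != "NOP":
            zero_flag = acc == 0  # status register, sampled on the next cycle
        address = next_address
    return acc, trace


PROGRAM = [
    MicroInstruction("LOAD", operand=3),
    MicroInstruction("DEC"),
    MicroInstruction("NOP", seq_op="BRNZ", target=1),
    MicroInstruction("NOP", seq_op="HALT"),
]
print(run(PROGRAM))
```

Each microinstruction carries one field for the ALU and one for the sequencer, and the conditional branch at address 2 tests the status left by the decrement at address 1, which mirrors the status-register arrangement described above.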

## 'ACT8832 Registered ALU

This device comprises a 32-bit ALU, a 64-word by 36-bit register file, two shifters to support double-precision arithmetic, and three independent bidirectional data ports.

The 'ACT8832 is engineered to support high-speed, high-level operations. The ALU's 13 basic arithmetic and logic instructions can be combined with a single- or double-precision shift operation in one instruction cycle. Other instructions support data conversions, bit and byte operations, and other specialized functions.


Figure 1. Microprogrammed System Block Diagram

The configuration of this processor enhances processing throughput in arithmetic and radix conversion. Because status is generated and tested internally, division and multiplication algorithms are processed quickly. This decision logic is transparent to the user; the reduced overhead assures shorter microprograms, reduced hardware complexity, and shorter software development time.

## Support Tools

Texas Instruments has designed a family of low-cost, real-time evaluation modules (EVMs) to aid with initial hardware and microcode design. Each EVM is a small self-contained system which provides a convenient means to test and debug simple microcode, allowing software and hardware evaluation of components and their operation.

At present, the 74AS-EVM-8 Bit-Slice Evaluation Module has been completed, and 16- and 32-bit EVMs are in advanced stages of development. EVMs and support tools for other devices in the 'ACT8800 family are also planned for future development.

## Design Support

TI's '8832 32-bit registered ALU is supported by a variety of tools developed to aid in design evaluation and verification. These tools will streamline all stages of the design process, from assessing the operation and performance of the '8832 to evaluating a total system application. The tools include a functional model, behavioral model, and microcode development software and hardware. Section 8 of this manual provides specific information on the design tools supporting TI's SN74ACT8800 Family.

## Systems Expertise

Texas Instruments VLSI Logic applications group is available to help designers analyze TI's high-performance VLSI products, such as the '8832 32-bit registered ALU. The group works directly with designers to provide ready answers to device-related questions and also prepares a variety of applications documentation.

The group may be reached in Dallas, at (214) 997-3970.

## 'ACT8832 Pin Descriptions

Pin descriptions and grid allocations for the 'ACT8832 are given on the following pages.

Figure 2. SN74ACT8832 . . . GB Package (top view; pin-grid columns numbered 1 through 17)


Figure 3. SN74ACT8832 . . . Logic Symbol

Table 1. SN74ACT8832 Pin Grid Allocation


Table 2. SN74ACT8832 Pin Description

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| A0 | R7 | I | Register file A port read address select |
| A1 | P7 |  |  |
| A2 | T6 |  |  |
| A3 | S6 |  |  |
| A4 | R6 |  |  |
| A5 | P6 |  |  |
| B0 | T12 | I | Register file B port read address select |
| B1 | R10 |  |  |
| B2 | S11 |  |  |
| B3 | T11 |  |  |
| B4 | S10 |  |  |
| B5 | T10 |  |  |
| BYOF0 | B2 | O | Status signals indicate overflow conditions in certain data bytes |
| BYOF1 | A4 |  |  |
| BYOF2 | D13 |  |  |
| BYOF3 | C17 |  |  |
| C | C10 | O | Status signal representing carry out condition |
| C0 | S13 | I | Register file write address select |
| C1 | T14 |  |  |
| C2 | R11 |  |  |
| C3 | S12 |  |  |
| C4 | P10 |  |  |
| C5 | T13 |  |  |
| CF0 | E2 | I | Configuration mode select: single 32-bit, two 16-bit, or four 8-bit ALUs |
| CF1 | D1 |  |  |
| CF2 | F4 |  |  |
| Cn | F2 | I | ALU carry input |
| CLK | F3 | I | Clocks synchronous registers on positive edge |
| DA0 | K3 | I/O | A port data bus. Outputs register data ($\overline{\text{OEA}} = 0$) or inputs external data ($\overline{\text{OEA}} = 1$). |
| DA1 | M1 |  |  |
| DA2 | L2 |  |  |
| DA3 | N1 |  |  |
| DA4 | M2 |  |  |
| DA5 | P1 |  |  |
| DA6 | N2 |  |  |
| DA7 | M3 |  |  |
| DA8 | T2 |  |  |
| DA9 | P4 |  |  |

Table 2. SN74ACT8832 Pin Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| DA10 | S3 | I/O | A port data bus. Outputs register data ($\overline{\text{OEA}} = 0$) or inputs external data ($\overline{\text{OEA}} = 1$). |
| DA11 | R4 |  |  |
| DA12 | T3 |  |  |
| DA13 | S4 |  |  |
| DA14 | T4 |  |  |
| DA15 | P5 |  |  |
| DA16 | P14 |  |  |
| DA17 | T17 |  |  |
| DA18 | P15 |  |  |
| DA19 | N14 |  |  |
| DA20 | R16 |  |  |
| DA21 | S17 |  |  |
| DA22 | P16 |  |  |
| DA23 | N15 |  |  |
| DA24 | K15 |  |  |
| DA25 | K16 |  |  |
| DA26 | K17 |  |  |
| DA27 | J16 |  |  |
| DA28 | J15 |  |  |
| DA29 | J17 |  |  |
| DA30 | H17 |  |  |
| DA31 | H16 |  |  |
| DB0 | G1 | I/O | B port data bus. Outputs register data ($\overline{\text{OEB}} = 0$) or used to input external data ($\overline{\text{OEB}} = 1$). |
| DB1 | H2 |  |  |
| DB2 | H1 |  |  |
| DB3 | J1 |  |  |
| DB4 | J2 |  |  |
| DB5 | J3 |  |  |
| DB6 | K1 |  |  |
| DB7 | K2 |  |  |
| DB8 | P2 |  |  |
| DB9 | N3 |  |  |
| DB10 | S1 |  |  |
| DB11 | R2 |  |  |
| DB12 | P3 |  |  |
| DB13 | N4 |  |  |
| DB14 | T1 |  |  |
| DB15 | S2 |  |  |

Table 2. SN74ACT8832 Pin Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| DB16 | T15 | I/O | B port data bus. Outputs register data ($\overline{\text{OEB}} = 0$) or used to input external data ($\overline{\text{OEB}} = 1$). |
| DB17 | S14 |  |  |
| DB18 | R13 |  |  |
| DB19 | T16 |  |  |
| DB20 | S15 |  |  |
| DB21 | R14 |  |  |
| DB22 | P13 |  |  |
| DB23 | S16 |  |  |
| DB24 | R17 |  |  |
| DB25 | N16 |  |  |
| DB26 | M15 |  |  |
| DB27 | P17 |  |  |
| DB28 | M16 |  |  |
| DB29 | N17 |  |  |
| DB30 | L16 |  |  |
| DB31 | M17 |  |  |
| $\overline{\text{EA}}$ | G2 | I | ALU input operand select. High state selects external DA bus and low state selects register file |
| EB0 | G3 | I | ALU input operand select. Selects between register file, external DB port and MQ register |
| EB1 | F1 |  |  |
| GND | C8 |  |  |
| GND | D6 |  |  |
| GND | D7 |  |  |
| GND | D8 |  |  |
| GND | D10 |  |  |
| GND | D11 |  |  |
| GND | D12 |  |  |
| GND | G4 |  |  |
| GND | G14 |  | Ground pins. All ground pins must be used. |
| GND | H4 |  |  |
| GND | H14 |  |  |
| GND | K4 |  |  |
| GND | K14 |  |  |
| GND | L4 |  |  |
| GND | L14 |  |  |
| GND | M4 |  |  |
| GND | P9 |  |  |
| GND | P12 |  |  |

Table 2. SN74ACT8832 Pin Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| I0 | D17 | I | Instruction input |
| I1 | F15 |  |  |
| I2 | E16 |  |  |
| I3 | E17 |  |  |
| I4 | F16 |  |  |
| I5 | G15 |  |  |
| I6 | F17 |  |  |
| I7 | G16 |  |  |
| $\overline{\text{IESIO0}}$ | A8 | I | Shift pin enables, increases system speed and reduces bus conflict, active low |
| $\overline{\text{IESIO1}}$ | A7 |  |  |
| $\overline{\text{IESIO2}}$ | B7 |  |  |
| $\overline{\text{IESIO3}}$ | B6 |  |  |
| MSERR | B11 | O | Master Slave Error pin, indicates error between data at Y output MUX and external Y port |
| N | A10 | O | Output status signal representing sign condition |
| $\overline{\text{OEA}}$ | T5 | I | DA bus enable, active low |
| $\overline{\text{OEB}}$ | R12 | I | DB bus enable, active low |
| $\overline{\text{OES}}$ | A11 | I | Status enable, active low |
| $\overline{\text{OEY0}}$ | C3 | I | Y bus output enable, active low |
| $\overline{\text{OEY1}}$ | C7 |  |  |
| $\overline{\text{OEY2}}$ | C14 |  |  |
| $\overline{\text{OEY3}}$ | F14 |  |  |
| OVR | B10 | O | Output status signal represents overflow condition |
| PA0 | R1 | I/O | Parity bits port for DA data |
| PA1 | R5 |  |  |
| PA2 | M14 |  |  |
| PA3 | G17 |  |  |
| PB0 | L1 | I/O | Parity bits port for DB data |
| PB1 | R3 |  |  |
| PB2 | R15 |  |  |
| PB3 | L17 |  |  |
| PERRA | S5 | O | DA data parity error, signals error if an even parity check fails for any byte |
| PERRB | P11 | O | DB data parity error, signals error if an even parity check fails for any byte |
| PERRY | C11 | O | Y data parity error, signals error if an even parity check fails for any byte |

Table 2. SN74ACT8832 Pin Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :--- | :--- | :--- | :--- |
| NAME | NO. |  |  |
| PY0 | D4 | I/O | Y port parity data, input and output |
| PY1 | B5 |  |  |
| PY2 | B15 |  |  |
| PY3 | C16 |  |  |
| RFCLK | S9 | I | Register File Clock, allows multiple writes to be performed in one master clock cycle |
| SELMQ | E1 | I | MQ register select, selects output of ALU shifter or MQ register to be placed on Y bus |
| SELRF0 | T9 | I | Register File select. Controls selection of the Register File (RF) inputs by the RF MUX |
| SELRF1 |  |  |  |

Table 2. SN74ACT8832 Pin Description (Concluded)

| PIN |  | I/O | DESCRIPTION |
| :--- | :--- | :--- | :--- |
| NAME | NO. |  |  |
| Y0 | E3 |  |  |
| Y1 |  |  |  |
| Y2 | C1 |  |  |
| Y3 | D3 |  |  |
| Y4 | E4 |  |  |
| Y5 | C2 |  |  |
| Y6 | B1 |  |  |
| Y7 | A1 |  |  |
| Y8 | D5 |  |  |
| Y9 | C4 |  |  |
| Y10 | B3 |  |  |
| Y11 | C5 |  |  |
| Y12 | B4 |  |  |
| Y13 | A2 |  |  |
| Y14 | C6 |  |  |
| Y15 | A3 | I/O | Y port data bus |
| Y16 | B12 |  |  |
| Y17 | C12 |  |  |
| Y18 | A13 |  |  |
| Y19 | B13 |  |  |
| Y20 | A14 |  |  |
| Y21 | B14 |  |  |
| Y22 | C13 |  |  |
| Y23 | A15 |  |  |
| Y24 | A16 |  |  |
| Y25 | A17 |  |  |
| Y26 | B16 |  |  |
| Y27 | D14 |  |  |
| Y28 | C15 |  |  |
| Y29 | B17 |  |  |
| Y30 | E14 |  |  |
| Y31 | D15 |  |  |
| Z | B9 | O |  |

## 'ACT8832 Specification Tables

## absolute maximum ratings over operating free-air temperature range (unless otherwise noted)$^{\dagger}$

| RATING | VALUE |
| :--- | :--- |
| Supply voltage, $V_{CC}$ | -0.5 V to 6 V |
| Continuous output current, $I_O$ ($V_O$ = 0 to $V_{CC}$) | ±50 mA |
| Continuous current through $V_{CC}$ or GND pins | ±100 mA |
| Operating free-air temperature range | 0°C to 70°C |
| Storage temperature range | -65°C to 150°C |

$^{\dagger}$ Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

Table 3. Recommended Operating Conditions

| PARAMETER | MIN | NOM | MAX | UNIT |
| :--- | ---: | ---: | ---: | :---: |
| $V_{CC}$ Supply voltage | 4.5 | 5.0 | 5.5 | V |
| $V_{IH}$ High-level input voltage | 2 |  | $V_{CC}$ | V |
| $V_{IL}$ Low-level input voltage | 0 |  | 0.8 | V |
| $I_{OH}$ High-level output current |  |  | -8 | mA |
| $I_{OL}$ Low-level output current |  |  | 8 | mA |
| $V_I$ Input voltage | 0 |  | $V_{CC}$ | V |
| $V_O$ Output voltage | 0 |  | $V_{CC}$ | V |
| dt/dv Input transition rise or fall rate |  |  | 15 | ns/V |
| $T_A$ Operating free-air temperature | 0 |  | 70 | °C |

Table 4. Electrical Characteristics


Table 6. Maximum Switching Characteristics

| PARAMETER | FROM (INPUT) | TO (OUTPUT) |  |  |  |  |  |  |  |  |  |  | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | Y | C | Z | SIO | PERRA/B | N | OVR | PA/B, DA/B | PY | PERRY | MSERR |  |
| $t_{pd}$ | A5-A0, B5-B0 | 36 | 30 | 37 | 28 |  | 30 | 37 | 16 | 37 |  |  | ns |
|  | DA31-DA0, PA3-PA0, DB31-DB0, PB3-PB0 | 36 | 25 | 37 | 25 | 20 | 28 | 37 |  | 37 |  |  |  |
|  | $C_n$ | 30 | 22 | 31 | 24 |  | 28 | 28 |  | 32 |  |  |  |
|  | $\overline{\text{EA}}$ | 37 | 28 | 37 | 25 |  | 31 | 37 |  | 37 |  |  |  |
|  | EB1-EB0 | 37 | 28 | 37 | 25 |  | 31 | 37 |  | 37 |  |  |  |
|  | I7-I0 | 37 | 30 | 37 | 28 |  | 32 | 37 |  | 37 |  |  |  |
|  | CF2-CF0 | 37 | 30 | 37 | 28 |  | 32 | 37 |  | 37 |  |  |  |
|  | $\overline{\text{OEB}}$, $\overline{\text{OEA}}$ |  |  |  |  |  |  |  | 15 |  |  |  |  |
|  | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | 20 |  |  |  |  |  |  |  | 20 |  |  |  |
|  | SELMQ | 15 |  |  |  |  |  |  |  | 20 |  |  |  |
|  | SIO3-SIO0 | 15 |  | 25 |  |  | 25 |  |  | 27 |  |  |  |
|  | CLK | 21 |  |  |  |  |  |  |  | 28 |  |  |  |
|  | CLKMQ | 37 |  |  |  |  |  |  |  | 37 |  |  |  |
|  | RCLK | 37 | 32 | 37 | 24 |  | 32 | 37 |  | 37 |  |  |  |
|  | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ | 15 |  | 25 |  |  | 25 |  |  | 27 |  |  |  |
|  | SSF | 25 |  | 30 | 22 |  | 30 | 22 |  | 30 |  |  |  |
|  | Y |  |  |  |  |  |  |  |  |  | 15 | 15 |  |


## 'ACT8832 Registered ALU

The SN74ACT8832 is a 32-bit registered ALU that can be configured to operate as four 8-bit ALUs, two 16-bit ALUs, or a single 32-bit ALU. The processor instruction set is 100 percent upwardly compatible with the 'AS888 and includes 13 arithmetic and logical functions with 8 conditional shifts, multiplication, division, normalization, add and subtract immediate, bit and byte operations, and data conversions such as BCD, excess-3, and sign magnitude. New instructions permit internal flip-flops controlling BCD and divide operations to be loaded or read.

Additional functions added to the 'ACT8832 include byte parity and master/slave operation. Parity is checked at the three data input ports and generated at the Y output port. The 64-word register file is 36 bits wide to permit storage of the parity bits. Master/slave comparator circuitry is provided at the Y port.

The DA and DB ports can simultaneously input data to the ALU and the 64-word by 36-bit register file. Data and parity from the register file can be output on the DA and DB ports. Results of ALU and shift operations are output at the bidirectional Y port. The Y port can also be used in an input mode to furnish external data to the register file or during master/slave operation as an input to the master/slave comparator.

Three 6-bit address ports allow a two-operand fetch and an operand write to be performed at the register file simultaneously. An MQ shifter and MQ register can also be configured to function independently to implement double-precision 8-bit, 16-bit, and 32-bit shift operations. An internal ALU bypass path increases the speeds of multiply, divide and normalize instructions. The path is also used by 'ACT8832 instructions that permit bits and bytes to be manipulated.

## Architecture

Figure 4 is a functional block diagram of the 'ACT8832. Control input signals are summarized in Table 7. Data flow and details of the functional elements are presented in the following paragraphs.

Table 7. 'ACT8832 Response to Control Inputs

| SIGNAL | HIGH | LOW |
| :--- | :--- | :--- |
| CF2-CF0 | See Table 11 | See Table 11 |
| $\overline{\text{EA}}$ | Selects external DA bus | Selects register file |
| EB1-EB0 | See Table 9 | See Table 9 |
| $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ | Normal operation | Force corresponding SIO inputs to high impedance |
| I7-I0 | See Table 15 | See Table 15 |
| MQSEL | Selects MQ register | Selects ALU |
| $\overline{\text{OEA}}$ | Inhibits DA and PA output | Enables DA and PA output |
| $\overline{\text{OEB}}$ | Inhibits DB and PB output | Enables DB and PB output |
| $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Inhibits Y and PY outputs | Enables Y and PY outputs |
| SELRF1-SELRF0 | See Table 8 | See Table 8 |
| SSF | Selects shifted ALU output | Selects ALU (unshifted) output |
| TP1-TP0 | See Table 14 | See Table 14 |
| $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Inhibits register file write | Byte enables for register file write (0 = LSB) |

## Data Flow

As shown in Figure 5, data enters the 'ACT8832 from three primary sources: the bidirectional Y port, which is used in an input mode to pass data to the register file; and the bidirectional DA and DB ports, used to input data to the register file or the R and S buses serving the ALU. Three associated I/O ports (PY, PA, and PB) are provided for associated parity data input and output.

Data is input to the ALU through two multiplexers: R MUX, which selects the R bus operand from the DA port or the register file addressed by A5-A0; and S MUX, which selects data from the DB port, the register file addressed by B5-B0, or the multiplier-quotient (MQ) register.

The result of the ALU operation is passed to the ALU shifter, where it is shifted or passed without shift to the Y bus for possible output from the 'ACT8832 and to the feedback MUX for possible storage in the internal register file. The MQ shifter, which operates in parallel with the ALU shifter, can be loaded from the ALU or the MQ register. The MQ shift result is passed to the MQ register, where it can be routed through the S MUX to the ALU or to the Y MUX for output from the chip.

An internal bypass path allows data from the S MUX to be loaded directly into the ALU shifter or the divide/BCD flip-flops. Data from the divide/BCD flip-flops can be output via the MQ register.


Figure 4. 'ACT8832 32-Bit Registered ALU


Figure 5. Data I/O

Data can be output from the three bidirectional ports, Y, DA, and DB, and their associated parity ports, PY, PA, and PB. DA and DB can also be used to read ALU input data on the R and S buses for debug or other special purposes.

## Architectural Elements

## Three-Port Register File

The register file is 36 bits wide, permitting storage of a 32-bit data word with its associated parity bits. The 64 registers are accessed by three address ports. C5-C0 address the destination register during write operations; A5-A0 and B5-B0 address any two registers during read operations. The address buses are also used to furnish immediate data to the ALU: A3-A0 to provide constant data for the add and subtract immediate instructions; C3-C0 and A3-A0 to provide masks for set, reset, and test bit operations.

Data is written into the register file when the write enable is low and a low-to-high register file clock (RFCLK) transition occurs. The separate register file clock allows multiple writes to be performed in one master clock cycle, allowing processors in multiprocessor environments to update one another's internal register files during a single cycle.

Four write enable inputs are provided to allow separate control of data inputs in a byte-oriented system. $\overline{\text{WE3}}$ is the write enable for the most significant byte.
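The byte-oriented write behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not TI's functional model: the class name `RegisterFile`, the method `clock_write`, and the packing of the four active-low write enables into one 4-bit integer are assumptions made for the example, and the parity bits are omitted.

```python
MASK32 = 0xFFFFFFFF


class RegisterFile:
    """Toy model of the 64-word register file (parity bits omitted)."""

    def __init__(self):
        self.words = [0] * 64

    def clock_write(self, c_addr, data, we_n):
        """Model one low-to-high RFCLK transition.

        c_addr: 6-bit write address (C5-C0).
        data:   32-bit word presented by the RF MUX.
        we_n:   active-low write enables packed as an integer (assumed
                packing): bit 3 = WE3 (most significant byte), bit 0 = WE0.
        """
        addr = c_addr & 0x3F
        word = self.words[addr]
        for byte in range(4):
            if not (we_n >> byte) & 1:      # enable low: this byte is written
                mask = 0xFF << (8 * byte)
                word = (word & ~mask) | (data & mask)
        self.words[addr] = word & MASK32
        return self.words[addr]
```

With all enables low the whole word is replaced; with only $\overline{\text{WE3}}$ low, only the most significant byte changes, which is the case a byte-oriented system relies on.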

Register file inputs are selected by the RF MUX under the control of two register file select signals, SELRF1 and SELRF0, shown in Table 8 (see also Table 10).

Table 8. RF MUX Select Inputs

| SELRF1 | SELRF0 | SOURCE |
| :---: | :---: | :--- |
| 0 | 0 | External DA input |
| 0 | 1 | External DB input |
| 1 | 0 | Y-output MUX |
| 1 | 1 | External Y port |
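As a quick illustration of Table 8, the RF MUX can be modeled as a four-way selector. The function name `rf_mux` and its argument names are hypothetical and chosen only for this sketch.

```python
def rf_mux(selrf1, selrf0, da, db, y_mux, y_port):
    """Return the word the RF MUX would route to the register file input.

    The mapping follows Table 8; the arguments stand for the values
    currently present on each of the four candidate sources.
    """
    sources = {
        (0, 0): da,      # external DA input
        (0, 1): db,      # external DB input
        (1, 0): y_mux,   # Y-output MUX
        (1, 1): y_port,  # external Y port
    }
    return sources[(selrf1 & 1, selrf0 & 1)]
```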

## R and S Multiplexers

ALU inputs are selected by the R and S multiplexers. Controls which affect operand selection for instructions other than those using constants or masks are shown in Table 9.

Table 9. ALU Source Operand Selects

| R-BUS OPERAND SELECT $\overline{\text{EA}}$ | S-BUS OPERAND SELECT EB1-EB0 | RESULT DESTINATION | $\leftarrow$ SOURCE OPERAND |
| :---: | :---: | :---: | :---: |
| 0 |  | R bus | $\leftarrow$ Register file addressed by A5-A0 |
| 1 |  | R bus | $\leftarrow$ DA port |
|  | 00 | S bus | $\leftarrow$ Register file addressed by B5-B0 |
|  | 10 | S bus | $\leftarrow$ DB port |
|  | X1 | S bus | $\leftarrow$ MQ register |
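The operand selection in Table 9 can be expressed as a small Python function. The name `select_operands` and its parameters are illustrative assumptions; the sketch covers only the instructions that do not use constants or masks.

```python
def select_operands(ea_n, eb1, eb0, rf_a, rf_b, da, db, mq):
    """Return (R bus, S bus) operands according to Table 9.

    ea_n:       level on the EA input (0 selects the register file).
    eb1, eb0:   S-bus operand select.
    rf_a, rf_b: register file words addressed by A5-A0 and B5-B0.
    da, db, mq: values on the DA port, DB port, and in the MQ register.
    """
    r_bus = da if ea_n else rf_a
    if eb0:                # X1: MQ register
        s_bus = mq
    elif eb1:              # 10: DB port
        s_bus = db
    else:                  # 00: register file addressed by B5-B0
        s_bus = rf_b
    return r_bus, s_bus
```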

Table 10. Destination Operand Select/Enables

| REGISTER FILE WRITE ENABLE $\overline{\text{WE}}$ | Y BUS OUTPUT ENABLE $\overline{\text{OEY}}$ | Y MUX SELECT MQSEL | REGISTER FILE SELECT RFSEL1 | REGISTER FILE SELECT RFSEL0 | DA PORT OUTPUT ENABLE $\overline{\text{OEA}}$ | DB PORT OUTPUT ENABLE $\overline{\text{OEB}}$ | RESULT DESTINATION | $\leftarrow$ SOURCE |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | 0 | 0 | X | X |  |  | Y/PY | $\leftarrow$ ALU shifter/parity generate |
| 1 | 0 | 1 | X | X |  |  | Y/PY | $\leftarrow$ MQ register/parity generate |
| 0 | 0 | 0 | 1 | 0 |  |  | Y/PY, RF | $\leftarrow$ ALU shifter/parity generate |
| 0 | 0 | 1 | 1 | 0 |  |  | Y/PY, RF | $\leftarrow$ MQ register/parity generate |
| 0 | 1 | X | 1 | 1 |  |  | RF | $\leftarrow$ External Y/PY |
| 0 | X | X | 0 | 0 | 1 | X | RF | $\leftarrow$ External DA/PA |
| 0 | X | X | 0 | 1 | X | 1 | RF | $\leftarrow$ External DB/PB |
|  |  |  |  |  | 0 |  | DA/PA | $\leftarrow$ R bus register file output |
|  |  |  |  |  | 1 |  | DA/PA | Hi-Z |
|  |  |  |  |  |  | 0 | DB/PB | $\leftarrow$ S bus register file output |
|  |  |  |  |  |  | 1 | DB/PB | Hi-Z |

## Data Input and Output Ports

The DA and DB ports can be used to load the S and/or R multiplexers from an external source or to read S or R bus outputs from the register file. The Y port can be used to load the register file and to output the next address selected by the Y output multiplexer. Tables 9 and 10 describe the MUX and output controls which affect DA, DB, and Y.

## ALU

The ALU can perform seven arithmetic and six logical instructions on the two 32-bit operands selected by the R and S multiplexers. It also supports multiplication, division, normalization, bit and byte operations and data conversion, including excess-3 BCD arithmetic. The 'ACT8832 instruction set is summarized in Table 15.

The 'ACT8832 can be configured to operate as a single 32-bit ALU, two 16-bit ALUs, or four 8-bit ALUs (see Figures 6 and 7). It can also be configured to operate on a 32-bit word formed by adding leading zeros to the 12 least significant bits of R bus data. This is useful in certain IBM relative addressing schemes.


Figure 6. 16-Bit Configuration


Figure 7. 8-Bit Configuration

Configuration modes are controlled by three CF inputs as shown in Table 11. These signals also select the data from which status signals other than byte overflow will be generated.

Table 11. Configuration Mode Selects

| CF2 | CF1 | CF0 | MODE SELECTED | DATA FROM WHICH STATUS OTHER THAN BYOF WILL BE GENERATED |
| :---: | :---: | :---: | :---: | :---: |
| 0 | 0 | 0 | Four 8-bit | Byte 0 |
| 0 | 0 | 1 | Four 8-bit | Byte 1 |
| 0 | 1 | 0 | Four 8-bit | Byte 2 |
| 0 | 1 | 1 | Four 8-bit | Byte 3 |
| 1 | 0 | 0 | Two 16-bit | Least significant 16-bit word |
| 1 | 0 | 1 | Two 16-bit | Most significant 16-bit word |
| 1 | 1 | 0 | One 32-bit | 32-bit word |
| 1 | 1 | 1 | Masked 32-bit | 32-bit word |
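To make the effect of the CF inputs concrete, the sketch below decodes CF2-CF0 and derives sign and zero status from the selected field. It reads Table 11 with the status column aligned row by row (CF = 000 selects byte 0, and so on). The helper names, the assumption that byte 0 is the least significant byte, and the assumption that N is the most significant bit of the selected field are illustrative, not taken from the device specification.

```python
# (CF2, CF1, CF0) packed as an integer -> (mode, data used for status)
CONFIG_MODES = {
    0b000: ("Four 8-bit", "Byte 0"),
    0b001: ("Four 8-bit", "Byte 1"),
    0b010: ("Four 8-bit", "Byte 2"),
    0b011: ("Four 8-bit", "Byte 3"),
    0b100: ("Two 16-bit", "Least significant 16-bit word"),
    0b101: ("Two 16-bit", "Most significant 16-bit word"),
    0b110: ("One 32-bit", "32-bit word"),
    0b111: ("Masked 32-bit", "32-bit word"),
}

# Assumed bit position and width of the field each CF code selects.
STATUS_FIELD = {
    0b000: (0, 8), 0b001: (8, 8), 0b010: (16, 8), 0b011: (24, 8),
    0b100: (0, 16), 0b101: (16, 16), 0b110: (0, 32), 0b111: (0, 32),
}


def decode_cf(cf2, cf1, cf0):
    """Return (mode, status source) for the given CF inputs."""
    return CONFIG_MODES[((cf2 & 1) << 2) | ((cf1 & 1) << 1) | (cf0 & 1)]


def sign_and_zero(result, cf):
    """Return (N, ZERO) computed from the field selected by CF."""
    shift, width = STATUS_FIELD[cf & 0b111]
    field = (result >> shift) & ((1 << width) - 1)
    n = (field >> (width - 1)) & 1
    zero = int(field == 0)
    return n, zero
```

For example, a result of 0x00008000 in two 16-bit mode reports a negative, nonzero value when the least significant word is selected and a zero value when the most significant word is selected.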

## ALU and MQ Shifters

The ALU and MQ shifters are used in all of the shift, multiply, divide and normalize functions. They can be used independently for single precision or concurrently for double precision shifts. Shifts can be made conditional, using the Special Shift Function (SSF) pin.

## Bidirectional Serial I/O Pins

Four bidirectional $\overline{\text{SIO}}$ pins are provided to supply an end fill bit for certain shift instructions. These pins may also be used to read bits that are shifted out of the ALU or MQ shifters during certain instructions. Use of the $\overline{\text{SIO}}$ pins as inputs or outputs is summarized in Table 17.

The four pins allow separate control of end fill inputs in configurations other than 32-bit mode (see Table 12 and Figure 4).

Table 12. Data Determining $\overline{\text{SIO}}$ Input

| SIGNAL | CORRESPONDING WORD, PARTIAL WORD OR BYTE |  |  |
| :---: | :---: | :---: | :---: |
|  | 32-BIT MODE | 16-BIT MODE | 8-BIT MODE |
| $\overline{\text{SIO3}}$ | - | - | Byte 3 |
| $\overline{\text{SIO2}}$ | - | most significant word | Byte 2 |
| $\overline{\text{SIO1}}$ | - | - | Byte 1 |
| $\overline{\text{SIO0}}$ | 32-bit word | least significant word | Byte 0 |

To increase system speed and reduce bus conflict, four $\overline{\text{SIO}}$ input enables ($\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$) are provided. A low on these enables will override internal pull-up resistor logic and force the corresponding $\overline{\text{SIO}}$ pins to the high impedance state required before an input signal can appear on the signal line. If the $\overline{\text{SIO}}$ enables are not used, this condition is generated internally in the chip. Use of the enables allows internal decoding to be bypassed, resulting in faster speeds.

The $\overline{\text{IESIO}}$ inputs default to a high because of internal pull-up resistors. When an $\overline{\text{SIO}}$ pin is used as an output, a low on its corresponding $\overline{\text{IESIO}}$ pin would force $\overline{\text{SIO}}$ to a high impedance state. The output would then be lost, but the internal operation of the chip would not be affected.

## MQ Register

Data from the MQ shifter is written into the MQ register when a low-to-high transition occurs on clock CLK. The register has specific functions in double precision shifts, multiplication, division and data conversion algorithms and can also be used as a temporary storage register. Data from the register file and the DA and DB buses can be passed to the MQ register through the ALU.

The Y bus contains the output of the ALU shifter if SELMQ is low and the output of the MQ register if SELMQ is high. If $\overline{\text{OEY}}$ is low, ALU or MQ shifter output will be passed to the Y port; if $\overline{\text{OEY}}$ is high, the Y port becomes an input to the feedback MUX.

## Conditional Shift Pin

Conditional shifting algorithms may be implemented using the SSF pin under hardware or firmware control. If the SSF pin is high or floating, the shifted ALU output will be sent to the output buffers. If the SSF pin is pulled low externally, the ALU result will be passed directly to the output buffers, and MQ shifts will be inhibited. Conditional shifting is useful for scaling inputs in data arrays or in signal processing algorithms.

## Master/Slave Comparator

A master/slave comparator is provided to compare data bytes from the Y output MUX with data bytes on the external Y port when $\overline{\text{OEY}}$ is high. If the data are not equal, a high signal is generated on the master slave error output pin (MSERR). A similar comparator is provided for the Y parity bits.

## Divide/BCD Flip-Flops

Internal multiply/divide flip-flops are used by certain multiply and divide instructions to maintain status between instructions. Internal excess-3 BCD flip-flops preserve the carry from each nibble in excess-3 BCD operations. The BCD flip-flops are affected by all instructions except NOP and are cleared when a CLR instruction is executed. The flip-flops can be loaded and read externally using instructions LOADFF and DUMPFF (see Table 15). This feature permits an iterative arithmetic operation such as multiplication or division to be interrupted immediately so that an external interrupt can be processed.

## Status

Eight status output signals are generated by the 'ACT8832. Four signals (BYOF3-BYOF0) indicate overflow conditions in certain data bytes (see Table 13). The others represent sign (N), zero (ZERO), carry-out (Cout) and overflow (OVR). N, ZERO, Cout, and OVR are generated from data selected by the mode configuration controls (CF2-CF0) as shown in Table 11.

Carry-out is evaluated after each ALU operation. Sign and zero status are evaluated after ALU shift operation. Overflow (OVR) is determined by ORing the overflow result from the ALU with the overflow result from the ALU shifter.

Table 13. Data Determining BYOF Outputs

| SIGNAL | CORRESPONDING WORD, PARTIAL WORD OR BYTE |  |  |
| :---: | :---: | :---: | :---: |
|  | 32-BIT MODE | 16-BIT MODE | 8-BIT MODE |
| BYOF3 | 32-bit word | most significant word | Byte 3 |
| BYOF2 | - | - | Byte 2 |
| BYOF1 | - | least significant word | Byte 1 |
| BYOF0 | - | - | Byte 0 |

## Input Data Parity Check

An even parity check is performed on each byte of input data at the DA, DB and Y ports. The check is performed by counting the number of ones in each byte and its corresponding parity bit. Parity bits are input on PA for DA data, PB for DB data and PY for Y data. PA0, PB0 and PY0 are the parity bits for the least significant bytes of DA, DB and Y, respectively. If the result of the parity count is odd for any byte, a high appears at the parity error output pin (PERRA for DA data, PERRB for DB data, PERRY for Y data).
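The per-byte even parity check can be sketched as follows. The function names are hypothetical. `parity_error` follows the counting rule stated above; `generate_parity` assumes that generated parity bits also make each byte's total count of ones even, which the text does not state explicitly.

```python
def parity_error(data, parity_bits):
    """Return True if any byte of a 32-bit word fails the even parity check.

    parity_bits packs the four parity inputs as an integer; bit 0 pairs
    with the least significant byte (as PA0, PB0 and PY0 do).
    """
    for byte in range(4):
        ones = bin((data >> (8 * byte)) & 0xFF).count("1")
        ones += (parity_bits >> byte) & 1
        if ones % 2:            # odd count: parity error for this byte
            return True
    return False


def generate_parity(data):
    """Return four parity bits that give every byte an even count of ones."""
    bits = 0
    for byte in range(4):
        if bin((data >> (8 * byte)) & 0xFF).count("1") % 2:
            bits |= 1 << byte
    return bits
```

A word paired with parity bits from `generate_parity` always passes `parity_error`, while flipping any single data or parity bit makes the check fail.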

## Test Pins

Two pins, TP1-TP0, support system testing. These may be used, for example, to place all outputs in a high-impedance state, isolating the chip from the rest of the system (see Table 14).

Table 14. Test Pin Inputs

| TP1 | TP0 | RESULT |
| :---: | :---: | :--- |
| 0 | 0 | All outputs and I/Os forced low |
| 0 | 1 | All outputs and I/Os forced high |
| 1 | 0 | All outputs and I/Os placed in a high impedance state |
| 1 | 1 | Normal operation (default state) |

## Instruction Set Overview

Bits I7-I0 are used as instruction inputs to the 'ACT8832. Table 15 lists all instructions, divided into five groups, with their opcodes and mnemonics.

Table 15. 'ACT8832 Instruction Set

| GROUP 1 INSTRUCTIONS |  |  |
| :---: | :---: | :---: |
| INSTRUCTION BITS I3-I0 (HEX) | MNEMONIC | FUNCTION |
| 0 |  | Used to access Group 4 instructions |
| 1 | ADD | $\mathrm{R}+\mathrm{S}+\mathrm{Cn}$ |
| 2 | SUBR | $\bar{R}+S+C n$ |
| 3 | SUBS | $\mathrm{R}+\overline{\mathrm{S}}+\mathrm{Cn}$ |
| 4 | INCS | $\mathrm{S}+\mathrm{Cn}$ |
| 5 | INCNS | $\overline{\mathrm{S}}+\mathrm{Cn}$ |
| 6 | INCR | $\mathrm{R}+\mathrm{Cn}$ |
| 7 | INCNR | $\overline{\mathrm{R}}+\mathrm{Cn}$ |
| 8 |  | Used to access Group 3 instructions |
| 9 | XOR | R XOR S |
| A | AND | R AND S |
| B | OR | R OR S |
| C | NAND | R NAND S |
| D | NOR | R NOR S |
| E | ANDNR | $\overline{\mathrm{R}}$ AND S |
| F |  | Used to access Group 5 instructions |
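The Group 1 functions can be checked numerically with a short model. This sketch assumes the single 32-bit configuration and takes the carry-out as bit 32 of the sum; the function name `group1` is hypothetical. Logic instructions ignore the carry-in and force the carry-out to zero, as described in the section on arithmetic/logic instructions with shifts.

```python
MASK32 = 0xFFFFFFFF


def group1(opcode, r, s, cn):
    """Return (result, carry_out) for a Group 1 opcode given in I3-I0."""
    r &= MASK32
    s &= MASK32
    cn &= 1
    nr, ns = ~r & MASK32, ~s & MASK32   # one's complements of R and S
    arithmetic = {
        0x1: r + s + cn,    # ADD
        0x2: nr + s + cn,   # SUBR
        0x3: r + ns + cn,   # SUBS
        0x4: s + cn,        # INCS
        0x5: ns + cn,       # INCNS
        0x6: r + cn,        # INCR
        0x7: nr + cn,       # INCNR
    }
    logic = {
        0x9: r ^ s,         # XOR
        0xA: r & s,         # AND
        0xB: r | s,         # OR
        0xC: ~(r & s),      # NAND
        0xD: ~(r | s),      # NOR
        0xE: nr & s,        # ANDNR
    }
    if opcode in arithmetic:
        total = arithmetic[opcode]
        return total & MASK32, (total >> 32) & 1
    if opcode in logic:
        return logic[opcode] & MASK32, 0
    raise ValueError("opcodes 0, 8 and F select Groups 4, 3 and 5")
```

With the carry-in set to 1, SUBS computes R minus S and SUBR computes S minus R in two's complement, which is why `group1(0x3, 5, 3, 1)` yields 2 with a carry-out of 1.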

Table 15. 'ACT8832 Instruction Set (Continued)


Table 15. 'ACT8832 Instruction Set (Continued)

| GROUP 3 INSTRUCTIONS |  |  |
| :---: | :---: | :--- |
| INSTRUCTION BITS I7-I0 (HEX) | MNEMONIC | FUNCTION |
| 08 | SET1 | Set bit 1 |
| 18 | SET0 | Set bit 0 |
| 28 | TB1 | Test bit (one) |
| 38 | TB0 | Test bit (zero) |
| 48 | ABS | Absolute value |
| 58 | SMTC | Sign magnitude/two's complement |
| 68 | ADDI | Add immediate |
| 78 | SUBI | Subtract immediate |
| 88 | BADD | Byte add R to S |
| 98 | BSUBS | Byte subtract S from R |
| A8 | BSUBR | Byte subtract R from S |
| B8 | BINCS | Byte increment S |
| C8 | BINCNS | Byte increment negative S |
| D8 | BXOR | Byte XOR R and S |
| E8 | BAND | Byte AND R and S |
| F8 | BOR | Byte OR R and S |

Table 15. 'ACT8832 Instruction Set (Continued)


Table 15. 'ACT8832 Instruction Set (Continued)

| GROUP 5 INSTRUCTIONS |  |  |
| :---: | :---: | :--- |
| INSTRUCTION BITS I7-I0 (HEX) | MNEMONIC | FUNCTION |
| 0F |  |  |
| 1F | CLR | Clear |
| 2F | CLR | Clear |
| 3F | CLR | Clear |
| 4F | CLR | Clear |
| 5F | DUMPFF | Output divide/BCD flip-flops |
| 6F | CLR | Clear |
| 7F | BCDBIN | BCD to binary |
| 8F | EX3BC | Excess-3 byte correction |
| 9F | EX3C | Excess-3 word correction |
| AF | SDIVO | Signed divide overflow test |
| BF | CLR | Clear |
| CF | CLR | Clear |
| DF | BINEX3 | Binary to excess-3 |
| EF | CLR | Clear |
| FF | NOP | No operation |

Group 1, a set of ALU arithmetic and logic operations, can be combined with the user-selected shift operations in Group 2 in one instruction cycle. The other groups contain instructions for bit and byte operations, division and multiplication, data conversion, and other functions such as sorting, normalization and polynomial code accumulation.
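The grouping can be illustrated with a small decoder. It relies on the Group 1 table, where I3-I0 values of 0, 8 and F are reserved to access Groups 4, 3 and 5. That the remaining bits I7-I4 then carry the Group 2 shift selection is an assumption of this sketch, and the function name is hypothetical.

```python
def instruction_group(opcode):
    """Classify an 8-bit instruction (I7-I0) by its low nibble (I3-I0)."""
    low = opcode & 0xF
    if low == 0x0:
        return "Group 4"
    if low == 0x8:
        return "Group 3"
    if low == 0xF:
        return "Group 5"
    # Any other low nibble is a Group 1 arithmetic/logic function; the
    # upper nibble is assumed to select the accompanying Group 2 shift.
    return "Group 1 + Group 2"
```

For example, opcode 68 (ADDI) decodes as Group 3 and opcode FF (NOP) as Group 5.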

## Arithmetic/Logic Instructions with Shifts

The seven Group 1 arithmetic instructions operate on data from the R and/or S multiplexers and the carry-in. Carry-out is evaluated after ALU operation; other status pins are evaluated after the accompanying shift operation, when applicable. Group 1 logic instructions do not use carry-in; carry-out is forced to zero.

Possible shift instructions are listed in Group 2. Fourteen single and double precision shifts can be specified, or the ALU result can be passed unshifted to the MQ register or to the specified output destination by using the LOADMQ or PASS instructions. Table 16 lists shift definitions.

When using the shift registers for double precision operations, the least significant half should be placed in the MQ register and the most significant half in the ALU for passage to the ALU shifter. An example of a double-precision shift using the ALU and MQ shifters is given in Figure 8.

(Diagrams showing the serial data input signals: Single Precision Logical Right Single Shift, 32-Bit Configuration; Double Precision Logical Right Single Shift, 32-Bit Configuration.)

Figure 8. Shift Examples, 32-Bit Configuration

All Group 2 shifts can be made conditional using the conditional shift pin (SSF). If the SSF pin is high or floating, the shifted ALU output will be sent to the output buffers, MQ register, or both. If the SSF pin is pulled low, the ALU result will be passed directly to the output buffers and any MQ shifts will be inhibited.

Table 16. Shift Definitions

| SHIFT TYPE | NOTES |
| :--- | :--- |
| Left | Moves a bit one position towards the most significant bit |
| Right | Moves a bit one position towards the least significant bit |
| Arithmetic right | Retains the sign unless an overflow occurs, in which case, the sign would be inverted |
| Arithmetic left | May lose the sign bit if an overflow occurs. Zero is filled into the least significant bit unless the bit is set externally |
| Circular right | Fills the least significant bit in the most significant bit position |
| Circular left | Fills the most significant bit in the least significant bit position |
| Logical right | Fills a zero in the most significant bit position unless the bit is forced to one by placing a zero on an SIO pin |
| Logical left | Fills a zero in the least significant bit position unless the bit is forced to one by placing a zero on an SIO pin |

The bidirectional $\overline{\text{SIO}}$ pins can be used to supply external end-fill bits for certain Group 2 shift instructions. When $\overline{\text{SIO}}$ is high or floating, a zero is filled; otherwise a one is filled. Table 17 lists the instructions that make use of the $\overline{\text{SIO}}$ inputs and identifies input and output functions.
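
The end-fill rule can be sketched in a few lines of Python. This is a behavioural model only, assuming a 32-bit word and a single active-low fill input; the names `end_fill_bit`, `logical_right_shift`, and `logical_left_shift` are illustrative and do not come from the device documentation.

```python
MASK32 = 0xFFFFFFFF


def end_fill_bit(sio_n: int) -> int:
    """Fill bit selected by an active-low SIO input (1 models high or floating)."""
    return 0 if sio_n else 1


def logical_right_shift(word: int, sio_n: int = 1) -> int:
    """Shift a 32-bit word right by one; the vacated MSB takes the end-fill bit."""
    return ((word & MASK32) >> 1) | (end_fill_bit(sio_n) << 31)


def logical_left_shift(word: int, sio_n: int = 1) -> int:
    """Shift a 32-bit word left by one; the vacated LSB takes the end-fill bit."""
    return ((word << 1) & MASK32) | end_fill_bit(sio_n)
```

With the input left high, `logical_right_shift(0x80000001)` fills a zero into bit 31; pulling the modelled input low (`sio_n=0`) fills a one instead.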

Table 17. Bidirectional SIO Pin Functions

| INSTRUCTION <br> BITS I7-I0 <br> (HEX) | MNEMONIC | $\overline{\text{SIO}}$ <br> I/O | $\overline{\text{SIO}}$ <br> DATA |
| :---: | :--- | :---: | :--- |
| 0* | SRA | O | Shift out |
| 1* | SRAD | O | Shift out |
| 2* | SRL | I | Most significant bit |
| 3* | SRLD | I | Most significant bit |
| 4* | SLA | I | Least significant bit |
| 5* | SLAD | I | Least significant bit |
| 6* | SLC | O | Shifted input to MQ shifter |
| 7* | SLCD | O | Shifted input to MQ shifter |
| 8* | SRC | O | Shifted input to ALU shifter |
| 9* | SRCD | O | Shifted input to ALU shifter |
| A* | MQSRA | O | Shift out |
| B* | MQSRL | I | Most significant bit |
| C* | MQSLL | I | Least significant bit |
| D* | MQSLC | O | Shifted input to MQ shifter |
| 00 | CRC | O | Internally generated end fill bit |
| 20 | SNORM | I | Least significant bit |
| 30 | DNORM | I | Least significant bit |
| 60 | SMULI | O | ALU0 |
| 70 | SMULT | O | ALU0 |
| 80 | SDIVIN | O | Internally generated end fill bit |
| 90 | SDIVIS | O | Internally generated end fill bit |
| A0 | SDIVI | O | Internally generated end fill bit |
| B0 | UDIVIS | O | Internally generated end fill bit |
| C0 | UDIVI | O | Internally generated end fill bit |
| D0 | UMULI | O | Internal input |
| E0 | SDIVIT | O | Internally generated end fill bit |
| F0 | UDIVIT | O | Internally generated end fill bit |
| 7F | BCDBIN | I | Least significant bit |
| DF | BINEX3 | O | Shifted input to MQ register |

## Other Arithmetic Instructions

The 'ACT8832 supports two immediate arithmetic operations. ADDI and SUBI (Group 3) add a constant between 0 and 15 to, or subtract it from, an operand on the S bus. The constant value is specified in bits A3-A0.

Twelve Group 4 instructions support serial division and multiplication. Signed, unsigned, and mixed multiplication are implemented using three instructions: SMULI, which performs a signed times unsigned iteration; SMULT, which provides negative weighting of the sign bit of a negative multiplier in signed multiplication; and UMULI, which performs an unsigned multiplication iteration. Algorithms using these instructions are given in Tables 18, 19, and 20. These include signed multiplication, which performs a two's complement multiplication; unsigned multiplication, which produces an unsigned times unsigned product; and mixed multiplication, which multiplies a signed multiplicand by an unsigned multiplier to produce a signed result.

Table 18. Signed Multiplication Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> Y PORT |
| :---: | :--- | :---: | :--- | :---: | :--- |
| E4 | LOADMQ | 1 | Multiplier | - | Multiplier |
| 60 | SMULI | $N-1^{\dagger}$ | Accumulator | Multiplicand | Partial product |
| 70 | SMULT | 1 | Accumulator | Multiplicand | Product (MSH)$^{\ddagger}$ |

Table 19. Unsigned Multiplication Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> Y PORT |
| :---: | :--- | :---: | :---: | :---: | :--- |
| E4 | LOADMQ | 1 | Multiplier | - | Multiplier |
| D0 | UMULI | $N-1^{\dagger}$ | Accumulator | Multiplicand | Partial product |
| D0 | UMULI | 1 | Accumulator | Multiplicand | Product (MSH)$^{\ddagger}$ |

Table 20. Mixed Multiplication Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> Y PORT |
| :---: | :--- | :---: | :--- | :---: | :--- |
| E4 | LOADMQ | 1 | Multiplier | - | Multiplier |
| 60 | SMULI | $N-1^{\dagger}$ | Accumulator | Multiplicand | Partial product |
| 60 | SMULI | 1 | Accumulator | Multiplicand | Product (MSH)$^{\ddagger}$ |
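
The structure shared by the three multiplication algorithms (load the multiplier into MQ, iterate N-1 times, then run one final step) can be illustrated with a conventional shift-and-add model. The sketch below is an assumption about the general technique, not a description of the internal operation of SMULI, SMULT, or UMULI: the accumulator is modelled as an unbounded Python integer, and the function name `multiply` and its parameters are hypothetical.

```python
def multiply(multiplicand: int, multiplier: int, n: int = 32,
             signed_multiplier: bool = True) -> int:
    """Shift-and-add multiply of an n-bit multiplier, one bit per iteration.

    signed_multiplier=True gives the final bit negative weight (the role the
    text assigns to SMULT); False treats every multiplier bit as positive.
    """
    mask = (1 << n) - 1
    mq = multiplier & mask          # multiplier bit pattern, as loaded into MQ
    acc = 0                         # accumulator (most significant half)
    for step in range(n):
        last = step == n - 1
        if mq & 1:
            if last and signed_multiplier:
                acc -= multiplicand  # sign bit of the multiplier weighs -2**(n-1)
            else:
                acc += multiplicand
        # Double-precision right shift of acc:mq by one position.
        mq = (mq >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1
    return (acc << n) | mq          # MSH in acc, LSH in mq
```

Calling `multiply(-3, 5, n=8)` models the signed case, while `multiply(-3, 200, n=8, signed_multiplier=False)` models the mixed case of a signed multiplicand and an unsigned multiplier.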

[^6]

Instructions that support division include start, iterate, and terminate instructions for unsigned division routines (UDIVIS, UDIVI, and UDIVIT); initialize, start, iterate, and terminate instructions for signed division routines (SDIVIN, SDIVIS, SDIVI, and SDIVIT); and correction instructions for these routines (DIVRF and SDIVQF). A Group 5 instruction, SDIVO, is available for optional overflow testing. Algorithms for signed and unsigned division are given in Tables 21 and 22. These use a nonrestoring technique to divide a 16N-bit integer dividend by an 8N-bit integer divisor to produce an 8N-bit integer quotient and remainder, where N = 1 for quad 8-bit mode, N = 2 for dual 16-bit mode, and N = 4 for 32-bit mode.

Table 21. Signed Division Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> Y PORT |
| :---: | :---: | :---: | :---: | :---: | :---: |
| E4 | LOADMQ | 1 | Dividend (LSH) | - | Dividend (LSH) |
| 80 | SDIVIN | 1 | Dividend (MSH) | Divisor | Remainder (N) |
| AF | SDIVO | 1 | Remainder (N) | Divisor | Overflow Test Result |
| 90 | SDIVIS | 1 | Remainder (N) | Divisor | Remainder (N) |
| A0 | SDIVI | $N-2^{\dagger}$ | Remainder (N) | Divisor | Remainder (N) |
| E0 | SDIVIT | 1 | Remainder (N) | Divisor | Remainder§ |
| 40 | DIVRF | 1 | Remainder$^{\ddagger}$ | Divisor | Remainder |
| 50 | SDIVQF | 1 | MQ register | Divisor | Quotient \# |

$^{\dagger}$ N = 8 for quad 8-bit mode, 16 for dual 16-bit mode, 32 for 32-bit mode.
$^{\ddagger}$ The least significant half of the product is in the MQ register.
§ Unfixed
¶ Fixed (corrected)
\# The quotient is stored in the MQ register. Remainder can be output at the Y port or stored in the register file accumulator.

Table 22. Unsigned Division Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> Y PORT |
| :---: | :--- | :---: | :--- | :--- | :--- |
| E4 | LOADMQ | 1 | Dividend (LSH) | - | Dividend (LSH) |
| B0 | UDIVIS | 1 | Dividend (MSH) | Divisor | Remainder (N) |
| C0 | UDIVI | $N-1^{\dagger}$ | Remainder (N) | Divisor | Remainder (N) |
| F0 | UDIVIT | 1 | Remainder (N) | Divisor | Remainder $\ddagger$ |
| 40 | DIVRF | 1 | Remainder $\S$ | Divisor | Remainder $\S$ |

[^7]
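
The nonrestoring technique named above can be sketched for the unsigned case as follows. This is a textbook model written under stated assumptions, not the microcode of Table 22: the partial remainder is kept as a signed Python integer, one quotient bit is produced per iteration, and a final correction (the role the text assigns to DIVRF) adds the divisor back when the remainder ends up negative. The name `nonrestoring_divide` is hypothetical.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int = 32):
    """Divide a 2n-bit unsigned dividend by an n-bit unsigned divisor.

    Returns (quotient, remainder). Raises if the quotient cannot fit in n bits.
    """
    mask = (1 << n) - 1
    if dividend < 0 or not 0 < divisor <= mask:
        raise ValueError("operands out of range")
    if (dividend >> n) >= divisor:
        raise OverflowError("quotient does not fit in n bits")

    rem = dividend >> n             # most significant half of the dividend
    mq = dividend & mask            # least significant half, held in MQ
    for _ in range(n):
        msb = (mq >> (n - 1)) & 1   # bit shifted from MQ into the remainder
        mq = (mq << 1) & mask
        if rem >= 0:
            rem = 2 * rem + msb - divisor   # previous remainder positive: subtract
        else:
            rem = 2 * rem + msb + divisor   # previous remainder negative: add back
        if rem >= 0:
            mq |= 1                 # quotient bit is 1 when the result is non-negative
    if rem < 0:
        rem += divisor              # final remainder correction
    return mq, rem
```

For example, `nonrestoring_divide(1000, 7, n=8)` treats 1000 as a 16-bit dividend and 7 as an 8-bit divisor.
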
## Data Conversion Instructions

Conversion of binary data to one's and two's complement can be implemented using the INCNR instruction (Group 1). SMTC (Group 3) permits conversion from two's complement representation to sign-magnitude representation, or vice versa. Two's complement numbers can be converted to their positive value using ABS (Group 3).

SNORM and DNORM (Group 4) provide for normalization of signed, single- and double-precision data. The operand is placed in the MQ register and shifted toward the most significant bit until the two most significant bits are of opposite value. Zeros are shifted into the least significant bit, provided SIO is high or floating. (A low on SIO will shift a one into the least significant bit.) SNORM allows the number of shifts to be counted and stored in one of the register files to provide the exponent.
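
The normalization rule described above can be illustrated with a short loop. This is a sketch under assumptions: each loop pass stands for one shift step, the shift count stands in for the exponent bookkeeping, and the loop is capped at n-1 shifts so that an all-zero operand terminates. The function name `normalize` and its parameters are illustrative only.

```python
def normalize(value: int, n: int = 32, sio_n: int = 1):
    """Shift left until the two most significant bits differ.

    Returns (normalized_word, shift_count). sio_n models the active-low
    end-fill input: 1 (high or floating) fills zeros, 0 fills ones.
    """
    mask = (1 << n) - 1
    word = value & mask
    fill = 0 if sio_n else 1
    shifts = 0
    while shifts < n - 1 and ((word >> (n - 1)) & 1) == ((word >> (n - 2)) & 1):
        word = ((word << 1) & mask) | fill
        shifts += 1
    return word, shifts
```

A positive operand such as `0x00000001` needs 30 shifts to bring its leading one next to the sign bit; the same holds for the negative operand `0xFFFFFFFE`, whose leading zero is moved into bit 30.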

Data stored in binary-coded decimal form can be converted to binary using BCDBIN (Group 5). A routine for this conversion, given in Table 23, allows the user to convert an N-digit BCD number to a 4N-bit binary number in 4N + 8 clock cycles.

Table 23. BCD to Binary Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> DESTINATION |
| :---: | :---: | :---: | :---: | :---: | :---: |
| E4 | LOADMQ | 1 | BCD operand | - | MQ reg. |
| D2 | SUBR/MQSLC | 1 | Accumulator | Accumulator | Accumulator/MQ reg. |
| D2 | SUBR/MQSLC | 1 | Mask reg. | Mask reg. | Mask reg./MQ reg. |
| D1 | MQSLC | 2 | Don't care | Don't care | MQ reg. |
| 68 | ADDI (15) | 1 | Accumulator | Decimal 15 | Mask reg. |
| REPEAT N-1 TIMES$^{\dagger}$ |  |  |  |  |  |
| DA | AND/MQSLC | 1 | MQ reg. | Mask reg. | Interim reg./MQ reg. |
| D1 | ADD/MQSLC | 1 | Accumulator | Interim reg. | Interim reg./MQ reg. |
| 7F | BCDBIN | 1 | Interim reg. | Interim reg. | Accumulator/MQ reg. |
| 7F | BCDBIN | 1 | Accumulator | Interim reg. | Accumulator/MQ reg. |
| END REPEAT |  |  |  |  |  |
| FA | AND | 1 | MQ reg. | Mask reg. | Interim reg. |
| D1 | ADD/MQSLC | 1 | Accumulator | Interim reg. | Accumulator |

$^{\dagger}$ N = Number of BCD digits

BINEX3, EX3BC, and EX3C assist binary to excess-3 conversion. Using BINEX3, an N-bit binary number can be converted to an N/4-digit excess-3 number. For an algorithm, see Table 24.

Table 24. Binary to Excess-3 Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> DESTINATION |
| :---: | :--- | :---: | :---: | :---: | :--- |
| E4 | LOADMQ | 1 | Binary number | - | MQ reg. |
| D2 | SUBR | 1 | Accumulator | Accumulator | Accumulator |
| D2 | SET1 $(33)_{16}$ | 1 | Accumulator | Mask $(33)_{16}$ | Accumulator |
| REPEAT N TIMES$^{\dagger}$ |  |  |  |  |  |
| DF | BINEX3 | 1 | Accumulator | Accumulator | Accumulator/MQ reg. |
| 9F | EX3C | 1 | Accumulator | Internal data | Accumulator |
| END REPEAT |  |  |  |  |  |

$^{\dagger}$ N = Number of bits in binary number

## Bit and Byte Instructions

Four Group 3 instructions allow the user to test or set selected bits within a byte. SET1 and SET0 force selected bits of a selected byte (or bytes) to one and zero, respectively. TB1 and TB0 test selected bits of a selected byte (or bytes) for ones and zeros. The bits to be set or tested are specified by an 8-bit mask formed by the concatenation of register file address inputs C3-C0 and A3-A0. The register file addressed by B5-B0 is used as the destination operand for the set bit instructions. Register writes are inhibited for test bit instructions. Bytes to be operated on are selected by forcing SIOn low, where n represents the byte position and 0 represents the least significant byte. A high on the zero output pin signifies that the test data matches the mask; a low on the zero output indicates that the test has failed.
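
The mask and byte-select mechanism can be modelled as below. The sketch assumes that C3-C0 form the upper nibble of the mask and A3-A0 the lower nibble, that the byte selects are packed into a 4-bit value whose bit n is the level on SIOn, and that a test "matches" when every masked bit in every selected byte has the tested value. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def byte_mask(c_bits: int, a_bits: int) -> int:
    """8-bit mask from C3-C0 (assumed upper nibble) and A3-A0 (lower nibble)."""
    return ((c_bits & 0xF) << 4) | (a_bits & 0xF)


def expand(mask8: int, sio_n: int) -> int:
    """Place the 8-bit mask in every byte whose select line is low (bit = 0)."""
    full = 0
    for n in range(4):
        if not (sio_n >> n) & 1:
            full |= (mask8 & 0xFF) << (8 * n)
    return full


def set1(word: int, mask8: int, sio_n: int) -> int:
    return (word | expand(mask8, sio_n)) & MASK32


def set0(word: int, mask8: int, sio_n: int) -> int:
    return word & ~expand(mask8, sio_n) & MASK32


def tb1(word: int, mask8: int, sio_n: int) -> bool:
    m = expand(mask8, sio_n)
    return (word & m) == m          # all masked bits are ones


def tb0(word: int, mask8: int, sio_n: int) -> bool:
    return (word & expand(mask8, sio_n)) == 0   # all masked bits are zeros
```

For instance, `set1(0, byte_mask(0x8, 0x1), 0b1001)` sets bits 7 and 0 in bytes 1 and 2 and leaves bytes 0 and 3 untouched.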

Individual bytes of data can also be manipulated using eight Group 3 byte arithmetic/logic instructions. Bytes can be added, subtracted, incremented, ORed, ANDed and exclusive ORed. Like the bit instructions, bytes are selected by forcing SIOn low, but multiple bytes can be operated on only if they are adjacent to one another; at least one byte must be nonselected.

## Other Instructions

SEL (Group 4) selects one of the ALU's two operands, S or R, depending on the state of the SSF pin. This instruction could be used in sort routines to select the larger or smaller of two operands by performing a subtraction and sending the status result to SSF. CRC (Group 4) is designed to verify serial binary data that has been transmitted over a channel using a cyclic redundancy check code. An algorithm using this instruction is given in Table 25.

Table 25. CRC Algorithm

| OP <br> CODE | MNEMONIC | CLOCK <br> CYCLES | INPUT <br> S PORT | INPUT <br> R PORT | OUTPUT <br> DESTINATION |
| :---: | :---: | :---: | :---: | :---: | :---: |
| E4 | LOADMQ | 1 | Vector $c'(x)^{\dagger}$ | - | MQ reg. |
| F6 | INCR | 1 | - | Polynomial $g(x)$ | Poly reg. |
| F2 | SUBR | 1 | Accumulator | Accumulator | Accumulator |
| REPEAT $n/8N$ TIMES$^{\dagger}$ |  |  |  |  |  |
| 00 | CRC | 1 | Accumulator | Poly reg. | Accumulator |
| E4 | LOADMQ | 1 | Vector $c'(x)^{\dagger}$ | - | MQ reg. |
| END REPEAT |  |  |  |  |  |

$^{\dagger}$ N = Number of bits in binary number
n = Length of the code vector
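
The idea behind checking a received code vector against a generator polynomial can be shown with a generic bit-serial sketch. It is not a model of the CRC instruction or of the register usage in Table 25; it only illustrates polynomial division over GF(2), in which a code vector is accepted when its remainder modulo g(x) is zero. The function names and the 3-bit example polynomial are assumptions chosen for illustration.

```python
def poly_mod(bits, poly: int, degree: int) -> int:
    """Remainder of a bit sequence, read as a polynomial over GF(2), modulo g(x).

    `poly` holds the coefficients of g(x) below x**degree; the leading term
    is implied.
    """
    top = 1 << (degree - 1)
    mask = (1 << degree) - 1
    reg = 0
    for bit in bits:
        feedback = reg & top            # coefficient about to leave the register
        reg = ((reg << 1) & mask) | (bit & 1)
        if feedback:
            reg ^= poly                 # subtract g(x); subtraction is XOR in GF(2)
    return reg


def crc_bits(message, poly: int, degree: int):
    """Check bits for a message: remainder of message(x) * x**degree."""
    rem = poly_mod(list(message) + [0] * degree, poly, degree)
    return [(rem >> i) & 1 for i in reversed(range(degree))]


def is_valid(codeword, poly: int, degree: int) -> bool:
    """A received code vector passes when its remainder is zero."""
    return poly_mod(codeword, poly, degree) == 0
```

With g(x) = x^3 + x + 1 (`poly=0b011`, `degree=3`), appending the computed check bits to a message yields a code vector for which `is_valid` returns `True`; flipping any single bit makes it return `False`.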

CLR forces the ALU output to zero and clears the internal BCD flip-flops used in excess-3 BCD operations. NOP forces the ALU output to zero, but does not affect the flip-flops.

## Configuration Options

The 'ACT8832 can be configured to operate in 8-bit, 16-bit, or 32-bit modes, depending on the setting of the configuration mode selects (CF2-CF0). Table 11 shows the control inputs for the four operating modes. Selecting an operating configuration other than 32-bit mode affects ALU operation and status generation in several ways, depending on the mode selected.

## Masked 32-Bit Operation

Masked 32-bit operation is selected to reset to zero the 20 most significant bits of the R Mux input. The 12 least significant bits are unaffected by the mask. Only Group 1 and Group 2 instructions can be used in this operating configuration. Status generation is similar to unmasked 32-bit operating mode.

## Shift Instructions

Shift instructions operate similarly in 8-bit, 16-bit, and 32-bit modes. The serial I/O ($\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$) pins are used to select end-fill bits or to shift bits in or out, depending on the operation being performed. Table 12 shows the $\overline{\text{SIO}}$ signals associated with each byte or word in the different modes, and Table 17 indicates the specific function performed by the $\overline{\text{SIO}}$ pins during shift, multiply, and divide operations.

Figures 9 and 10 present examples of logical right shifts in 16-bit and 8-bit configurations.

(Diagrams showing the serial data input signals: Single Precision Logical Right Single Shift, 16-Bit Configuration; Double Precision Logical Right Single Shift, 16-Bit Configuration.)

Figure 9. Shift Examples, 16-Bit Configuration

## Bit and Byte Instructions

The 'ACT8832 performs bit operations similarly in 8-bit, 16-bit, and 32-bit modes. Masks are loaded into the R MUX on the A3-A0 and C3-C0 address inputs, and the bytes to be masked are selected by pulling their $\overline{\text{SIO}}$ inputs low. Instructions which set, reset, or test bits are explained later.

Byte operations should be performed in 32-bit mode to get the necessary status outputs. While byte overflow signals are provided for all four bytes (BYOF3-BYOF0), the other status signals (C, N, Z) are output only for the word selected with the configuration control signals (CF2-CF0).

## Status Selection

Status results (C, N, Z, and overflow) are internally generated for all words in all modes, but only the overflow results (BYOF3-BYOF0) are available for all four bytes in 8-bit mode or for both words in 16-bit mode. If a specific application requires that the four status results be read for two or four words, it is possible to toggle the configuration control signals (CF2-CF0) within the same clock cycle and read the additional status results. This assumes that the necessary external hardware is provided to toggle CF2-CF0 and collect the status for the individual words before the next clock signal is input.

(Diagrams showing the serial data input signals: Single-Precision Logical Right Shift, 8-Bit Configuration; Double-Precision Logical Right Shift, 8-Bit Configuration.)

Figure 10. Shift Examples, 8-Bit Configuration

## Instruction Set

The 'ACT8832 instruction set is presented in alphabetical order on the following pages. The discussion of each instruction includes a functional description, list of possible operands, data flow diagram, and notes on status and control bits affected by the instruction. Microcoded examples are also shown.

Mnemonics and opcodes for instructions are given at the top of each page. Opcodes for instructions in Groups 1 and 2 are four bits long and are combined into eight-bit instructions which select combinations of arithmetic, logical, and shift operations. Opcodes for the other instruction groups are all eight bits long.

An asterisk in the left side of the opcode box for a Group 1 instruction indicates that a Group 2 opcode is needed to complete the instruction. An asterisk in the right side of a box indicates that a Group 1 opcode is required to combine with the Group 2 opcode in the left side of the box.

## ABS

## FUNCTION

Computes the absolute value of two's complement data on the S bus.

## DESCRIPTION

Two's complement data on the S bus is converted to its absolute value. The carry must be set to one by the user for proper conversion. ABS causes $\overline{S} + Cn$ to be computed; the state of the sign bit determines whether $S$ or $\overline{S} + Cn$ will be selected as the result. SSF is used to transmit the sign of S.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |

Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Should be programmed high for proper conversion. |

## Status Signals

```
ZERO = 1 if result = 0
   N = 1 if MSB (input) = 1
 OVR = 1 if input of most significant byte is 80 (Hex) and inputs (if any) in all
       other bytes are 00 (Hex).
   C = 1 if S = 0
```


## EXAMPLES (assumes a 32-bit configuration)

Convert the two's complement number in register 1 to its positive value and store the result in register 4.

| Instr Code <br> I7-I0 | Oprd Addr <br> A5-A0 | Oprd Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest Addr <br> C5-C0 | Destination Selects |  |  |  |  |  |  | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ |  |  |
| 01001000 | XX XXXX | 000001 | X 00 | 000100 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Example 1: Assume register file 1 holds F6D81340 (Hex):

| Source | 11110110110110000001001101000000 | $S \leftarrow RF(1)$ |
| ---: | ---: | ---: |
| Destination | 00001001001001111110110011000000 | $RF(4) \leftarrow \overline{S} + Cn$ |

Example 2: Assume register file 1 holds 09D527C0 (Hex):

| Source | 00001001110101010010011111000000 | $S \leftarrow RF(1)$ |
| ---: | ---: | ---: |
| Destination | 00001001110101010010011111000000 | $RF(4) \leftarrow S$ |
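
Both examples can be reproduced with a small Python model of the selection rule in the description: complement S and add the carry-in when the sign bit is set, otherwise pass S unchanged. The function name `abs32` is illustrative; only the 32-bit configuration is modelled.

```python
MASK32 = 0xFFFFFFFF


def abs32(s: int, cn: int = 1) -> int:
    """Absolute value of a 32-bit two's complement word.

    cn models the carry-in, which the description says must be set to one.
    """
    s &= MASK32
    if s >> 31:                               # sign bit set: select (NOT S) + Cn
        return ((~s & MASK32) + (cn & 1)) & MASK32
    return s                                  # sign bit clear: select S
```

`abs32(0xF6D81340)` gives `0x0927ECC0` and `abs32(0x09D527C0)` gives `0x09D527C0`, matching the destination values above. The input `0x80000000` is returned unchanged, which corresponds to the OVR condition listed under Status Signals.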

## ADD

## FUNCTION

Adds data on the R and S buses to the carry-in.

## DESCRIPTION

Data on the R and S buses is added with carry. The sum appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Increments sum if set to one. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out = 1
```

${ }^{\dagger} \mathrm{C}$ is ALU carry out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLES (assumes a 32-bit configuration)

Add data in register 1 to data on the DB bus with carry-in and pass the result to the MQ register.

| Instr Code <br> I7-I0 | Oprd Addr <br> A5-A0 | Oprd Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest Addr <br> C5-C0 | Destination Selects |  |  |  |  |  |  | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ |  |  |
| 11100001 | 000001 | XX XXXX | 0 10 | XX XXXX | 0 | 1111 | 10 | X | X | XXXX | 0 | 0 | 110 |

Assume register file 1 holds 0802C618 (Hex) and DB bus holds 1E007530 (Hex):

| Source | 00001000000000101100011000011000 | $R \leftarrow RF(1)$ |
| ---: | ---: | ---: |
| Source | 00011110000000000111010100110000 | $S \leftarrow$ DB bus |
| Destination | 00100110000000110011101101001000 | MQ register $\leftarrow R + S + Cn$ |
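
The sum and the four status results listed above can be modelled in Python as follows. The sketch assumes the 32-bit configuration and no shift (so all flags are taken directly from the ALU result); the function name `add32` and the dictionary layout are illustrative, not part of the device documentation.

```python
MASK32 = 0xFFFFFFFF


def add32(r: int, s: int, cn: int = 0):
    """Return (result, flags) for R + S + Cn on 32-bit words."""
    r &= MASK32
    s &= MASK32
    total = r + s + (cn & 1)
    result = total & MASK32
    flags = {
        "ZERO": int(result == 0),
        "N": result >> 31,                              # MSB of the result
        # Signed overflow: both operands differ in sign from the result.
        "OVR": ((r ^ result) & (s ^ result)) >> 31,
        "C": total >> 32,                               # carry-out of bit 31
    }
    return result, flags
```

`add32(0x0802C618, 0x1E007530, cn=0)` returns `0x26033B48` with all four flags clear, which agrees with the destination value in the example.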

## ADDI

## FUNCTION

Adds four-bit immediate data on A3-A0 with carry to S-bus data.

## DESCRIPTION

Immediate data in the range 0 to 15, supplied by the user at A3-A0, is added with carry to S.

Available R Bus Source Operands (Constant)

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | Yes | No | No |

Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Increments sum if set to one. |

## Status Signals

```
ZERO = 1 if result =0
    N = 1 if MSB = 1
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out = 1
```


## EXAMPLES (assumes a 32-bit configuration)

Add the value 12 to data on the DB bus with carry-in and store the result in register file 1.

| Instr Code <br> I7-I0 | Oprd Addr <br> A5-A0 | Oprd Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest Addr <br> C5-C0 | Destination Selects |  |  |  |  |  |  | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ |  |  |
| 01101000 | 001100 | XX XXXX | X 10 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |

Assume bits A5-A0 hold 0C (Hex) and DB bus holds 24000100 (Hex):

| Source | 00000000000000000000000000001100 | $R \leftarrow$ A5-A0 |
| ---: | ---: | ---: |
| Source | 00100100000000000000000100000000 | $S \leftarrow$ DB bus |
| Destination | 00100100000000000000000100001100 | $RF(1) \leftarrow R + S + Cn$ |

## AND

## FUNCTION

Evaluates the logical expression R AND S.

## DESCRIPTION

Data on the R bus is ANDed with data on the S bus. The result appears at the ALU and MQ shifters.

* The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.


## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Inactive |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if } M S B=1 \\
\text { OVR } & =0 \\
C & =0
\end{aligned}
$$

${ }^{\dagger}$ C is ALU carry out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLES (assumes a 32-bit configuration)

Logically AND the contents of register 3 and register 5 and store the result in register 5.

| Instr Code <br> I7-I0 | Oprd Addr <br> A5-A0 | Oprd Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest Addr <br> C5-C0 | Destination Selects |  |  |  |  |  |  | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ |  |  |
| 11111010 | 000011 | 000101 | 0 00 | 000101 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 3 holds F617D840 (Hex) and register file 5 holds 15F6D842 (Hex):

| Source | 11110110000101111101100001000000 | $R \leftarrow R F(3)$ |
| ---: | ---: | ---: |
| Source | 00010101111101101101100001000010 | $S \leftarrow R F(5)$ |
| Destination | 00010100000101101101100001000000 | $R F(5) \leftarrow R$ AND S |

## ANDNR

## FUNCTION

Computes the logical expression S AND NOT R.

## DESCRIPTION

The logical expression S AND NOT R is computed. The result appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Inactive |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
   N = 0
 OVR = 0
   C = 0
```

${ }^{\dagger}$ C is ALU carry out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Invert the contents of register 3, logically AND the result with data in register 5 and store the result in register 10.

| Instr Code <br> I7-I0 | Oprd Addr <br> A5-A0 | Oprd Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest Addr <br> C5-C0 | Destination Selects |  |  |  |  |  |  | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ |  |  |
| 11111110 | 000011 | 000101 | 0 00 | 001010 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 3 holds 15F6D840 (Hex) and register file 5 holds F617D842 (Hex):

| Source | 00010101111101101101100001000000 | $R \leftarrow RF(3)$ |
| ---: | ---: | ---: |
| Source | 11110110000101111101100001000010 | $S \leftarrow RF(5)$ |
| Destination | 11100010000000010000000000000010 | $RF(10) \leftarrow \overline{R}$ AND S |
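
The example can be checked with a one-line Python model. Python integers are unbounded, so the complement must be masked back to 32 bits; the function name `andnr32` is illustrative.

```python
MASK32 = 0xFFFFFFFF


def andnr32(r: int, s: int) -> int:
    """S AND NOT R on 32-bit words."""
    return (~r & s) & MASK32
```

`andnr32(0x15F6D840, 0xF617D842)` evaluates to `0xE2010002`, the destination value shown above.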

## BADD

Byte Add R to S with Carry

## FUNCTION

Adds S with carry-in to a selected byte or selected adjacent bytes of R.

## DESCRIPTION

$\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ are used to select bytes of R to be added to the corresponding bytes of S. A byte of R with $\overline{\text{SIO}}$ programmed low is selected for the computation of R + S + Cn. If the $\overline{\text{SIO}}$ signal for a byte of R is left high, the corresponding byte of S is passed unaltered. Multiple bytes can be selected only if they are adjacent to one another. At least one byte must be nonselected.
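
The byte-selection rule can be sketched in Python. The model rests on several assumptions that the description does not spell out: the selects are packed into a 4-bit value whose bit n is the level on SIOn, the carry-in is applied at the least significant selected byte, and the carry out of the most significant selected byte is reported separately rather than added into the next byte. The function name `badd` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def badd(r: int, s: int, sio_n: int, cn: int = 0):
    """Add the selected adjacent bytes of R to S; pass the other bytes of S.

    Returns (result, carry_out_of_selected_field).
    """
    selected = [n for n in range(4) if not (sio_n >> n) & 1]
    if not selected or len(selected) == 4:
        raise ValueError("select between one and three bytes")
    lo, hi = selected[0], selected[-1]
    if selected != list(range(lo, hi + 1)):
        raise ValueError("selected bytes must be adjacent")

    width = 8 * len(selected)
    shift = 8 * lo
    field = (1 << width) - 1
    total = ((r >> shift) & field) + ((s >> shift) & field) + (cn & 1)
    passed = s & ~(field << shift) & MASK32      # nonselected bytes come from S
    return passed | ((total & field) << shift), total >> width
```

As a usage illustration, `badd(0x2C018181, 0x7A8FBE3E, 0b1001, cn=1)` selects bytes 1 and 2; under the assumptions above the model returns `(0x7A91403E, 0)`.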

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands

| RF <br> （B5－B0） | DB－Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

Control／Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | Yes | Propagates through nonselected bytes; increments selected byte(s) if programmed high. |

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 1 if signed arithmetic overflow (selected bytes)
    C = 1 if carry-out (most significant selected byte) =1
```


## EXAMPLE（assumes a 32－bit configuration）

Add bytes 1 and 2 of register 3 with carry to the contents of register 1 and store the result in register 11.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01001000 | 000011 | 000001 | 0 00 | 001011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | 1001 | 0000 |

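Assume register file 3 holds 2C018181 (Hex) and register file 1 holds 7A8FBE3E (Hex):

Source $00101100000000011000000110000001 \quad \mathrm{Rn} \leftarrow \mathrm{RF}(3) \mathrm{n}$

Source $01111010100011111011111000111110 \quad \mathrm{Sn} \leftarrow \mathrm{RF}(1) \mathrm{n}$
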
ALU $10100110100100010100000011000000 \quad \mathrm{Fn} \leftarrow \mathrm{Rn}+\mathrm{Sn}+\mathrm{Cn}$

Destination $01111010100100010100000000111110 \quad \mathrm{RF}(11) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

```
† F = ALU result
  n = nth byte
```

Register file 11 gets F if byte selected， S if byte not selected．
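The byte-select rule can be sketched as follows. This is a minimal model under stated assumptions: `byte_add` is a hypothetical helper, `sio` packs $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ into four bits (0 = selected), and only the word written to the destination is modeled, not the intermediate ALU output for nonselected bytes.

```python
def byte_add(r: int, s: int, cn: int, sio: int) -> int:
    """Model the destination word of a byte add (illustrative only).

    sio bit n = 0 selects byte n for R + S + carry; sio bit n = 1 passes S.
    """
    out = 0
    carry = cn                       # Cn propagates through nonselected bytes
    for n in range(4):               # byte 0 is the least significant byte
        rb = (r >> (8 * n)) & 0xFF
        sb = (s >> (8 * n)) & 0xFF
        if (sio >> n) & 1:           # nonselected: S passes unaltered
            out |= sb << (8 * n)
        else:                        # selected: add with carry chaining
            total = rb + sb + carry
            out |= (total & 0xFF) << (8 * n)
            carry = total >> 8
    return out


word = byte_add(0x2C018181, 0x7A8FBE3E, cn=1, sio=0b1001)
print(f"{word:08X}")
```

With the example operands, bytes 2 and 1 of the returned word are 0x91 and 0x40, which agree with the corresponding bytes of the ALU line above, while bytes 3 and 0 are copied from S.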

## BAND

Byte AND R AND S (Byte Logical AND R AND S)

## FUNCTION

Evaluates the logical AND of selected bytes of R－bus and S－bus data．

## DESCRIPTION

Bytes with their corresponding $\overline{\text { SIO }}$ signals programmed low compute R AND S．Bytes with $\overline{\mathrm{SIO}}$ signals programmed high，pass S unaltered．Multiple bytes can be selected only if they are adjacent to one another．At least one byte must be nonselected．

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |

Available S Bus Source Operands

| RF <br> （B5－B0） | DB－Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control／Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Forced low |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | No | Inactive |

Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 0
    C = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Logically AND bytes 1 and 2 of register 3 with input on the DB bus; store the result in register 3.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11101000 | 000011 | XX XXXX | 0 10 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 | 1001 | 0000 |

Assume register file 3 holds 398FBEBE (Hex) and input on the DB port is 4290BFBF (Hex):

Source $00111001100011111011111010111110 \quad R n \leftarrow R F(3) n$

Source $01000010100100001011111110111111 \quad \mathrm{Sn} \leftarrow \mathrm{DBn}$

Destination $01000010100000001011111010111111 \quad \mathrm{RF}(3) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$
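The byte-select behavior of the logical byte instructions can be expressed as a mask merge. The sketch below is illustrative: `byte_logic` and its `sio` encoding (four bits, 0 = selected) are assumptions introduced here, not names from the device documentation.

```python
import operator

MASK32 = 0xFFFFFFFF


def byte_logic(op, r: int, s: int, sio: int) -> int:
    """Apply op(r, s) to selected bytes and pass S in nonselected bytes."""
    select = 0
    for n in range(4):
        if not (sio >> n) & 1:       # an SIO signal held low selects byte n
            select |= 0xFF << (8 * n)
    return (op(r, s) & select) | (s & ~select & MASK32)


# Operands of the example above: R = RF(3), S = DB port, bytes 1 and 2 selected.
print(f"{byte_logic(operator.and_, 0x398FBEBE, 0x4290BFBF, 0b1001):08X}")
```

Because the logical operations have no carry chain, the same helper works with `operator.or_` or `operator.xor` in place of `operator.and_`.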

[^8]
## FUNCTION

Converts a BCD number to binary.

## DESCRIPTION

This instruction allows the user to convert an N -digit BCD number to a 4 N -bit binary number in $4(N-1)$ plus 8 clocks. The instruction sums the $R$ and $S$ buses with carry.

A one-bit arithmetic left shift is performed on the ALU output. A zero is filled into bit 0 of the least significant byte unless $\overline{\mathrm{SIO0}}$ is set low, which would force bit 0 to one. Bit 7 of the most significant byte is dropped.

Simultaneously, the contents of the MQ register are rotated one bit to the left. Bit 7 of the most significant byte is rotated to bit 0 of the least significant byte.

## Recommended S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

## Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-BO) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | If high or floating, fills a zero in LSB of ALU shifter; if low, fills a one in LSB of ALU shifter. |
| $\overline{\mathrm{SIO1}}$ | No | Inactive in 32-bit configuration. Used in other configurations to select endfill in LSBs. |
| $\overline{\mathrm{SIO2}}$ | No | Inactive in 32-bit configuration. Used in other configurations to select endfill in LSBs. |
| $\overline{\mathrm{SIO3}}$ | No | Inactive in 32-bit configuration. Used in other configurations to select endfill in LSBs. |
| Cn | Yes | Should be programmed low for proper conversion. |

## Status Signals

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out = 1
```


## ALGORITHM

The following code converts an N -digit BCD number to a 4 N -bit binary number in $4(\mathrm{~N}-1)$ plus 8 clocks. This is one possible user generated algorithm. It employs the standard conversion formula for a BCD number (shown here for 32 bits):

$$
A B C D=[(A \times 10+B) \times 10+C] \times 10+D
$$

The conversion begins with the most significant BCD digit. Addition is performed in radix 2 .

## PSEUDOCODE

| Instruction | Comment |
| :--- | :--- |
| LOADMQ NUM | Load MQ with BCD number. |
| SUB ACC, ACC, SLCMQ | Clear accumulator; circular left shift MQ. |
| SUB MSK, MSK, SLCMQ | Clear mask register; circular left shift MQ. |
| SLCMQ | Circular left shift MQ. |
| SLCMQ | Circular left shift MQ. |
| ADDI ACC, MSK, 15 | Store 15 in mask register. |
| Repeat N-1 times: | (N = number of BCD digits) |
| AND MQ, MSK, R1, SLCMQ | Extract one digit; circular left shift MQ. |
| ADD ACC, R1, R1, SLCMQ | Add extracted digit to accumulator, and store result in R1; circular left shift MQ. |
| BCDBIN R1, R1, ACC | Perform BCDBIN instruction, and store result in accumulator [4 × (ACC + digit)]; circular left shift MQ. |
| BCDBIN ACC, R1, ACC | Perform BCDBIN instruction, and store result in accumulator [10 × (ACC + digit)]; circular left shift MQ. |
| (END REPEAT) | |
| AND MQ, MSK, R1 | Fetch last digit. |
| ADD ACC, R1, ACC | Add in last digit and store result in accumulator. |
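The arithmetic of the loop can be followed in a short sketch. It models only the values held in the accumulator and R1, not the MQ rotations or clock counts; `bcd_to_binary` is a hypothetical name, and each BCDBIN step is modeled as "add the two operands, then shift left one bit", following the DESCRIPTION.

```python
def bcd_to_binary(bcd: int, digits: int) -> int:
    """Convert a packed BCD value to binary, most significant digit first."""
    acc = 0
    for i in range(digits - 1, 0, -1):     # N-1 passes of the repeat loop
        digit = (bcd >> (4 * i)) & 0xF     # AND with the mask value 15
        r1 = acc + digit                   # ADD: R1 = ACC + digit
        acc = (r1 + r1) << 1               # first BCDBIN:  4 * (ACC + digit)
        acc = (acc + r1) << 1              # second BCDBIN: 10 * (ACC + digit)
    return acc + (bcd & 0xF)               # add in the last digit


print(bcd_to_binary(0x1234, 4))
```

The two add-and-shift steps together multiply by ten without a multiplier: the first yields 4X, and the second computes 2 × (4X + X) = 10X, where X is the accumulator plus the new digit.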

## FUNCTION

Evaluates $S^{\prime}+Cn$ for selected bytes of S.

## DESCRIPTION

Bytes with $\overline{\mathrm{SIO}}$ inputs programmed low compute $S^{\prime}+Cn$. Bytes with $\overline{\mathrm{SIO}}$ inputs programmed high pass S unaltered. Multiple bytes can be selected only if they are adjacent to one another. At least one byte must be nonselected.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands

| RF <br> （B5－BO） | DB－Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control／Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | Yes | Propagates through nonselected bytes; increments selected byte(s) if programmed high. |

## BINCNS

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 1 if signed arithmetic overflow (selected bytes)
    C = 1 if carry-out (most significant selected byte) =1
```


## EXAMPLE (assumes a 32 -bit configuration)

Invert bytes 0 and 1 of register 3 and add them to the carry (bytes 2 and 3 are not changed). Store the result in register 3.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11001000 | XX XXXX | 000011 | X 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | 1100 | 0000 |

Assume register file 3 holds A3018181 (Hex):
Source $10100011000000011000000110000001 \quad \mathrm{Sn} \leftarrow \operatorname{RF}(3) \mathrm{n}$

ALU $01011100111111100111111001111111 \quad \mathrm{Fn} \leftarrow \mathrm{S}^{\prime} \mathrm{n}+\mathrm{Cn}$

Destination $10100011000000010111111001111111 \quad \mathrm{RF}(3) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$
$^{\dagger}$ F = ALU result; n = nth byte. Register file 3 gets F if byte selected, S if byte not selected.
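A minimal sketch of the destination word for this example follows. The helper name `bincns` and the four-bit `sio` encoding (0 = selected) are assumptions made for illustration.

```python
def bincns(s: int, cn: int, sio: int) -> int:
    """Invert selected bytes of S and add the carry; pass other bytes."""
    out = 0
    carry = cn
    for n in range(4):                       # byte 0 is least significant
        sb = (s >> (8 * n)) & 0xFF
        if (sio >> n) & 1:                   # nonselected byte passes S
            out |= sb << (8 * n)
        else:                                # selected byte: S' + carry
            total = (~sb & 0xFF) + carry
            out |= (total & 0xFF) << (8 * n)
            carry = total >> 8
    return out


print(f"{bincns(0xA3018181, cn=1, sio=0b1100):08X}")
```

Byte 0 becomes 0x7E + 1 = 0x7F with no carry out, so byte 1 is simply the inverted value 0x7E; bytes 2 and 3 keep their original contents.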

## FUNCTION

Increments selected bytes of $S$ if the carry is set.

## DESCRIPTION

Bytes with $\overline{\mathrm{SIO}}$ inputs programmed low compute $\mathrm{S}+\mathrm{Cn}$. Bytes with $\overline{\mathrm{SIO}}$ inputs programmed high pass S unaltered. Multiple bytes can be selected only if they are adjacent to one another. At least one byte must be nonselected.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | Yes | Propagates through nonselected bytes; increments selected byte(s) if programmed high. |

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 1 if signed arithmetic overflow (selected bytes)
    C = 1 if carry-out (most significant selected byte) = 1
```


## EXAMPLE (assumes a 32 -bit configuration)

Add bytes 1 and 2 of register 7 to the carry (bytes 0 and 3 are not changed). Store the result in register 2.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10111000 | XX XXXX | 000111 | X 00 | 000010 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | 1001 | 0000 |

Assume register file 7 holds 408FBEBE (Hex):


Destination $01000000100011111011111110111110 \quad \mathrm{RF}(2) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

$^{\dagger}$ F = ALU result; n = nth byte. Register file 2 gets F if byte selected, S if byte not selected.

## FUNCTION

Converts a binary number to excess－3 representation．

## DESCRIPTION

This instruction converts an N-digit binary number to an N/4-digit excess-3 number representation in 2N + 3 clocks. The data on the R and S buses are added to the carry-in, which contains the most significant bit of the MQ register. The contents of the MQ register are rotated one bit to the left. The most significant bit is shifted out and passed to the least significant bit position. Depending on the configuration selected, this shift may be within the same byte or from the most significant byte to the least significant byte.

Recommended R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |

Recommended S Bus Source Operands

| RF <br> （B5－BO） | DB－Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> （C5－C0） | RF <br> （B5－BO） | Y－Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | Left |

## Control／Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | No | Inactive |
| $\overline{\mathrm{SIO1}}$ | No | Inactive |
| $\overline{\mathrm{SIO2}}$ | No | Inactive |
| $\overline{\mathrm{SIO3}}$ | No | Inactive |
| Cn | No | Holds MSB of MQ register. |

## Status Signals

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out = 1
```


## ALGORITHM

The following code converts an N-digit binary number to a N/4 digit excess-3 number in $2 N+3$ clocks. It employs the standard conversion formula for a binary number:

$$
\begin{aligned}
& a_{n} 2^{n}+a_{n-1} 2^{n-1}+a_{n-2} 2^{n-2}+\ldots+a_{0}= \\
& \quad\left\{\left[\left(2 a_{n}+a_{n-1}\right) \times 2+a_{n-2}\right] \times 2+\ldots+a_{1}\right\} \times 2+a_{0}
\end{aligned}
$$

The conversion begins with the most significant bit. Addition during the BINEX3 instruction is performed in radix 10 (excess-3).

| Instruction | Comment |
| :--- | :--- |
| LOADMQ NUM | Load MQ with binary number. |
| SUB ACC, ACC, ACC | Clear accumulator. |
| SET1 ACC, 33 (Hex) | Store 33 (Hex) in all bytes of accumulator. |
| Repeat N times: | (N = number of bits in binary number) |
| BINEX3 ACC, ACC, ACC | Double accumulator and add in most significant bit of MQ register. Circular left shift MQ. |
| EX3C ACC | Perform excess-3 correction. |
| (END REPEAT) | |
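The doubling loop can be modeled on a list of excess-3 nibbles. This is a sketch under stated assumptions: the function names are hypothetical, the nibble list is least significant digit first, and the correction rule (+3 after a nibble carry, -3 otherwise) is the usual excess-3 adjustment, assumed here to be what EX3C performs.

```python
def binex3_step(acc, carry_in):
    """One BINEX3 + EX3C pass over excess-3 nibbles (LS digit first)."""
    out = []
    carry = carry_in
    for nib in acc:
        total = nib + nib + carry      # plain binary doubling of the nibble
        carry = total >> 4             # nibble carry-out equals the decimal carry
        raw = total & 0xF
        # Excess-3 correction: add 3 after a carry, subtract 3 otherwise.
        out.append((raw + 3) & 0xF if carry else (raw - 3) & 0xF)
    return out


def binary_to_excess3(value, bits, digits):
    acc = [3] * digits                 # 3 in every nibble is excess-3 zero
    for i in range(bits - 1, -1, -1):  # most significant bit first
        acc = binex3_step(acc, (value >> i) & 1)
    return acc


def decode_excess3(acc):
    """Turn excess-3 nibbles back into an integer, for checking only."""
    return sum((nib - 3) * 10 ** i for i, nib in enumerate(acc))


print(decode_excess3(binary_to_excess3(1234, bits=16, digits=4)))
```

Starting from 3 in every nibble corresponds to loading 33 (Hex) into each byte of the accumulator, as in the SET1 step of the algorithm.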

## FUNCTION

Evaluates R OR S of selected bytes.

## DESCRIPTION

Bytes with $\overline{\text { SIO }}$ inputs programmed low evaluate R OR S. Bytes with $\overline{\mathrm{SIO}}$ inputs programmed high, pass $S$ unaltered. Multiple bytes can be selected only if they are adjacent to one another. At least one byte must be nonselected.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | No | Inactive |

Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 0
    C = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Logically OR bytes 1 and 2 of register 12 with bytes 1 and 2 on the DB bus. Concatenate with DB bytes 0 and 3, storing the result in register 12.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11111000 | 001100 | XX XXXX | 0 10 | 001100 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 | 1001 | 0000 |

Assume register file 12 holds 578FBEBE (Hex) and the DB bus holds 1C90BEBE (Hex):
Source $01010111100011111011111010111110 \quad R n \leftarrow R F(12) n$

Source $00011100100100001011111010111110 \quad \mathrm{Sn} \leftarrow \mathrm{DBn}$

Destination 00011100100111111011111010111110 $\mathrm{RF}(12) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

[^9]
## FUNCTION

Subtracts $R$ from $S$ in selected bytes．

## DESCRIPTION

Bytes with $\overline{\mathrm{SIO}}$ inputs programmed low compute $\mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn}$ ．Bytes with $\overline{\mathrm{SIO}}$ inputs programmed high，pass $S$ unaltered．Multiple bytes can be selected only if they are adjacent to one another．At least one byte must be nonselected．

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands

| RF <br> （B5－BO） | DB－Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control／Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | Yes | Propagates through nonselected bytes; should be set high for two's complement subtraction. |

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 1 if signed arithmetic overflow (selected bytes)
    C = 1 if carry-out (most significant selected byte) = 1
```


## EXAMPLE (assumes a 32-bit configuration)

Subtract bytes 1 and 2 of register 1 with carry from bytes 1 and 2 of register 3. Concatenate with bytes 0 and 3 of register 3, storing the result in register 11.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10101000 | 000001 | 000011 | 0 00 | 001011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | 1001 | 0000 |

Assume register file 1 holds 091B5858 (Hex) and register file 3 holds 703A9898 (Hex):
Source $00001001000110110101100001011000 \quad R n \leftarrow R F(1) n$
Source $01110000001110101001100010011000 \quad \mathrm{Sn} \leftarrow \mathrm{RF}(3) \mathrm{n}$
ALU $01100111000111110100000001000000 \quad \mathrm{Fn} \leftarrow \mathrm{R}^{\prime} \mathrm{n}+\mathrm{Sn}+\mathrm{Cn}$

Destination $01110000000111110100000010011000 \quad \mathrm{RF}(11) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

$^{\dagger}$ F = ALU result; n = nth byte. Register file 11 gets F if byte selected, S if byte not selected.
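The example can be reproduced with a small model of the selected-byte subtraction. The helper name `bsubr` and the four-bit `sio` encoding (0 = selected) are assumptions for illustration; only the destination word is modeled.

```python
def bsubr(r: int, s: int, cn: int, sio: int) -> int:
    """Compute R' + S + carry in selected bytes; pass S in the others."""
    out = 0
    carry = cn
    for n in range(4):                        # byte 0 is least significant
        rb = (r >> (8 * n)) & 0xFF
        sb = (s >> (8 * n)) & 0xFF
        if (sio >> n) & 1:                    # nonselected byte passes S
            out |= sb << (8 * n)
        else:                                 # selected byte: R' + S + carry
            total = (~rb & 0xFF) + sb + carry
            out |= (total & 0xFF) << (8 * n)
            carry = total >> 8
    return out


word = bsubr(0x091B5858, 0x703A9898, cn=1, sio=0b1001)
print(f"{word:08X}")
# With Cn = 1 the selected 16-bit field equals S - R modulo 2**16.
print(f"{(0x3A98 - 0x1B58) & 0xFFFF:04X}")
```

Setting Cn high supplies the "+1" of the two's complement, which is why the selected field matches an ordinary subtraction of the same bytes.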

## FUNCTION

Subtracts $S$ from $R$ in selected bytes.

## DESCRIPTION

Bytes with $\overline{\mathrm{SIO}}$ inputs programmed low compute $R+\mathrm{S}^{\prime}+\mathrm{Cn}$. Bytes with $\overline{\mathrm{SIO}}$ inputs programmed high, pass $S$ unaltered. Multiple bytes can be selected only if they are adjacent to one another. At least one byte must be nonselected.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Available S Bus Source Operands

| RF <br> $(B 5-B 0)$ | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | Yes | Propagates through nonselected bytes; should be set high for two's complement subtraction. |

Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 1 if signed arithmetic overflow (selected bytes)
    C = 1 if carry-out (most significant selected byte) = 1
```


## EXAMPLE（assumes a 32－bit configuration）

Subtract bytes 1 and 2 of register 3 with carry from bytes 1 and 2 of register 1 ． Concatenate with bytes 0 and 3 of register 3，storing the result in register 11.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10011000 | 000001 | 000011 | 0 00 | 001011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | 1001 | 0000 |

Destination $01010010010011100010000010111000 \quad \mathrm{RF}(11) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

$^{\dagger}$ F = ALU result; n = nth byte. Register file 11 gets F if byte selected, S if byte not selected.

## FUNCTION

Evaluates $R$ exclusive OR $S$ in selected bytes.

## DESCRIPTION

Bytes with $\overline{\text { SIO }}$ inputs programmed low evaluate R exclusive OR S. Bytes with $\overline{\text { SIO }}$ inputs programmed high, pass $S$ unaltered. Multiple bytes can be selected only if they are adjacent to one another. At least one byte must be nonselected.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO1}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO2}}$ | Yes | Byte select |
| $\overline{\mathrm{SIO3}}$ | Yes | Byte select |
| Cn | No | Inactive |

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
OVR = 0
    C = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Exclusive OR bytes 1 and 2 of register 6 with bytes 1 and 2 on the DB bus; concatenate the result with DB bytes 0 and 3, storing the result in register 10.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\mathrm{EA}}$ EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\mathrm{WE3}}$-$\overline{\mathrm{WE0}}$ | SELRF1-SELRF0 | $\overline{\mathrm{OEA}}$ | $\overline{\mathrm{OEB}}$ | $\overline{\mathrm{OEY3}}$-$\overline{\mathrm{OEY0}}$ | $\overline{\mathrm{OES}}$ | Cn | CF2-CF0 | $\overline{\mathrm{SIO3}}$-$\overline{\mathrm{SIO0}}$ | $\overline{\mathrm{IESIO3}}$-$\overline{\mathrm{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11011000 | 000110 | XX XXXX | 0 10 | 001010 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | 1001 | 0000 |

Assume register file 6 holds 938FBEBE (Hex) and the DB bus holds 4190BEBE (Hex):


Destination $01000001000111110000000010111110 \quad \mathrm{RF}(10) \mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

$^{\dagger}$ F = ALU result; n = nth byte. Register file 10 gets F if byte selected, S if byte not selected.

## FUNCTION

Forces ALU output to zero and clears the BCD flip-flops.

## DESCRIPTION

ALU output is forced to zero and the BCD flip-flops are cleared.
${ }^{\dagger}$ This instruction may also be coded with the following opcodes:
$[2][F],[3][F],[4][F],[6][F],[B][F],[C][F],[E][F]$

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

## Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| No | No | No |

## Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Status Signals

```
ZERO = 1
    N = 0
    OVR=0
    Cn = 0
```


## FUNCTION

Evaluates R exclusive OR S for use with cyclic redundancy check codes.

## DESCRIPTION

Data on the R bus is exclusive ORed with data on the S bus. If MQ0 XNORed with S0 is zero (MQ0 is the LSB of the MQ register and S0 is the LSB of S-bus data), the result is sent to the ALU shifter. Otherwise, data on the S bus is sent to the ALU shifter.

A right shift is performed; the MSB is filled with R0 (MQ0 XOR S0), where R0 is the LSB of R-bus data. A circular right shift is performed on MQ data.

Recommended R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

## Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-BO) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Right | Right |

Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\mathrm{SSF}}$ | No | Inactive |
| $\overline{\mathrm{SIO0}}$ | No | Inactive |
| $\overline{\mathrm{SIO1}}$ | No | Inactive |
| $\overline{\mathrm{SIO2}}$ | No | Inactive |
| $\overline{\mathrm{SIO3}}$ | No | Inactive |
| Cn | No | Inactive |

## Status Signals

```
ZERO = 1 if result = 0
    N = 0
OVR = 0
    Cn = 0
```


## CYCLIC REDUNDANCY CHARACTER CHECK DESCRIPTION

Serial binary data transmitted over a channel is susceptible to error bursts．These bursts may be detected and corrected by standard encoding methods such as cyclic redundancy check codes，fire codes，or computer generated codes．These codes all divide the message vector by a generator polynomial to produce a remainder that contains parity information about the message vector．

If a message vector of $m$ bits, $a(x)$, is divided by a generator polynomial, $g(x)$, of order $k-1$, a $k$-bit remainder, $r(x)$, is formed. The code vector, $c(x)$, consisting of $a(x)$ and $r(x)$, of length $n=m+k$, is transmitted down the channel. The receiver divides the received vector by $g(x)$.

After $m$ divide iterations，$r(x)$ will be regenerated only if there is no error in the message bits．After $k$ more iterations，the result will be zero if and only if no error has occurred in either the message or the remainder．

## ALGORITHM

An algorithm for a cyclic redundancy character check，using the＇ACT8832 as a receiver，is given below：

| Instruction | Comment |
| :--- | :--- |
| LOADMQ VEC(X) | Load MQ with first 32 message bits of received vector $c^{\prime}(x)$. |
| LOAD POLY | Load register with polynomial $g(x)$. |
| CLEAR SUM | Clear register acting as accumulator. |
| REPEAT (n/32) TIMES: | |
| SUM = SUM CRC POLY | Perform CRC instruction where R Bus = POLY, S Bus = SUM. Store result in SUM. |
| LOADMQ VEC(X) | Load MQ with next 32 message bits of received vector $c^{\prime}(x)$. |
| (END REPEAT) | |

## CRC Cyclic Redundancy Character Accumulation

SUM now contains the remainder $\left[r^{\prime}(x)\right]$ of $c^{\prime}(x)$. A syndrome generation routine may be called next, if required.

Note that the most significant bit of

$$
g(x)=\left(g_{k-1}\right)\left(x^{k-1}\right)+\left(g_{k-2}\right)\left(x^{k-2}\right)+\ldots\left(g_{0}\right)\left(x^{0}\right)
$$

is implied and that $\operatorname{POLY}(0)$ is set to zero if the length of $g(x)$ requires fewer bits than are in the machine word width.
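The single-bit step in the DESCRIPTION can be modeled in software and compared with a textbook LSB-first shift/XOR loop. The sketch rests on several assumptions: the fill term "R0 (MQ0 XOR S0)" is read as a logical AND, one word is consumed by applying the step 32 times, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def crc_step(r: int, s: int, mq: int):
    """One step as the DESCRIPTION reads; returns (new_s, new_mq)."""
    fb = (mq ^ s) & 1                       # MQ0 XOR S0
    f = (r ^ s) if fb else s                # ALU output sent to the shifter
    msb = (r & 1) & fb                      # assumed fill bit: R0 AND (MQ0 XOR S0)
    new_s = ((f >> 1) | (msb << 31)) & MASK32
    new_mq = ((mq >> 1) | ((mq & 1) << 31)) & MASK32   # circular right shift
    return new_s, new_mq


def crc_word(poly: int, total: int, word: int) -> int:
    """Apply the step to all 32 bits of one message word."""
    for _ in range(32):
        total, word = crc_step(poly, total, word)
    return total


def reference_word(poly: int, total: int, word: int) -> int:
    """Textbook LSB-first shift/XOR loop, used only for comparison."""
    taps = ((poly >> 1) | ((poly & 1) << 31)) & MASK32  # poly rotated right once
    for i in range(32):
        fb = (total ^ (word >> i)) & 1
        total = (total >> 1) ^ (taps if fb else 0)
    return total


print(f"{crc_word(0xEDB88320, 0, 0x12345678):08X}")
```

Under this reading, the shift with R0 fed into the MSB is equivalent to XORing the shifted accumulator with the polynomial word rotated right by one bit, which is why the two functions agree.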

## FUNCTION

Corrects the remainder of a nonrestoring division routine if correction is required.

## DESCRIPTION

DIVRF tests the result of the final step in a nonrestoring division iteration: SDIVIT (for signed division) or UDIVIT (for unsigned division). An error in the remainder results when it is nonzero and the signs of the remainder and the dividend are different.

The R bus must be loaded with the divisor and the S bus with the most significant half of the previous result. The least significant half is in the MQ register. The Y bus result must be stored in the register file for use during the subsequent SDIVQF instruction.

DIVRF tests to determine whether a fix is required and evaluates:

$Y \leftarrow S+R^{\prime}+1$ if a fix is necessary

$Y \leftarrow S+R+0$ if a fix is unnecessary

Overflow is reported to OVR at the end of the division routine (after SDIVQF).

Recommended R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | No | No |

Recommended S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Should be programmed high |

## Status Signals

$$
\begin{aligned}
\text { ZERO } & =1 \text { if remainder }=0 \\
N & =0 \\
\text { OVR } & =0 \\
C n & =1 \text { if carry-out }=1
\end{aligned}
$$

## FUNCTION

Tests the two most significant bits of a double-precision number. If they are the same, shifts the number to the left.

## DESCRIPTION

This instruction is used to normalize a two's complement, double-precision number by shifting the number one bit to the left and filling a zero into the LSB unless $\overline{\text{SIO0}}$ is low. The S bus holds the most significant half; the MQ register holds the least significant half.

Normalization is complete when overflow occurs. The shift is inhibited whenever normalization is attempted on a number already normalized.
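The normalization step can be modeled in a few lines. This is a behavioral sketch only, assuming a 32-bit configuration; the function names and the returned `done` flag are hypothetical and do not model the exact timing of the OVR pin.

```python
WORD = 32
MASK64 = (1 << (2 * WORD)) - 1


def dnorm_step(msh: int, lsh: int, sio0_low: bool = False):
    """One normalization step on the pair (MSH on S bus, LSH in MQ)."""
    value = (msh << WORD) | lsh
    top_two = value >> (2 * WORD - 2)
    if top_two in (0b01, 0b10):
        return msh, lsh, True                 # already normalized: shift inhibited
    fill = 1 if sio0_low else 0               # SIO0 low fills a one into the LSB
    value = ((value << 1) & MASK64) | fill
    return value >> WORD, value & (2**WORD - 1), False


def normalize(msh: int, lsh: int):
    """Repeat the step until the two most significant bits differ."""
    shifts = 0
    for _ in range(2 * WORD - 1):             # bounded, so zero input terminates
        msh, lsh, done = dnorm_step(msh, lsh)
        if done:
            break
        shifts += 1
    return msh, lsh, shifts


first = dnorm_step(0xFA75D84E, 0x37F6D843)
full = normalize(0xFA75D84E, 0x37F6D843)
```

For the operands FA75D84E and 37F6D843 (Hex), one step yields F4EBB09C and 6FEDB086; four steps are needed before the two most significant bits differ.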

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Recommended S Bus Source Operands (MSH)

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| Yes | No | No |

Recommended Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |

Shift Operations (conditional)

| ALU | MQ |
| :---: | :---: |
| Left | Left |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | Yes | When low, selects a one end-fill bit in LSB |
| $\overline{\text{SIO1}}$ | No | Passes internally generated end-fill bits |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | No |  |

## Status Signals

```
ZERO = 1 if result =0
    N = 1 if MSB = 1
    OVR = 1 if MSB XOR 2nd MSB = 1
        Cn = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Normalize a double-precision number.
(This example assumes that the MSH of the number to be normalized is in register 3 and the LSH is in the MQ register. The zero on the OVR pin at the end of the instruction cycle indicates that normalization is not complete and the instruction should be repeated).

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00110000 | XX XXXX | 000011 | X 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 3 holds FA75D84E (Hex) and MQ register holds 37F6D843 (Hex):


[^10]
## FUNCTION

Output contents of the divide/BCD flip-flops.

## DESCRIPTION

The contents of the divide/BCD flip-flops are passed through the MQ register to the Y output multiplexer.

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| No | No | No |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| No | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Status Signals

$$
\begin{aligned}
\mathrm{ZERO} & =0 \\
N & =0 \\
\text { OVR } & =0 \\
C n & =0
\end{aligned}
$$

## DUMPFF

EXAMPLE (assumes a 32-bit configuration)

Dump divide/BCD flip-flops to Y output.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01011111 | XX XXXX | XX XXXX | X XX | XX XXXX | 1 | XXXX | XX | X | X | 0000 | X | X | 110 |

Assume divide/BCD flip-flops contain 2A055470 (Hex):

Source
00101010000001010101010001110000
MQ register $\leftarrow$ Divide/BCD flip-flops

Destination
00101010000001010101010001110000
Y output $\leftarrow$ MQ register

## FUNCTION

Corrects the result of excess-3 addition or subtraction in selected bytes.

## DESCRIPTION

This instruction corrects excess-3 additions or subtractions in the byte mode. For correct excess-3 arithmetic, this instruction must follow each add or subtract. The operand must be on the S bus.

Data on the S bus is added to a constant on the R bus determined by the state of the BCD flip-flops and the previous overflow condition reported on the SSF pin. Bytes with $\overline{\text{SIO}}$ inputs programmed low evaluate the correct excess-3 representation. Bytes with $\overline{\text{SIO}}$ inputs programmed high or floating pass S unaltered.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| Yes | No | No |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| No | No |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | Yes | Byte select |
| $\overline{\text{SIO1}}$ | Yes | Byte select |
| $\overline{\text{SIO2}}$ | Yes | Byte select |
| $\overline{\text{SIO3}}$ | Yes | Byte select |
| Cn | No | Inactive |

## Status Signals

$$
\begin{aligned}
\text { ZERO } & =0 \\
N & =0 \\
\text { OVR } & =1 \text { if arithmetic signed overflow } \\
C n & =1 \text { if carry-out }=1
\end{aligned}
$$

## EXAMPLE (assumes a 32-bit configuration)

Add two BCD numbers and store the sum in register 3. Assume data comes in on DB bus.

1. Clear accumulator (SUB ACC, ACC)
2. Store 33 (Hex) in all bytes of register (SET1 R2, H/33/)
3. Add 33 (Hex) to selected bytes of first BCD number (BADD DB, R2, R1)
4. Add 33 (Hex) to selected bytes of second BCD number (BADD DB, R2, R3)
5. Add selected bytes of registers 1 and 3 (BADD, R1, R3, R3)
6. Correct the result (EX3BC, R3, R3)

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110010 | 000010 | XX XXXX | 0 XX | 000010 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 | XXXX | XXXX |
| 00001000 | 000010 | XX XXXX | 0 XX | 000010 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 | XXXX | XXXX |
| 10001000 | 000010 | XX XXXX | 0 10 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1100 | 0000 |
| 10001000 | 000010 | XX XXXX | 0 10 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1100 | 0000 |
| 10001000 | 000001 | 000011 | 0 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1100 | 0000 |
| 10001111 | XX XXXX | 000011 | X 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1100 | 0000 |

Assume DB bus holds 51336912 at third instruction and 34867162 at fourth instruction.

Results of Instruction Cycles:

| Cycle | Result | Operation |
| :---: | :---: | :--- |
| 1 | 00000000000000000000000000000000 | RF(2) $\leftarrow$ 0 |
| 2 | 00000000000000000011001100110011 | RF(2) $\leftarrow$ 00003333 (Hex) |
| 3 |  | RF(1) $\leftarrow$ RF(2) + DB |
| 4 |  | RF(3) $\leftarrow$ RF(2) + DB |
| 5 |  | RF(3)n $\leftarrow$ RF(1)n + RF(3)n |
| 6 | 00110100100001100100000001110100 | RF(3)n $\leftarrow$ Corrected RF(3)n result |

## FUNCTION

Corrects the result of excess-3 addition or subtraction.

## DESCRIPTION

This instruction corrects excess-3 additions or subtractions in the word mode. For correct excess-3 arithmetic, this instruction must follow each add or subtract. The operand must be on the S bus.

Data on the S bus is added to a constant on the R bus determined by the state of the BCD flip-flops and the previous overflow condition reported on the SSF pin.

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| Yes | No | No |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| No | No |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Inactive |

## Status Signals

$$
\begin{aligned}
\text { ZERO } & =0 \\
N & =1 \text { if MSB }=1 \\
\text { OVR } & =1 \text { if arithmetic signed overflow } \\
C n & =1 \text { if carry-out }=1
\end{aligned}
$$

## EXAMPLE (assumes a 32-bit configuration)

Add two BCD numbers and store the sum in register 3. Assume data comes in on DB bus.

1. Clear accumulator (SUB ACC, ACC)
2. Store 33 (Hex) in all bytes of register (SET1 R2, H/33/)
3. Add 33 (Hex) to all bytes of first BCD number (ADD DB, R2, R1)
4. Add 33 (Hex) to all bytes of second BCD number (ADD DB, R2, R3)
5. Add the excess-3 data (ADD, R1, R3, R3)
6. Correct the excess-3 result (EX3C, R3, R3)
7. Subtract the excess-3 bias to go to BCD result.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110010 | 000010 | XX XXXX | 0 XX | 000010 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |
| 00001000 | 000010 | XX XXXX | 0 XX | 000010 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |
| 11110001 | 000010 | XX XXXX | 0 10 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |
| 11110001 | 000010 | XX XXXX | 0 10 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |
| 11110001 | 000001 | 000011 | 0 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |
| 10011111 | XX XXXX | 000011 | X 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |
| 11110010 | 000010 | 000011 | 0 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |

Assume DB bus holds 51336912 at third instruction and 34867162 at fourth instruction.

Results of Instruction Cycles:

| Cycle | Result | Operation |
| :---: | :---: | :--- |
| 1 | 00000000000000000000000000000000 | RF(2) $\leftarrow$ 0 |
| 2 | 00110011001100110011001100110011 | RF(2) $\leftarrow$ 33333333 (Hex) |
| 3 | 10000100011001101001110001000101 | RF(1) $\leftarrow$ RF(2) + DB |
| 4 | 01100111101110011010010010010101 | RF(3) $\leftarrow$ RF(2) + DB |
| 5 | 11101100001000000100000011011010 | RF(3) $\leftarrow$ RF(1) + RF(3) |
| 6 | 10111001010100110111001110100111 | RF(3) $\leftarrow$ Corrected RF(3) result |
| 7 | 10000110001000000100000001110100 | RF(3) $\leftarrow$ RF(3) - RF(2) |
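The arithmetic of this example can be checked with a short software model. It is a sketch under stated assumptions, not a description of the device internals: the function name is hypothetical, and the per-digit carries are tracked in a loop where the hardware is described as using the BCD flip-flops. Each digit that produced a carry has 3 added back; each digit that did not has 3 subtracted.

```python
def excess3_add(a_bcd: int, b_bcd: int, digits: int = 8):
    """Add two packed-BCD words via excess-3; return (raw sum, corrected, BCD)."""
    mask = (1 << (4 * digits)) - 1
    bias = int("3" * digits, 16)              # 33333333 (Hex) for eight digits
    a = (a_bcd + bias) & mask                 # cycle 3: bias first operand
    b = (b_bcd + bias) & mask                 # cycle 4: bias second operand
    raw = (a + b) & mask                      # cycle 5: plain binary add
    corrected = 0
    carry = 0
    for i in range(digits):                   # cycle 6: per-digit correction
        total = ((a >> 4 * i) & 0xF) + ((b >> 4 * i) & 0xF) + carry
        carry = total >> 4                    # carry out of this digit
        nibble = total & 0xF
        nibble = (nibble + 3) & 0xF if carry else (nibble - 3) & 0xF
        corrected |= nibble << (4 * i)
    bcd = (corrected - bias) & mask           # cycle 7: remove the bias
    return raw, corrected, bcd


raw, corrected, bcd = excess3_add(0x51336912, 0x34867162)
```

The three returned values correspond to cycles 5, 6, and 7 of the example: EC2040DA, B95373A7, and 86204074 (Hex).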

## FUNCTION

Evaluates $\mathrm{R}^{\prime}+\mathrm{Cn}$.

## DESCRIPTION

Data on the R bus is inverted and added with carry. The result appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |

Available S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| No | No | No |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port | ALU Shifter | MQ Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| Cn | Yes | Increments if programmed high. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
   N = 1 if MSB = 1
 OVR = 1 if signed arithmetic overflow
   C = 1 if carry-out = 1
```

${ }^{\dagger}$ C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Convert the data on the DA bus to two's complement and store the result in register 4.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110111 | XX XXXX | XX XXXX | 1 XX | 000100 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume the DA bus holds 3791FEF6 (Hex):

Source
00110111100100011111111011110110
$R \leftarrow DA$

Destination
11001000011011100000000100001010
$\mathrm{RF}(4) \leftarrow \mathrm{R}^{\prime}+\mathrm{Cn}$
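The inversion-plus-carry operation and its status outputs can be sketched as follows. The function name and the flag dictionary are hypothetical, a 32-bit configuration with no shift in the upper nibble is assumed, and the flags follow the definitions listed under Status Signals.

```python
MASK = 0xFFFFFFFF


def inc_neg(operand: int, cn: int):
    """Evaluate operand' + Cn on 32 bits and derive the status flags."""
    inverted = ~operand & MASK                # one's complement of the operand
    total = inverted + cn
    result = total & MASK
    flags = {
        "ZERO": int(result == 0),
        "N": result >> 31,
        # Adding 0 or 1 overflows only when a positive value turns negative.
        "OVR": int((inverted >> 31) == 0 and (result >> 31) == 1),
        "C": total >> 32,
    }
    return result, flags


twos, _ = inc_neg(0x3791FEF6, cn=1)           # Cn high: two's complement
ones, _ = inc_neg(0x3791FEF6, cn=0)           # Cn low: one's complement
```

With Cn high the operand 3791FEF6 (Hex) yields C86E010A, matching the example; with Cn low the same call yields the one's complement, C86E0109.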

## FUNCTION

Evaluates $S^{\prime}+C n$.

## DESCRIPTION

Data on the S bus is inverted and added to the carry. The result appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

## Available S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port | ALU Shifter | MQ Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| Cn | Yes | Increments if programmed high. |


INCNS (\* 5): Increment Negative S using Carry ($S^{\prime}+Cn$)

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
   N = 1 if MSB = 1
 OVR = 1 if signed arithmetic overflow
   C = 1 if carry-out = 1
```

${ }^{\dagger}$ C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Convert the data in the MQ register to one's complement and store the result in register 4.

Assume MQ register holds 3791FEF6 (Hex):

Source
00110111100100011111111011110110
$\mathrm{S} \leftarrow \mathrm{MQ}$ register

Destination
11001000011011100000000100001001
$\mathrm{RF}(4) \leftarrow \mathrm{S}^{\prime}+\mathrm{Cn}$

## FUNCTION

Increments R if the carry is set.

## DESCRIPTION

Data on the R bus is added to the carry. The sum appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands (MSH)

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| No | No | No |

## Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port | ALU Shifter | MQ Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| Cn | Yes | Increments R if programmed high. |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if MSB }=1 \\
\text { OVR } & =1 \text { if signed arithmetic overflow } \\
C n & =1 \text { if carry-out }=1
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Increment the data on the DA bus and store the result in register 4.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110110 | XX XXXX | XX XXXX | XX | 000100 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume the DA bus holds 3791FEF6 (Hex):

Source
00110111100100011111111011110110
$\mathrm{R} \leftarrow \mathrm{DA}$

Destination
00110111100100011111111011110111
$\mathrm{RF}(4) \leftarrow \mathrm{R}+\mathrm{Cn}$

## FUNCTION

Increments S if the carry is set.

## DESCRIPTION

Data on the S bus is added to the carry. The sum appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands

| RF (B5-B0) | DB-Port | MQ Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port | ALU Shifter | MQ Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instructions programmed in bits I7-I4 of instruction field. |
| Cn | Yes |  |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
   N = 1 if MSB = 1
 OVR = 1 if signed arithmetic overflow
   C = 1 if carry-out = 1
```

${ }^{\dagger}$ C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Increment the data in the MQ register and store the result in register 4.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110100 | XX XXXX | XX XXXX | X 11 | 000100 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume MQ register holds 54FF00FF (Hex):

Source
01010100111111110000000011111111
$S \leftarrow$ MQ register

Destination
01010100111111110000000100000000
$\mathrm{RF}(4) \leftarrow \mathrm{S}+\mathrm{Cn}$

## FUNCTION

Load divide/BCD flip-flops from external data input.

## DESCRIPTION

Uses an internal bypass path to load data from the S MUX directly into the divide/BCD flip-flops.

## Available R Bus Source Operands

| RF (A5-A0) | A3-A0 Immed | DA-Port | C3-C0 :: A3-A0 Mask |
| :---: | :---: | :---: | :---: |

Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port | ALU Shifter | MQ Shifter |
| :---: | :---: | :---: | :---: | :---: |
| No | No | No | No | No |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Inactive |



Status Signals

$$
\begin{aligned}
\text { ZERO } & =0 \\
N & =0 \\
\text { OVR } & =0 \\
C & =0
\end{aligned}
$$

## EXAMPLE (assumes a 32-bit configuration)

Load the divide/BCD flip-flops with data from the DB input bus.

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00001111 | XX XXXX | XX XXXX | X 10 | XX XXXX | X | XXXX | XX | X | X | XXXX | X | X | 110 |

Assume DB input holds 2A08C618 (Hex):

Source
00101010000010001100011000011000
$S \leftarrow \mathrm{DB}$ bus

Destination
00101010000010001100011000011000
Divide/BCD flip-flops $\leftarrow$ S

## LOADMQ

## FUNCTION

Passes the result of the ALU instruction specified in the lower nibble of the instruction field to Y and the MQ register.

## DESCRIPTION

The result of the arithmetic or logical operation specified in the lower nibble of the instruction field (I3-I0) is passed unshifted to Y and the MQ register.

*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| None | None |

## Available Destination Operands

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Outputs MQ0 (LSB) |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Inactive |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
\mathrm{~N} & =1 \text { if MSB of result }=1 \\
& =0 \text { if MSB of result }=0 \\
\text { OVR } & =1 \text { if signed arithmetic overflow } \\
C & =1 \text { if carry-out }=1
\end{aligned}
$$

[^11]

LOADMQ (E \*): Pass ($Y \leftarrow F$) and Load MQ with F

## EXAMPLE (assumes a 32-bit configuration)

Load the MQ register with data from register 1, and pass the data to the Y port. (In this example, data is passed to the ALU by an INCR instruction without carry-in.)

| Instr Code I7-I0 | Oprd Addr A5-A0 | Oprd Addr B5-B0 | Oprd Sel $\overline{\text{EA}}$, EB1-EB0 | Dest Addr C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110110 | 000001 | XX XXXX | 0 XX | XX XXXX | 0 | XXXX | XX | X | X | XXXX | 0 | 0 | 110 |

Assume register file 1 holds 2A08C618 (Hex):

Source
00101010000010001100011000011000
$R \leftarrow RF(1)$

Destination
00101010000010001100011000011000
MQ register $\leftarrow \mathrm{R}+\mathrm{Cn}$

## MQSLC

## FUNCTION

Passes the result of the ALU instruction specified in the lower nibble of the instruction field to Y MUX. Performs a circular left shift on MQ.

## DESCRIPTION

The result of the arithmetic or logical operation specified in the lower nibble of the instruction field (I3-I0) is passed unshifted to Y MUX.

The contents of the MQ register are rotated one bit to the left. The MSB is rotated out and passed to the LSB of the same word, which may be 1, 2, or 4 bytes long.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the MQ register. If SSF is low, the MQ register will not be altered.

*A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| None | Circular Left |

Available Destination Operands (ALU Shifter)

| RF (C5-C0) | RF (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :---: | :---: | :---: |
| SSF | Yes | Passes shift result if high or floating; retains MQ without shift if low. |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Affects arithmetic operation programmed in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
   N = 1 if MSB of result = 1
     = 0 if MSB of result = 0
 OVR = 1 if signed arithmetic overflow
   C = 1 if carry-out = 1
```

${ }^{\dagger}$ C is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Add data in register 1 to data on the DB bus with carry-in and store the unshifted result in register 1 . Circular shift the contents of the MQ register one bit to the left.

| Instr <br> Code $17-10$ | Oprd <br> Addr <br> A5-AO | Oprd <br> Addr B5-B0 | Oprd Sel $\begin{aligned} & E B 1- \\ & \overline{E A} E B O \end{aligned}$ | Dest <br> Addr C5-CO | SELMO | $\overline{\text { WEZ }} \overline{\text { WEO }}$ | Destinatio <br> SELRF1- <br> SELRFO | n Sele $\overline{O E A}$ | ects <br> $\overline{O E B}$ | $\begin{aligned} & \overline{\mathrm{OEYS}}- \\ & \overline{\mathrm{OEYO}} \end{aligned}$ | $\overline{\text { OES }}$ | Cn | $\begin{aligned} & \text { CF2- } \\ & \text { CFO } \end{aligned}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11010001 | 000001 | XX XXXX | 010 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume register file 1 holds 2508C618 (Hex), DB bus holds 11007530 (Hex), and MO register holds 4DA99A0E (Hex).

| Source | 00100101000010001100011000011000 | $R \leftarrow RF(1)$ |
| :---: | :---: | :---: |
| Source | 00010001000000000111010100110000 | $S \leftarrow$ DB bus |
| Destination | 00110110000010010011101101001001 | $RF(1) \leftarrow R + S + Cn$ |
| Source | 01001101101010011001101000001110 | MQ shifter $\leftarrow$ MQ register |
| Destination | 10011011010100110011010000011100 | MQ register $\leftarrow$ MQ shifter |
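The figures in this example can be checked with a short Python sketch. It models only the data path described here (a 32-bit add with carry-in and a one-bit circular left shift of MQ); the function names are illustrative and do not come from the manual.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed


def add_with_carry(r, s, cn):
    """Return the 32-bit sum R + S + Cn; the carry-out is discarded."""
    return (r + s + cn) & MASK32


def rotate_left_1(word):
    """Circular left shift by one bit: the MSB re-enters at the LSB."""
    return ((word << 1) | (word >> 31)) & MASK32


rf1 = add_with_carry(0x2508C618, 0x11007530, 1)
mq = rotate_left_1(0x4DA99A0E)
print(f"{rf1:08X} {mq:08X}")  # 36093B49 9B53341C
```

Both values agree with the binary destination rows of the example.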

## FUNCTION

Passes the result of the ALU instruction specified in the lower nibble of the instruction field to Y MUX. Performs a left shift on MQ.

## DESCRIPTION

The result of the arithmetic or logical operation specified in the lower nibble of the instruction field (I3-I0) is passed unshifted to Y MUX.

The contents of the MQ register are shifted one bit to the left. A zero is filled into the least significant bit of each word unless the $\overline{\mathrm{SIO}}$ input for that word is programmed low; this will force the least significant bit to one. The MSB is dropped from each word, which may be 1, 2, or 4 bytes long, depending on the configuration selected.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| None | Logical Left |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shift result if high or floating; retains MQ without shift if low. |
| $\overline{\text{SIO0}}$ | Yes | Fills a zero in LSB of MQ shifter if high or floating; sets LSB to one if low. |
| $\overline{\text{SIO1}}$ | No | Inactive in 32-bit configuration; used in other configurations to select end-fill in LSBs. |
| $\overline{\text{SIO2}}$ | No | Inactive in 32-bit configuration; used in other configurations to select end-fill in LSBs. |
| $\overline{\text{SIO3}}$ | No | Inactive in 32-bit configuration; used in other configurations to select end-fill in LSBs. |
| Cn | No | Affects arithmetic operation programmed in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
        = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow
        C = 1 if carry-out = 1
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Add data in register 7 to data on the DB bus with carry-in and store the unshifted result in register 7. Shift the contents of the MQ register one bit to the left, filling a zero into the least significant bit.


Assume register file 7 holds 7308C618 (Hex), DB bus holds 54007530 (Hex), and MQ register holds 61A99A0E (Hex).
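A rough Python model of this example, assuming a 32-bit configuration. The parameter names `ssf` and `sio0_n` are hypothetical stand-ins for the logic levels on SSF and $\overline{\text{SIO0}}$, and the printed values are computed from the operands assumed above rather than copied from the manual.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed


def mq_shift_left_logical(mq, ssf=1, sio0_n=1):
    """One-bit logical left shift of a 32-bit MQ value.

    ssf    -- 1 (high or floating) loads the shifted value;
              0 leaves MQ unchanged.
    sio0_n -- level on the active-low SIO0 input: 1 fills a zero
              into the LSB, 0 forces the LSB to one.
    """
    if not ssf:
        return mq
    fill = 0 if sio0_n else 1
    return ((mq << 1) & MASK32) | fill  # the old MSB is dropped


rf7 = (0x7308C618 + 0x54007530 + 1) & MASK32   # R + S + Cn
mq = mq_shift_left_logical(0x61A99A0E)
print(f"{rf7:08X} {mq:08X}")  # C7093B49 C353341C
```

Holding `ssf` at 0 returns the MQ value unchanged, which mirrors the conditional-shift behaviour described for SSF.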


## MQSRA

## FUNCTION

Passes the result of the ALU instruction specified in the lower nibble of the instruction field to Y MUX. Performs an arithmetic right shift on MQ.

## DESCRIPTION

The result of the arithmetic or logical operation specified in the lower nibble of the instruction field (I3-I0) is passed unshifted to Y MUX.

The contents of the MQ register are shifted one bit to the right. The sign bit of the most significant byte is retained. Bit 0 of the least significant byte is dropped.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| None | Arithmetic Right |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User Programmable | Use |
| :---: | :---: | :---: |
| SSF | Yes | Passes shift result if high or floating; retains MQ without shift if low. |
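| $\overline{\text{SIO0}}$ | No | Outputs LSB of MQ shifter (inverted). |
| $\overline{\text{SIO1}}$ | No | Inactive in 32-bit configurations; used in other configurations to output LSBs from MQ shifter (inverted). |
| $\overline{\text{SIO2}}$ | No | Inactive in 32-bit configurations; used in other configurations to output LSBs from MQ shifter (inverted). |
| $\overline{\text{SIO3}}$ | No | Inactive in 32-bit configurations; used in other configurations to output LSBs from MQ shifter (inverted). |
| Cn | No | Affects arithmetic operation programmed in bits I3-I0 of instruction field. |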

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
\mathrm{~N} & =1 \text { if MSB of result }=1 \\
& =0 \text { if MSB of result }=0 \\
\text { OVR } & =1 \text { if signed arithmetic overflow } \\
C & =1 \text { if carry-out }=1
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Add data in register 1 to data in register 10 with carry-in and store the unshifted result in register 1. Shift the contents of the MQ register one bit to the right, retaining the sign bit.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10100001 | 000001 | 001010 | 000 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume register file 1 holds 5608C618 (Hex), register file 10 holds 14007530 (Hex), and MQ register holds 98A99A0E (Hex).

| Source | 01010110000010001100011000011000 | $R \leftarrow R F(1)$ |
| :---: | :---: | :---: |
| Source | 00010100000000000111010100110000 | $S \leftarrow R F(10)$ |
| Destination | 01101010000010010011101101001001 | $\mathrm{RF}(1) \leftarrow \mathrm{R}+\mathrm{S}+\mathrm{Cn}$ |
| Source | 10011000101010011001101000001110 | MQ shifter $\leftarrow$ MQ register |
| Destination | 11001100010101001100110100000111 | MQ register $\leftarrow M Q$ shifter |
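As a cross-check, the arithmetic right shift of MQ can be sketched in Python. This is a minimal model for a 32-bit configuration; `ssf` is a hypothetical parameter standing for the level on the SSF pin.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed
SIGN = 0x80000000


def mq_shift_right_arithmetic(mq, ssf=1):
    """One-bit arithmetic right shift: bit 31 is retained, bit 0 is dropped."""
    if not ssf:
        return mq                  # SSF low: MQ is not altered
    return (mq >> 1) | (mq & SIGN)


rf1 = (0x5608C618 + 0x14007530 + 1) & MASK32   # R + S + Cn
mq = mq_shift_right_arithmetic(0x98A99A0E)
print(f"{rf1:08X} {mq:08X}")  # 6A093B49 CC54CD07
```

A positive MQ value keeps a zero sign bit under the same function, so the shift behaves as a signed divide by two that rounds toward minus infinity.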

## FUNCTION

Passes the result of the ALU instruction specified in the lower nibble of the instruction field to Y MUX. Performs a right shift on MQ.

## DESCRIPTION

The result of the arithmetic or logical operation specified in the lower nibble of the instruction field (I3-I0) is passed unshifted to Y MUX.

The contents of the MQ register are shifted one bit to the right. A zero is placed in the sign bit of the most significant byte unless the $\overline{\mathrm{SIO}}$ input for that byte is set to zero; this will force the sign bit to 1. Bit 0 of the least significant byte is dropped.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| None | Logical Right |

## Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shift result if high or floating; retains MQ without shift if low. |
| $\overline{\text{SIO0}}$ | Yes | Fills a zero in LSB of MQ shifter if high or floating; sets LSB to one if low. |
| $\overline{\text{SIO1}}$ | No | Inactive in 32-bit configuration; used in other configurations to select end-fill in LSBs. |
| $\overline{\text{SIO2}}$ | No | Inactive in 32-bit configuration; used in other configurations to select end-fill in LSBs. |
| $\overline{\text{SIO3}}$ | No | Inactive in 32-bit configuration; used in other configurations to select end-fill in LSBs. |
| Cn | No | Affects arithmetic operation programmed in bits I3-I0 of instruction field. |



## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
   N = 1 if MSB of result = 1
     = 0 if MSB of result = 0
 OVR = 1 if signed arithmetic overflow
   C = 1 if carry-out = 1
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Add data in register 1 to data on the DB bus with carry-in and store the unshifted result in register 1. Shift the contents of the MQ register one bit to the right.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10110001 | 000001 | XX XXXX | 010 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume register file 1 holds 5608C618 (Hex), DB bus holds 14007530 (Hex), and MQ register holds 98A99A0E (Hex).

| Source | 01010110000010001100011000011000 | $R \leftarrow RF(1)$ |
| :---: | :---: | :---: |
| Source | 00010100000000000111010100110000 | $S \leftarrow$ DB bus |
| Destination | 01101010000010010011101101001001 | $RF(1) \leftarrow R + S + Cn$ |
| Source | 10011000101010011001101000001110 | MQ shifter $\leftarrow$ MQ register |
| Destination | 01001100010101001100110100000111 | MQ register $\leftarrow$ MQ shifter |
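The logical right shift differs from the arithmetic one only in what enters bit 31. A hedged Python sketch for a 32-bit configuration follows; `sio_n` is a hypothetical name for the level on the active-low end-fill input of the most significant byte.

```python
SIGN = 0x80000000  # bit 31 of a 32-bit word


def mq_shift_right_logical(mq, ssf=1, sio_n=1):
    """One-bit logical right shift of a 32-bit MQ value.

    sio_n -- 1 (high or floating) fills a zero into bit 31;
             0 forces bit 31 to one.
    """
    if not ssf:
        return mq                        # SSF low: MQ is not altered
    fill = 0 if sio_n else SIGN
    return (mq >> 1) | fill              # bit 0 is dropped


mq = mq_shift_right_logical(0x98A99A0E)
print(f"{mq:08X}")  # 4C54CD07
```

With `sio_n=0` the same input yields CC54CD07, the value an arithmetic right shift would produce for this negative operand.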

## FUNCTION

Evaluates the logical expression R NAND S.

## DESCRIPTION

Data on the R bus is NANDed with data on the $S$ bus. The result appears at the ALU and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |

Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

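| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| Cn | No | Inactive |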

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if } M S B=1 \\
\text { OVR } & =0 \\
C & =0
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Logically NAND the contents of register 3 and register 5, and store the result in register 5.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11111100 | 000011 | 000101 | 000 | 000101 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 3 holds 60F6D840 (Hex) and register file 5 holds 13F6D377 (Hex).

| Source | 01100000111101101101100001000000 | $R \leftarrow RF(3)$ |
| :---: | :---: | :---: |
| Source | 00010011111101101101001101110111 | $S \leftarrow RF(5)$ |
| Destination | 11111111000010010010111110111111 | $RF(5) \leftarrow R$ NAND $S$ |
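The NAND result and the two status bits that depend on it can be reproduced with a few lines of Python. This is only an illustrative sketch for a 32-bit word; `nand32` is not a name used by the manual.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed


def nand32(r, s):
    """Bitwise R NAND S, truncated to 32 bits."""
    return ~(r & s) & MASK32


f = nand32(0x60F6D840, 0x13F6D377)
zero = int(f == 0)   # ZERO = 1 if result = 0
n = f >> 31          # N = 1 if MSB = 1
print(f"{f:08X} ZERO={zero} N={n}")  # FF092FBF ZERO=0 N=1
```

OVR and C are fixed at 0 for this instruction according to the status list above, so they are not computed.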

## FUNCTION

Forces ALU output to zero.

## DESCRIPTION

This instruction forces the ALU output to zero. The BCD flip-flops retain their old value. Note that the clear instruction (CLR) forces the ALU output to zero and clears the BCD flip-flops.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
|  |  |  | No |

## Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| No | No | No |

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Status Signals

```
ZERO = 1
   N = 0
 OVR = 0
   C = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Clear register 12.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11111111 | XX XXXX | XX XXXX | X XX | 001100 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

| Destination | 00000000000000000000000000000000 | $RF(12) \leftarrow 0$ |
| :---: | :---: | :---: |

## FUNCTION

Evaluates the logical expression R NOR S.

## DESCRIPTION

Data on the R bus is NORed with data on the $S$ bus. The result appears at the ALU and MQ shifters.

*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |

Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

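| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| Cn | No | Inactive |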

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if MSB }=1 \\
\text { OVR } & =0 \\
C & =0
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Logically NOR the contents of register 3 and register 5, and store the result in register 5.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11111011 | 000011 | 000101 | 000 | 000101 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 3 holds 60F6D840 (Hex) and register file 5 holds 13F6D377 (Hex).

| Source | 01100000111101101101100001000000 | $R \leftarrow RF(3)$ |
| :---: | :---: | :---: |
| Source | 00010011111101101101001101110111 | $S \leftarrow RF(5)$ |
| Destination | 10001100000010010010010010001000 | $RF(5) \leftarrow R$ NOR $S$ |
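A brief Python check of the NOR result, written for a 32-bit word. It also confirms the De Morgan equivalence (NOT R) AND (NOT S), which is a general Boolean identity and not a statement about how the device computes the value; `nor32` is an illustrative name.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed


def nor32(r, s):
    """Bitwise R NOR S, truncated to 32 bits."""
    return ~(r | s) & MASK32


r, s = 0x60F6D840, 0x13F6D377
f = nor32(r, s)
de_morgan = (~r & ~s) & MASK32   # same value by De Morgan's law
print(f"{f:08X} {f == de_morgan}")  # 8C092488 True
```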

## FUNCTION

Evaluates the logical expression R OR S.

## DESCRIPTION

Data on the $R$ bus is ORed with data on the $S$ bus. The result appears at the ALU and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |

Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

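| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO1}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO2}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO3}}$ | No | Affects shift instruction programmed in bits I7-I4 of instruction field. |
| Cn | No | Inactive |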

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result =0
    N = 1 if MSB = 1
    OVR=0
    C=0
```

${ }^{\dagger} \mathrm{C}$ is ALU carry out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Logically OR the contents of register 5 and register 3, and store the result in register 3.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11111011 | 000101 | 000011 | 000 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 5 holds 60F6D840 (Hex) and register file 3 holds 13F6D377 (Hex).

| Source | 01100000111101101101100001000000 | $R \leftarrow RF(5)$ |
| :---: | :---: | :---: |
| Source | 00010011111101101101001101110111 | $S \leftarrow RF(3)$ |
| Destination | 01110011111101101101101101110111 | $RF(3) \leftarrow R$ OR $S$ |

## FUNCTION

Passes the result of the ALU instruction specified in the lower nibble of the instruction field to Y MUX.

## DESCRIPTION

The result of the arithmetic or logical operation specified in the lower nibble of the instruction field (I3-I0) is passed unshifted to Y MUX.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if MSB of result }=1 \\
& =0 \text { if MSB of result }=0 \\
\text { OVR } & =1 \text { if signed arithmetic overflow } \\
C & =1 \text { if carry-out condition }
\end{aligned}
$$

[^12]
## EXAMPLE (assumes a 32-bit configuration)

Add data in register 1 to data on the DB bus with carry-in and store the unshifted result in register 10.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | Dest Sel <br> $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | Dest Sel <br> SELRF1-SELRF0 | Dest Sel <br> $\overline{\text{OEA}}$ | Dest Sel <br> $\overline{\text{OEB}}$ | Dest Sel <br> $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | Dest Sel <br> $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110001 | 000001 | XX XXXX | 010 | 001010 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

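Assume register file 1 holds 9308C618 (Hex) and DB bus holds 24007530 (Hex).

| Source | 10010011000010001100011000011000 | $R \leftarrow RF(1)$ |
| :---: | :---: | :---: |
| Source | 00100100000000000111010100110000 | $S \leftarrow$ DB bus |
| Destination | 10110111000010010011101101001001 | $RF(10) \leftarrow R + S + Cn$ |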
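The four status outputs listed for this instruction can be illustrated for the ADD case with a small Python sketch. The overflow test used here (operands share a sign that differs from the sign of the result) is the usual two's-complement rule and is an assumption about how "signed arithmetic overflow" should be read; `add_status` is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed


def add_status(r, s, cn):
    """Return (F, flags) for F = R + S + Cn on 32-bit operands."""
    total = r + s + cn
    f = total & MASK32
    flags = {
        "ZERO": int(f == 0),
        "N": f >> 31,                               # MSB of result
        # overflow: R and S have equal signs, result sign differs
        "OVR": ((~(r ^ s) & (r ^ f)) >> 31) & 1,
        "C": total >> 32,                           # carry-out of bit 31
    }
    return f, flags


f, flags = add_status(0x9308C618, 0x24007530, 1)
print(f"{f:08X}", flags)
# B7093B49 {'ZERO': 0, 'N': 1, 'OVR': 0, 'C': 0}
```

Adding 1 to 7FFFFFFF with the same helper sets OVR and N, while FFFFFFFF plus a carry-in of 1 sets ZERO and C.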

## FUNCTION

Performs one of $\mathrm{N}-2$ iterations of nonrestoring signed division by a test subtraction of the N -bit divisor from the 2 N -bit dividend. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

SDIVI performs a test subtraction of the divisor from the dividend to generate a quotient bit. The test subtraction passes if the remainder is positive and fails if negative. If it fails, the remainder will be corrected during the next instruction.

SDIVI checks the pass/fail result of the test subtraction from the previous instruction, and evaluates

$$
\begin{array}{ll}
F \leftarrow R+S & \text { if the test fails } \\
F \leftarrow R^{\prime}+S+C n & \text { if the test passes }
\end{array}
$$

A double precision left shift is performed; bit 7 of the most significant byte of the MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte of the ALU shifter is lost. The unfixed quotient bit is circulated into the least significant bit of the MQ shifter.

The R bus must be loaded with the divisor, the $S$ bus with the most significant half of the result of the previous instruction (SDIVI during iteration or SDIVIS at the beginning of iteration). The least significant half of the previous result is in the MQ register. Carry-in should be programmed high. Overflow occurring during SDIVI is reported to OVR at the end of the signed divide routine (after SDIVQF).
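The conditional add or subtract and the double precision left shift can be sketched in Python as follows. This is a simplified model under stated assumptions, not the device's internal logic: the pass/fail result of the previous test subtraction and the unfixed quotient bit are generated inside the chip, so here they are supplied as the hypothetical parameters `test_passed` and `q_bit`.

```python
MASK32 = 0xFFFFFFFF  # 32-bit configuration assumed


def sdivi_step(r, s, mq, test_passed, q_bit, cn=1):
    """Model one iteration: compute F, then shift F:MQ left by one bit.

    r  -- divisor on the R bus
    s  -- most significant half of the previous result (S bus)
    mq -- least significant half of the previous result (MQ register)
    Returns (y, mq_next).
    """
    if test_passed:
        f = ((~r & MASK32) + s + cn) & MASK32   # F <- R' + S + Cn (S - R when Cn = 1)
    else:
        f = (r + s) & MASK32                    # F <- R + S
    # Double precision left shift: the MSB of MQ enters bit 0 of the ALU
    # shifter, the MSB of F is lost, and q_bit enters bit 0 of MQ.
    y = ((f << 1) & MASK32) | (mq >> 31)
    mq_next = ((mq << 1) & MASK32) | q_bit
    return y, mq_next


y, mq_next = sdivi_step(3, 10, 0x80000000, test_passed=True, q_bit=1)
print(f"{y:08X} {mq_next:08X}")  # 0000000F 00000001
```

With carry-in high, R' + S + Cn is the two's-complement subtraction S - R, which is why the description asks for Cn to be programmed high.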

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO1}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO2}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO3}}$ | No | Pass internally generated end-fill bits. |
| Cn | Yes | Should be programmed high. |

Status Signals

```
ZERO = 1 if intermediate result =0
    N = 0
    OVR = 0
    C = 1 if carry-out
```


## FUNCTION

Initializes 'ACT8832 for nonrestoring signed division by shifting the dividend left and internally preserving the sign bit. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

This instruction prepares for signed divide iteration operations by shifting the dividend and storing the sign for future use.

The preceding instruction should load the MQ register with the least significant half of the dividend. During SDIVIN, the S bus should be loaded with the most significant half of the dividend, and the R bus with the divisor. Y-output should be written back to the register file for use in the next instruction.

A double precision logical left shift is performed; bit 7 of the most significant byte of the MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte of the ALU shifter is lost. The unfixed quotient sign bit is shifted into the least significant bit of the MQ shifter.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO1}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO2}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO3}}$ | No | Pass internally generated end-fill bits. |
| Cn | No | Inactive |

Status Signals

```
ZERO = 1 if divisor = 0
   N = 0
 OVR = 0
   C = 0
```


## FUNCTION

Computes the first quotient bit of nonrestoring signed division. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

SDIVIS computes the first quotient bit during nonrestoring signed division by subtracting the divisor from the dividend, which was left-shifted during the prior SDIVIN instruction. The resulting remainder due to subtraction may be negative. If so, the subsequent SDIVI instruction will restore the remainder during the next subtraction.

The R bus must be loaded with the divisor and the $S$ bus with the most significant half of the remainder. The result on the Y bus should be loaded back into the register file for use in the next instruction. The least significant half of the remainder is in the MQ register. Carry-in should be programmed high.

A double precision left shift is performed; bit 7 of the most significant byte of the MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte of the ALU shifter is lost. The unfixed quotient bit is circulated into the least significant bit of the MQ shifter.

Overflow occurring during SDIVIS is reported to OVR at the end of the signed division routine (after SDIVQF).

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

Signed Divide Start

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO1}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO2}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO3}}$ | No | Pass internally generated end-fill bits. |
| Cn | Yes | Should be programmed high. |

Status Signals

```
ZERO = 1 if intermediate result = 0
   N = 0
 OVR = 0
   C = 1 if carry-out
```


## FUNCTION

Solves the final quotient bit during nonrestoring signed division. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

SDIVIT performs the final subtraction of the divisor from the remainder during nonrestoring signed division. SDIVIT is preceded by $\mathrm{N}-2$ iterations of SDIVI, where N is the number of bits in the dividend.

The R bus must be loaded with the divisor, and the $S$ bus must be loaded with the most significant half of the result of the last SDIVI instruction. The least significant half lies in the MQ register. The $Y$ bus result must be loaded back into the register file for use in the subsequent DIVRF instruction. Carry-in should be programmed high.

SDIVIT checks the pass/fail result of the previous instruction's test subtraction and evaluates

$$
\begin{array}{ll}
Y \leftarrow R+S & \text { if the test fails } \\
Y \leftarrow R^{\prime}+S+C n & \text { if the test passes }
\end{array}
$$

The contents of the MQ register are shifted one bit to the left; the unfixed quotient bit is circulated into the least significant bit.

Overflow during this instruction is reported to OVR at the end of the signed division routine (after SDIVQF).

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO1}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO2}}$ | No | Pass internally generated end-fill bits. |
| $\overline{\text{SIO3}}$ | No | Pass internally generated end-fill bits. |
| Cn | Yes | Should be programmed high. |

Status Signals

```
ZERO = 1 if intermediate result =0
    N = 0
    OVR = 0
    C = 1 if carry-out
```


## FUNCTION

Tests for overflow during nonrestoring signed division. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

This instruction performs an initial test subtraction of the divisor from the dividend. If overflow is detected, it is preserved internally and reported at the end of the divide routine (after SDIVQF). If overflow status is ignored, the SDIVO instruction may be omitted.

The divisor must be loaded onto the R bus; the most significant half of the previous SDIVIN result must be loaded onto the $S$ bus. The least significant half is in the MQ register.

The result on the Y bus should not be stored back into the register file; $\overline{\text{WE}}$ should be programmed high.

Carry-in should also be programmed high.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Should be programmed high |

Status Signals

| ZERO | $=1$ if divisor $=0$ |
| ---: | :--- |
| $N$ | $=0$ |
| OVR | $=0$ |
| $C$ | $=1$ if carry-out |

## FUNCTION

Tests the quotient result after nonrestoring signed division and corrects it if necessary. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

SDIVQF is the final instruction required to compute the quotient of a 2N-bit dividend by an N-bit divisor. It corrects the quotient if the signs of the divisor and dividend are different and the remainder is nonzero.

The fix is implemented by incrementing S:

$$
\begin{array}{ll}
Y \leftarrow S+1 & \text { if a fix is required } \\
Y \leftarrow S+0 & \text { if no fix is required }
\end{array}
$$

The R bus must be loaded with the divisor, and the $S$ bus with the most significant half of the result of the preceding DIVRF instruction. The least significant half is in the MQ register.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Should be programmed high |

Status Signals

```
ZERO = 1 if quotient = 0
    N = 1 if sign of quotient = 1
        = 0 if sign of quotient = 0
    OVR = 1 if divide overflow
    C = 1 if carry-out
```


## FUNCTION

Selects S if SSF is high; otherwise selects R.

## DESCRIPTION

Data on the $S$ bus is passed to Y if SSF is programmed high or floating; data on the $R$ bus is passed without carry to Y if SSF is programmed low.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Available S Bus Source Operands (MSH)

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> $(B 5-B 0)$ | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text { SSF }}$ | Yes | Selects S if high, R if low. |
| $\overline{\text { SIOO }}$ | No | Inactive |
| $\overline{\text { SIO1 }}$ | No | Inactive |
| $\overline{\text { SIO2 }}$ | No | Inactive |
| $\overline{\text { SIO3 }}$ | No | Inactive |
| Cn | No | Inactive |

## Status Signals

```
ZERO = 1 if result =0
    N = 1 if MSB = 1
OVR=0
    C=0
```


## EXAMPLE (assumes a 32-bit configuration)

Compare the two's complement numbers in registers 1 and 3 and store the larger in register 5.

1. Subtract (SUBS) data in register 3 from data in register 1 and pass the result to the Y bus.
2. Perform Select S/R instruction and pass result to register 5.

[This example assumes that SSF is set by the negative status (N) from the previous instruction.]

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11110011 | 000001 | 000011 | 000 | XXXXXX | 0 | XXXX | XX | X | X | 0000 | 0 | 1 | 110 |
| 00010000 | 000001 | 000011 | 000 | 000101 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |

Assume register file 1 holds 008497D0 (Hex) and register file 3 holds 01C35250 (Hex).

Instruction Cycle 1

Source $00000000100001001001011111010000 \quad R \leftarrow R F(1)$

Source $00000001110000110101001001010000 \quad \mathrm{~S} \leftarrow \mathrm{RF}(3)$

Destination $11111110110000010100010110000000 \quad \mathrm{Y}$ bus $\leftarrow \mathrm{R}+\mathrm{S}^{\prime}+\mathrm{Cn}$


## Instruction Cycle 2

Source $00000000100001001001011111010000 \quad R \leftarrow R F(1)$
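The two-cycle compare-and-select sequence above can be modeled in a few lines of Python. This is only an illustrative sketch: the function names `subs` and `sel` are hypothetical, the model covers a 32-bit configuration only, and it assumes the subtraction does not overflow, so that the N status alone indicates which operand is larger.

```python
MASK32 = 0xFFFFFFFF

def subs(r, s, cn=1):
    """Model of R + S' + Cn on 32 bits; returns (result, N status)."""
    f = (r + (~s & MASK32) + cn) & MASK32
    return f, f >> 31

def sel(r, s, ssf):
    """Pass S to Y when SSF is high, otherwise pass R."""
    return s if ssf else r

r, s = 0x008497D0, 0x01C35250   # register files 1 and 3 from the example
diff, n = subs(r, s)            # cycle 1: subtract and capture N
larger = sel(r, s, ssf=n)       # cycle 2: N drives SSF
print(f"{diff:08X} N={n} larger={larger:08X}")
```

With the example operands, the difference is negative, so N is 1 and the S operand (register 3) is selected.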


## FUNCTION

Resets bits in selected bytes of S-bus data using mask in C3-C0::A3-A0.

## DESCRIPTION

The register addressed by B5-B0 is both the source and destination for this instruction. The source word is passed on the $S$ bus to the ALU, where it is compared to an 8-bit mask consisting of a concatenation of the C3-C0 and A3-A0 address ports (C3-C0::A3-A0). The mask is input via the $R$ bus. All bits in the source word that are in the same bit position as ones in the mask are reset. Bytes with their $\overline{\text{SIO}}$ inputs programmed low perform the Reset Bit instruction. Bytes with their $\overline{\text{SIO}}$ inputs programmed high or floating pass $S$ unaltered.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | Yes |

Available S Bus Source Operands (MSH)

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| No | Yes | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Byte-select |
| $\overline{\text{SIO1}}$ | No | Byte-select |
| $\overline{\text{SIO2}}$ | No | Byte-select |
| $\overline{\text{SIO3}}$ | No | Byte-select |
| Cn | No | Inactive |

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
    OVR = 0
    C = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Set bits 3-0 of bytes 1 and 2 of register file 8 to zero and store the result back in register 8.

| Instr <br> Code <br> I7-I0 | Mask <br> (LSH) <br> A3-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Mask <br> (MSH) <br> C3-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00011000 | 1111 | 001000 | X00 | 0000 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 | 1001 | 0000 |

Assume register file 8 holds A083BEBE (Hex).

Source $00001111000011110000111100001111 \quad \mathrm{Rn} \leftarrow$ C3-C0::A3-A0

Source $10100000100000111011111010111110 \quad \mathrm{Sn} \leftarrow \mathrm{RF}(8)\mathrm{n}$

ALU $10100000100000001011000010111110 \quad \mathrm{Fn} \leftarrow \mathrm{Sn}$ AND $\mathrm{Rn}^{\prime}$

Destination $10100000100000001011000010111110 \quad \mathrm{RF}(8)\mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

${ }^{\dagger} \mathrm{F}=$ ALU result; $\mathrm{n}=$ nth byte.
Register file 8 gets F if byte selected, S if byte not selected.
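The byte-selective reset can be sketched in Python as follows. The function name `reset_bits` and the encoding of the four $\overline{\text{SIO}}$ inputs as one 4-bit integer (bit n for byte n, 0 meaning "programmed low, byte selected") are assumptions made for illustration only.

```python
def reset_bits(s, mask, sio, nbytes=4):
    """Clear the mask bits in every byte whose SIO input is low."""
    out = 0
    for n in range(nbytes):
        byte = (s >> (8 * n)) & 0xFF
        selected = ((sio >> n) & 1) == 0      # SIO low selects the byte
        if selected:
            byte &= ~mask & 0xFF              # Sn AND NOT mask
        out |= byte << (8 * n)                # unselected bytes pass S unaltered
    return out

# Example values from the text: mask 0000::1111, SIO3-SIO0 = 1001
result = reset_bits(0xA083BEBE, 0x0F, 0b1001)
print(f"{result:08X}")
```

Only bytes 1 and 2 change; bytes 0 and 3 are written back unaltered, which matches the destination word in the example.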

## FUNCTION

Sets bits in selected bytes of S-bus data using mask in C3-C0::A3-A0.

## DESCRIPTION

The register addressed by B5-B0 is both the source and destination for this instruction. The source word is passed on the $S$ bus to the ALU, where it is compared to an 8-bit mask consisting of a concatenation of the C3-C0 and A3-A0 address ports (C3-C0::A3-A0). The mask is input via the R bus. All bits in the source word that are in the same bit position as ones in the mask are forced to a logical one. Bytes with their $\overline{\text{SIO}}$ inputs programmed low perform the Set Bit instruction. Bytes with their $\overline{\text{SIO}}$ inputs programmed high or floating pass $S$ unaltered.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | Yes |

Available S Bus Source Operands (MSH)

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| No | Yes | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | Yes | Byte-select |
| $\overline{\text{SIO1}}$ | No | Byte-select |
| $\overline{\text{SIO2}}$ | No | Byte-select |
| $\overline{\text{SIO3}}$ | No | Byte-select |
| Cn | No | Inactive |

## Status Signals

```
ZERO = 1 if result (selected bytes) = 0
    N = 0
    OVR = 0
    C = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Set bits 3-0 of byte 1 of register file 1 to one and store the result back in register 1.

| Instr <br> Code <br> I7-I0 | Mask <br> (LSH) <br> A3-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Mask <br> (MSH) <br> C3-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00001000 | 1111 | 000001 | X00 | 0000 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 | 1101 | 0000 |

Assume register file 1 holds A083BEBE (Hex).

Source $00001111000011110000111100001111 \quad \mathrm{Rn} \leftarrow$ C3-C0::A3-A0

Source $10100000100000111011111010111110 \quad \mathrm{Sn} \leftarrow \mathrm{RF}(1)\mathrm{n}$

ALU $10100000100000111011111110111110 \quad \mathrm{Fn} \leftarrow \mathrm{Sn}$ OR $\mathrm{Rn}$

Destination $10100000100000111011111110111110 \quad \mathrm{RF}(1)\mathrm{n} \leftarrow \mathrm{Fn}$ or $\mathrm{Sn}^{\dagger}$

${ }^{\dagger} \mathrm{F}=$ ALU result; $\mathrm{n}=$ nth byte.
Register file 1 gets F if byte selected, S if byte not selected.
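A minimal Python model of the byte-selective set operation is shown below. As before, the name `set_bits` and the packing of $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ into a 4-bit integer (0 selects the byte) are illustrative assumptions, not part of the device definition.

```python
def set_bits(s, mask, sio, nbytes=4):
    """Force the mask bits to one in every byte whose SIO input is low."""
    out = 0
    for n in range(nbytes):
        byte = (s >> (8 * n)) & 0xFF
        if ((sio >> n) & 1) == 0:     # SIO low: byte performs Set Bit
            byte |= mask & 0xFF       # Sn OR mask
        out |= byte << (8 * n)        # SIO high: byte passes S unaltered
    return out

# Example values from the text: mask 0000::1111, SIO3-SIO0 = 1101
result = set_bits(0xA083BEBE, 0x0F, 0b1101)
print(f"{result:08X}")
```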

## FUNCTION

Performs arithmetic left shift on result of ALU operation specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit to the left. A zero is filled into bit 0 of the least significant byte of each word unless the $\overline{\mathrm{SIO}}$ input is programmed low; this will force bit 0 to one. Bit 7 is dropped from the most significant byte in each word, which may be 1, 2, or 4 bytes long, depending on the configuration selected.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the MQ register. If SSF is low, the MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | Yes | Passes shift result if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | Yes | Fills a zero in LSB of each word if high; fills a one in LSB if low. |
| $\overline{\text{SIO1}}$ | Yes |  |
| $\overline{\text{SIO2}}$ | Yes |  |
| $\overline{\text{SIO3}}$ | Yes |  |
| Cn | No | Affects arithmetic operation programmed in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
      = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow or if MSB XOR MSB-1 = 1 before shift
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform the computation $A=2(A+B)$, where $A$ and $B$ are single-precision, two's complement numbers. Let $A$ be stored in register 1 and $B$ be input via the DB bus.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ | $\overline{\text{SSF}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01000001 | 000001 | XXXXXX | 010 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1110 | 0000 | 1 |

Assume register file 1 holds 1308C618 (Hex), DB bus holds 44007530 (Hex).

Source $00010011000010001100011000011000 \quad R \leftarrow R F(1)$

Source $01000100000000000111010100110000 \quad \mathrm{~S} \leftarrow \mathrm{DB}$ bus

Intermediate Result $01010111000010010011101101001000 \quad$ ALU Shifter $\leftarrow \mathrm{R}+\mathrm{S}+\mathrm{Cn}$

Destination $10101110000100100111011010010001 \quad \mathrm{RF}(1) \leftarrow \mathrm{ALU}$ shift result
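The add-then-shift behavior of this example can be checked with a short Python sketch. The helper name `sla` and its `sio0_low` flag are hypothetical; the model assumes a 32-bit configuration and ignores status generation.

```python
MASK32 = 0xFFFFFFFF

def sla(f, sio0_low=False):
    """Arithmetic left shift of the ALU result by one bit.

    Bit 31 is dropped; bit 0 is filled with 0, or with 1 when the
    SIO input for the word is programmed low.
    """
    return ((f << 1) | int(sio0_low)) & MASK32

rf1, db, cn = 0x1308C618, 0x44007530, 0
f = (rf1 + db + cn) & MASK32          # intermediate result R + S + Cn
y = sla(f, sio0_low=True)             # SIO3-SIO0 = 1110 in the example
print(f"{f:08X} -> {y:08X}")
```

Because $\overline{\text{SIO0}}$ is low in the microcode word, the destination ends in a one rather than a zero.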

## FUNCTION

Performs arithmetic left shift on MQ register (LSH) and result of ALU operation (MSH) specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is used as the upper half of a double-precision word, the contents of the MQ register as the lower half.

The contents of the MQ register are shifted one bit to the left. A zero is filled into bit 0 of the least significant byte of each word unless the $\overline{\mathrm{SIO}}$ input for the word is set to zero; this will force bit 0 to one. Bit 7 of the most significant byte in the MQ shifter is passed to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte in the ALU shifter is dropped.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX and MQ register. If SSF is low, the ALU output and MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Arithmetic Left | Arithmetic Left |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | Yes | Passes shift result if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | Yes | Fills a zero in LSB of each word if high; fills a one in LSB if low. |
| $\overline{\text{SIO1}}$ | Yes |  |
| $\overline{\text{SIO2}}$ | Yes |  |
| $\overline{\text{SIO3}}$ | Yes |  |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
      = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow or if MSB XOR MSB-1 = 1 before shift
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform the computation $A=2(A+B)$, where $A$ and $B$ are two's complement numbers. Let $A$ be a double precision number residing in register 1 (MSH) and the MQ register (LSH). Let $B$ be a single precision number which is input through the DB bus.

Assume register file 1 holds 2408C618 (Hex), DB bus holds 26007530 (Hex), and MQ register holds 50A99A0E (Hex).

## MSH

Source $00100100000010001100011000011000 \quad R \leftarrow R F(1)$

Source $00100110000000000111010100110000 \quad S \leftarrow D B$ bus

Intermediate Result $01001010000010010011101101001000 \quad$ ALU Shifter $\leftarrow R+S+Cn$

Destination $10010100000100100111011010010000 \quad \mathrm{RF}(1) \leftarrow \mathrm{ALU}$ shift result

## LSH

Source 01010000101010011001101000001110 MQ shifter $\leftarrow$ MQ register

Destination 10100001010100110011010000011101 MQ register $\leftarrow$ MQ shift result
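The double-precision shift can be modeled by treating the ALU result and the MQ register as one 64-bit quantity. In the sketch below, `slad` and `fill_one` are hypothetical names, and the fill value of one is inferred from the MQ destination word in the example (which ends in a one); the microcode table for this example is not reproduced in the text.

```python
MASK32 = 0xFFFFFFFF

def slad(alu, mq, fill_one=False):
    """Double-precision arithmetic left shift of ALU result (MSH) and MQ (LSH)."""
    new_mq = ((mq << 1) | int(fill_one)) & MASK32     # end-fill into MQ bit 0
    new_alu = ((alu << 1) | (mq >> 31)) & MASK32      # MQ bit 31 -> ALU bit 0
    return new_alu, new_mq

rf1, db, mq, cn = 0x2408C618, 0x26007530, 0x50A99A0E, 0
f = (rf1 + db + cn) & MASK32                          # intermediate result
msh, lsh = slad(f, mq, fill_one=True)
print(f"{msh:08X} {lsh:08X}")
```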

## FUNCTION

Performs circular left shift on result of ALU operation specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is rotated one bit to the left. Bit 7 of the most significant byte in each word is passed to bit 0 of the least significant byte in the word, which may be 1, 2, or 4 bytes long.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to Y MUX. If SSF is low, $F$ is passed unaltered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Circular Left | None |

## Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | Yes | Passes shift result if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | No | Bit 7 of ALU result |
| $\overline{\text{SIO1}}$ | No | Bit 15 of ALU result |
| $\overline{\text{SIO2}}$ | No | Bit 23 of ALU result |
| $\overline{\text{SIO3}}$ | No | Bit 31 of ALU result |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
      = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform a circular left shift of register 6 and store the result in register 1.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SSF}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01100110 | 000110 | XXXXXX | 000 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1 |

Assume register file 6 holds 3788C618 (Hex).

| Source | 00110111100010001100011000011000 | $R \leftarrow R F(6)$ |
| ---: | ---: | ---: |
| Intermediate |  |  |
| Result | 00110111100010001100011000011000 | ALU Shifter $\leftarrow \mathrm{R}+\mathrm{Cn}$ |
| Destination | 01101111000100011000110000110000 | $R F(1) \leftarrow$ ALU shifter result |
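A one-bit rotate on a 32-bit word can be sketched in Python as shown here. The name `slc` is illustrative, and the model covers only the 4-byte word length.

```python
MASK32 = 0xFFFFFFFF

def slc(f):
    """Circular left shift: bit 31 of the ALU result wraps around to bit 0."""
    return ((f << 1) | (f >> 31)) & MASK32

rf6, cn = 0x3788C618, 0
f = (rf6 + cn) & MASK32     # ALU passes R + Cn to the shifter
y = slc(f)
print(f"{y:08X}")
```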

## SLCD

## FUNCTION

Performs circular left shift on MQ register (LSH) and result of ALU operation specified in lower nibble of instruction field (MSH).

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is used as the upper half of a double-precision word, the contents of the MQ register as the lower half.

The contents of the MQ and ALU shifters are rotated one bit to the left. Bit 7 of the most significant byte in the MQ shifter is passed to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte in the ALU shifter is passed to bit 0 of the least significant byte in the MQ shifter.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to $Y$ MUX. If SSF is low, $F$ is passed unaltered and the MQ register is not changed.
*A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Circular Left | Circular Left |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | Yes | Passes shift result if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | No | Bit 7 of ALU result |
| $\overline{\text{SIO1}}$ | No | Bit 15 of ALU result |
| $\overline{\text{SIO2}}$ | No | Bit 23 of ALU result |
| $\overline{\text{SIO3}}$ | No | Bit 31 of ALU result |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
        = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform a circular left double precision shift of data in register 6 (MSH) and MQ (LSH), and store the result back in register 6 and the MQ register.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SSF}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01110110 | 000110 | XXXXXX | 000 | 000110 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1 |

Assume register file 6 holds 3708C618 (Hex) and MQ register holds 50A99A0E (Hex).
MSH


## LSH

Source $01010000101010011001101000001110 \quad$ MQ shifter $\leftarrow$ MQ register

Destination $10100001010100110011010000011100 \quad$ MQ register $\leftarrow$ MQ shift result
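The 64-bit rotate through the ALU and MQ shifters can be sketched as follows. The name `slcd` is hypothetical. Note that the MSH result produced by the sketch is computed by this model and is not a value quoted from the example, whose MSH lines are missing from the text; only the LSH result can be compared against it.

```python
MASK32 = 0xFFFFFFFF

def slcd(alu, mq):
    """Double-precision circular left shift of ALU result (MSH) and MQ (LSH)."""
    new_alu = ((alu << 1) | (mq >> 31)) & MASK32   # MQ bit 31 -> ALU bit 0
    new_mq = ((mq << 1) | (alu >> 31)) & MASK32    # ALU bit 31 -> MQ bit 0
    return new_alu, new_mq

rf6, mq, cn = 0x3708C618, 0x50A99A0E, 0
f = (rf6 + cn) & MASK32
msh, lsh = slcd(f, mq)
print(f"{msh:08X} {lsh:08X}")
```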

## FUNCTION

Converts data on the $S$ bus from sign magnitude to two's complement or vice versa.

## DESCRIPTION

The $S$ bus provides the source word for this instruction. The number is converted by inverting $S$ and adding the result to the carry-in, which should be programmed high for proper conversion; the sign bit of the result is then inverted. An error condition will occur if the source word is a negative zero (negative sign and zero magnitude). In this case, SMTC generates a positive zero, and the OVR pin is set high to reflect an illegal conversion.

The sign bit of the selected operand in the most significant byte is tested; if it is high, the converted number is passed to the destination. Otherwise the operand is passed unaltered.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

## Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | None |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{S I O O}$ | No | Inactive |
| $\overline{S I O 1}$ | No | Inactive |
| $\overline{S I O 2}$ | No | Inactive |
| $\overline{S I O 3}$ | No | Inactive |
| Cn | Yes | Should be programmed high for proper conversion |

## Status Signals

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
OVR = 1 if input of most significant byte is 80(Hex) and results in all other
        bytes are 00 (Hex).
    C=1 if S = 0
```


## EXAMPLES (assumes a 32-bit configuration)

Convert the two's complement number in register 1 to sign magnitude representation and store the result in register 4.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01011000 | XXXXXX | 000001 | X00 | 000100 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Example 1: Assume register file 1 holds C3F6D840 (Hex).

Source $11000011111101101101100001000000 \quad S \leftarrow R F(1)$

Destination $10111100000010010010011111000000 \quad R F(4) \leftarrow S^{\prime}+\mathrm{Cn}$

Example 2: Assume register file 1 holds 550927C0 (Hex).

Source $01010101000010010010011111000000 \quad S \leftarrow R F(1)$

Destination $01010101000010010010011111000000 \quad R F(4) \leftarrow S$
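Both examples, and the negative-zero case from the description, can be reproduced with a short Python model. The function name `smtc` is illustrative, and the overflow test is simplified to "input is exactly a sign bit with zero magnitude", which is an assumption consistent with, but not copied from, the status definition above.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000

def smtc(s, cn=1):
    """Sign magnitude <-> two's complement conversion; returns (Y, OVR)."""
    if not s & SIGN32:
        return s, 0                      # positive operand passes unaltered
    y = ((~s & MASK32) + cn) & MASK32    # invert S and add carry-in
    y ^= SIGN32                          # then invert the sign bit
    ovr = 1 if s == SIGN32 else 0        # negative zero is an illegal input
    return y, ovr

for word in (0xC3F6D840, 0x550927C0, 0x80000000):
    y, ovr = smtc(word)
    print(f"{word:08X} -> {y:08X} OVR={ovr}")
```

Applying the function twice to a nonzero negative value returns the original word, which reflects that the same instruction serves both conversion directions.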

## FUNCTION

Computes one of $\mathrm{N}-1$ signed or N mixed multiplication iterations for computing an N -bit by N -bit product. Algorithms for signed and mixed multiplication using this instruction are given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

SMULI checks the present multiplier bit (the least significant bit of the MQ register) to determine whether the multiplicand should be added to the present partial product. The instruction evaluates:

$$
\begin{array}{ll}
\mathrm{F} \leftarrow \mathrm{R}+\mathrm{S}+\mathrm{Cn} & \text { if the addition is required } \\
\mathrm{F} \leftarrow \mathrm{~S} & \text { if no addition is required }
\end{array}
$$

A double precision right shift is performed. Bit 0 of the least significant byte of the ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter; carry-out is passed to the most significant bit of the ALU shifter.

The $S$ bus should be loaded with the contents of an accumulator and the $R$ bus with the multiplicand. The Y bus result should be written back to the accumulator after each iteration of SMULI. The accumulator should be cleared and the MQ register loaded with the multiplier before the first iteration.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |

Shift Operations

| ALU | MQ |
| :---: | :---: |
| Right | Right |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Passes LSB from ALU shifter to MSB of MQ shifter. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Should be programmed low |

Status Signals

| ZERO | $=1$ if result $=0$ |
| ---: | :--- |
| N | $=1$ if MSB $=1$ |
| OVR | $=0$ |
| C | $=1$ if carry-out |

## FUNCTION

Performs the final iteration for computing an N-bit by N-bit signed product. An algorithm for signed multiplication using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

SMULT checks the present multiplier bit (the least significant bit of the MQ register) to determine whether the multiplicand should be added with the present partial product. The instruction evaluates:

$$
\begin{array}{ll}
\mathrm{F} \leftarrow \mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn} & \text { if the addition is required } \\
\mathrm{F} \leftarrow \mathrm{~S} & \text { if no addition is required }
\end{array}
$$

with the correct sign in the product.
A double precision right shift is performed. Bit 0 of the least significant byte of the ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter.

The $S$ bus should be loaded with the contents of a register file holding the previous iteration result; the R bus must be loaded with the multiplicand. After executing SMULT, the $Y$ bus contains the most significant half of the product, and MQ contains the least significant half.
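The following Python sketch illustrates the general shift-and-add scheme that N-1 SMULI iterations followed by one SMULT step describe: conditional add for the low multiplier bits, conditional subtract for the sign bit, and a double-precision right shift after every step. It is a behavioral model under stated assumptions, not a description of the device internals. In particular, the names `to_signed` and `signed_multiply` are hypothetical, and the bit shifted into the top of the accumulator is taken from the exact signed sum rather than from any specific status pin.

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def signed_multiply(multiplicand, multiplier, n=32):
    mask = (1 << n) - 1
    r = to_signed(multiplicand, n)
    acc = 0                      # accumulator, cleared before the first step
    mq = multiplier & mask       # MQ holds the multiplier
    for i in range(n):
        s = to_signed(acc, n)
        if mq & 1:               # present multiplier bit is MQ bit 0
            # Steps 0..n-2 add (SMULI); the final step subtracts (SMULT)
            f = s - r if i == n - 1 else s + r
        else:
            f = s
        # Double-precision arithmetic right shift of (f, mq)
        acc = (f >> 1) & mask
        mq = (mq >> 1) | ((f & 1) << (n - 1))
    return to_signed((acc << n) | mq, 2 * n)

print(signed_multiply(-7, 6), signed_multiply(12345, -6789))
```

The subtraction in the last step accounts for the negative weight of the multiplier's sign bit in two's complement representation.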

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

Available Destination Operands
Shift Operations

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | No |


| ALU | MQ |
| :---: | :---: |
| Right | Right |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Passes LSB from ALU shifter to MSB of MQ shifter. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Should be programmed low |

Status Signals

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
    OVR=0
    C = 1 if carry-out
```


## FUNCTION

Tests the two most significant bits of the MQ register. If they are the same, shifts the number to the left.

## DESCRIPTION

This instruction is used to normalize a two's complement number in the MQ register by shifting the number one bit position to the left and filling a zero into the LSB (unless the $\overline{\mathrm{SIO}}$ input for that word is low). Data on the $S$ bus is added to the carry, permitting the number of shifts performed to be counted and stored in one of the register files.

The shift and the $S$ bus increment are inhibited whenever normalization is attempted on a number already normalized. Normalization is complete when overflow occurs.

Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | No |

Available S Bus Source Operands (Count)

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | No | No |

Available Destination Operands
(Count)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

Shift Operations
(Conditional)

| ALU | MQ |
| :---: | :---: |
| No | Left |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Passes internally generated end-fill bit. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Increments S bus (shift count) if set to one. |

## Status Signals

```
ZERO = 1 if result = 0
    N = 1 if MSB of MQ register = 1
    OVR = 1 if MSB of MQ register XOR 2nd MSB = 1
    C = 1 if carry-out = 1
```


## EXAMPLE (assumes a 32-bit configuration)

Normalize the number in the MQ register, storing the number of shifts in register 3.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00100000 | XX XXXX | 000011 | X 00 | 000011 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume register file 3 holds 00000003 (Hex) and MQ register holds 3699D84E (Hex).

## Operand



## Count

$$
\begin{aligned}
& \text { Source } 00000000000000000000000000000011 \quad S \leftarrow R F(3) \\
& \text { Destination } 00000000000000000000000000000100 \quad \mathrm{RF}(3) \leftarrow \mathrm{S}+\mathrm{Cn}
\end{aligned}
$$
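The normalize step described above can be sketched in software. The following is a minimal model, not a description of the device internals: the function name `normalize_step` is hypothetical, the end-fill bit is assumed to be zero, and OVR is assumed to reflect the MQ value before any shift.

```python
def normalize_step(mq, count, cn=1, width=32):
    """Model one normalize step on a `width`-bit MQ value.

    Returns (new_mq, new_count, ovr). Hypothetical helper for illustration.
    """
    mask = (1 << width) - 1
    msb = (mq >> (width - 1)) & 1
    second = (mq >> (width - 2)) & 1
    ovr = msb ^ second  # 1 when the two top bits differ (already normalized)
    if ovr:
        # Shift and count increment are both inhibited.
        return mq & mask, count & mask, 1
    new_mq = (mq << 1) & mask  # zero is filled into the LSB (assumed)
    return new_mq, (count + cn) & mask, 0


# Values taken from the example: RF(3) = 00000003, MQ = 3699D84E.
mq, count, ovr = normalize_step(0x3699D84E, 0x00000003)
print(f"{mq:08X} {count:08X} {ovr}")  # 6D33B09C 00000004 0
```

Calling the function again on the shifted value returns it unchanged with `ovr` equal to 1, which corresponds to the statement that normalization is complete when overflow occurs.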

## FUNCTION

Performs arithmetic right shift on result of ALU operation specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit to the right. The sign bit of the most significant byte is retained unless it is inverted as a result of overflow. Bit 0 of the least significant byte is dropped.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to the Y MUX.
*A list of ALU operations that can be used with this instruction is given in Table 15.
Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Arithmetic Right | None |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shifted output if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | No | LSB is shifted out from each word, which may be 1, 2, or 4 bytes long depending on selected configuration. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
\mathrm{~N} & =1 \text { if MSB of result }=1 \\
& =0 \text { if } M S B \text { of result }=0 \\
\text { OVR } & =0 \\
C & =1 \text { if carry-out condition }
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform the computation $A=(A+B) / 2$, where $A$ and $B$ are single-precision numbers.
Let $A$ reside in register 1 and $B$ be input via the $D B$ bus.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | SSF |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00000001 | 000001 | XX XXXX | 0 10 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1 |

Assume register file 1 holds 6A08C618 (Hex) and DB bus holds 51007530 (Hex).

| Source | 01101010000010001100011000011000 | $\mathrm{R} \leftarrow \mathrm{RF}(1)$ |
| :---: | :---: | :---: |
| Source | 01010001000000000111010100110000 | $\mathrm{S} \leftarrow \mathrm{DB}$ bus |
| Intermediate Result ${ }^{\ddagger}$ | 10111011000010010011101101001000 | ALU Shifter $\leftarrow \mathrm{R}+\mathrm{S}+\mathrm{Cn}$ |
| Destination | 01011101100001001001110110100100 | $\mathrm{RF}(1) \leftarrow$ ALU shift result |

${ }^{\ddagger}$ After the intermediate operation (ADD), overflow has occurred and the OVR status signal is set high. When the arithmetic right shift is executed, the sign bit is corrected (see Table 16 for shift definition notes).
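The add followed by a sign-corrected arithmetic right shift can be checked with a short model. This is a sketch under stated assumptions: the function name `add_shift_arithmetic_right` is hypothetical, and the sign correction is modeled as inverting the result's sign bit whenever signed overflow occurred in the add.

```python
def add_shift_arithmetic_right(r, s, cn=0, ssf=1, width=32):
    """Model R + S + Cn followed by a one-bit arithmetic right shift.

    Hypothetical helper; returns the value passed to the Y MUX.
    """
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    f = (r + s + cn) & mask
    # Signed overflow: both operands share a sign that the sum does not.
    ovr = ((r ^ f) & (s ^ f) & sign_bit) != 0
    if not ssf:
        return f  # SSF low: ALU result passes unshifted
    sign = (f >> (width - 1)) & 1
    if ovr:
        sign ^= 1  # restore the sign that overflow inverted
    return (sign << (width - 1)) | (f >> 1)


result = add_shift_arithmetic_right(0x6A08C618, 0x51007530)
print(f"{result:08X}")  # 5D849DA4
```

Because the corrected sign is shifted in, the result equals (A + B) / 2 even though the 32-bit sum itself overflowed.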

## FUNCTION

Performs arithmetic right shift on MQ register (LSH) and result of ALU operation (MSH) specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is used as the upper half of a double precision word, the contents of the MQ register as the lower half.

The contents of the ALU are shifted one bit to the right. The sign bit of the most significant byte is retained unless the sign bit is inverted as a result of overflow. Bit 0 of the least significant byte in the ALU shifter is passed to bit 7 of the most significant byte of the MQ register. Bit 0 of the MQ register's least significant byte is dropped.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to the Y MUX.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Arithmetic Right | Arithmetic Right |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shifted output if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | No | LSB of ALU shifter is passed to MSB of MQ shifter, and LSB of MQ shifter is dropped. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if MSB of result }=1 \\
& =0 \text { if MSB of result }=0 \\
\text { OVR } & =0 \\
C & =1 \text { if carry-out condition }
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform the computation $A=(A+B) / 2$, where $A$ and $B$ are two's complement numbers. Let $A$ be a double precision number residing in register 1 (MSH) and MQ (LSH). Let $B$ be a single precision number which is input through the DB bus.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | SSF |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00010001 | 000001 | XX XXXX | 0 10 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1 |

Assume register file 1 holds 4A08C618 (Hex), DB bus holds 51007530 (Hex), and MQ register holds 17299A0F (Hex).

MSH
Source $01001010000010001100011000011000 \quad R \leftarrow R F(1)$
Source $01010001000000000111010100110000 \quad S \leftarrow D B$ bus


## LSH

Source 00010111001010011001101000001111 MQ shifter $\leftarrow$ MQ register

Destination 00001011100101001100110100000111 MQ register $\leftarrow$ MQ shift result
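The double precision form can be modeled the same way, with the LSB of the ALU result passed into the MSB of MQ. The sketch below uses a hypothetical function name, `add_shift_arithmetic_right_double`, and assumes the same overflow-based sign correction as the single precision shift.

```python
def add_shift_arithmetic_right_double(r, s, mq, cn=0, width=32):
    """Model R + S + Cn, then a one-bit right shift of the ALU:MQ pair.

    Hypothetical helper; returns (alu_shift_result, mq_shift_result).
    """
    mask = (1 << width) - 1
    top = width - 1
    f = (r + s + cn) & mask
    ovr = ((r ^ f) & (s ^ f) & (1 << top)) != 0
    sign = ((f >> top) & 1) ^ (1 if ovr else 0)  # corrected sign bit
    alu = (sign << top) | (f >> 1)
    # LSB of the ALU result becomes the MSB of MQ; the MQ LSB is dropped.
    new_mq = ((f & 1) << top) | ((mq & mask) >> 1)
    return alu, new_mq


alu, mq = add_shift_arithmetic_right_double(0x4A08C618, 0x51007530, 0x17299A0F)
print(f"{alu:08X} {mq:08X}")  # 4D849DA4 0B94CD07
```

The MQ value printed here matches the LSH destination pattern shown above.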

[^13]
## FUNCTION

Performs circular right shift on result of ALU operation specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit to the right. Bit 0 of the least significant byte is passed to bit 7 of the most significant byte in the same word, which may be 1, 2, or 4 bytes long depending on the selected configuration.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to the Y MUX.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Circular Right | None |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shift result if high; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | No | Rotates LSB to MSB of the same word, which may be 1, 2, or 4 bytes long depending on configuration. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
      = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform a circular right shift of register 6 and store the result in register 1.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | SSF |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10000110 | 000110 | XX XXXX | 0 XX | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1 |

Assume register file 6 holds 3788C618 (Hex).

| Source | 00110111100010001100011000011000 | $\mathrm{R} \leftarrow \mathrm{RF}(6)$ |
| :---: | :---: | :---: |
| Intermediate Result | 00110111100010001100011000011000 | ALU Shifter $\leftarrow \mathrm{R}+\mathrm{Cn}$ |
| Destination | 00011011110001000110001100001100 | $\mathrm{RF}(1) \leftarrow$ ALU shift result |
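A one-bit circular right shift is straightforward to model for any of the word lengths mentioned above. The helper name `rotate_right` is hypothetical; `width` is the word length in bits (8, 16, or 32 for 1-, 2-, or 4-byte words).

```python
def rotate_right(value, width=32):
    """Rotate a `width`-bit word right by one bit (LSB moves to MSB)."""
    mask = (1 << width) - 1
    value &= mask
    return (value >> 1) | ((value & 1) << (width - 1))


print(f"{rotate_right(0x3788C618):08X}")  # 1BC4630C
print(f"{rotate_right(0x0001, width=16):04X}")  # 8000
```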

## FUNCTION

Performs circular right shift on MQ register (LSH) and result of ALU operation (MSH) specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is used as the upper half of a double precision word, the contents of the MQ register as the lower half.

The contents of the ALU and MQ shifters are rotated one bit to the right. Bit 0 of the least significant byte in the ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter. Bit 0 of the least significant byte of the MQ shifter is passed to bit 7 of the most significant byte of the ALU shifter.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX and MQ register. If SSF is low, the Y MUX and MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Circular Right | Circular Right |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shift result if high; passes ALU result and retains MQ register if low. |
| $\overline{\text{SIO0}}$ | No | Rotates LSB of ALU shifter to MSB of MQ shifter, and LSB of MQ shifter to MSB of ALU shifter. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
      = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform a circular right double precision shift of the data in register 6 (MSH) and MQ (LSH), and store the result back in register 6 and the MQ register.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 10010110 | 000110 | XX XXXX | 0 XX | 000110 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 |

Assume register file 6 holds 3708C618 (Hex) and MQ register holds 50A99A0F (Hex).

MSH

| Source | 00110111000010001100011000011000 | $\mathrm{R} \leftarrow \mathrm{RF}(6)$ |
| :---: | :---: | :---: |
| Intermediate Result | 00110111000010001100011000011000 | ALU shifter $\leftarrow \mathrm{R}+\mathrm{Cn}$ |
| Destination | 10011011100001000110001100001100 | $\mathrm{RF}(6) \leftarrow \mathrm{ALU}$ shift result |

## LSH
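Both halves of the double precision rotate can be modeled together. The sketch below is illustrative only: `rotate_right_double` is a hypothetical name, and the inputs are the MSH bit pattern tabulated above (3708C618 hex) and the stated MQ value 50A99A0F (Hex).

```python
def rotate_right_double(alu, mq, width=32):
    """Rotate the ALU:MQ pair right by one bit as a single 2*width word.

    Hypothetical helper; returns (alu_shift_result, mq_shift_result).
    """
    mask = (1 << width) - 1
    top = width - 1
    alu &= mask
    mq &= mask
    new_alu = (alu >> 1) | ((mq & 1) << top)   # MQ LSB wraps to ALU MSB
    new_mq = (mq >> 1) | ((alu & 1) << top)    # ALU LSB passes to MQ MSB
    return new_alu, new_mq


alu, mq = rotate_right_double(0x3708C618, 0x50A99A0F)
print(f"{alu:08X} {mq:08X}")  # 9B84630C 2854CD07
```

The first value agrees with the MSH destination row; the second is the MQ shift result this model produces for the LSH.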



## FUNCTION

Performs logical right shift on result of ALU operation specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is shifted one bit to the right. A zero is placed in bit 7 of the most significant byte of each word unless the $\overline{\mathrm{SIO}}$ input for the word is programmed low, which forces the sign bit to one. The LSB is dropped from the word, which may be 1, 2, or 4 bytes long depending on the selected configuration.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX. If SSF is low, the ALU result will be passed unshifted to the Y MUX.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Logical Right | None |

## Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals ${ }^{\ddagger}$

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shift result if high or floating; passes ALU result if low. |
| $\overline{\text{SIO0}}$ | Yes | Fills a zero in MSB of the word if high or floating; fills a one in MSB if low. |
| $\overline{\text{SIO1}}$ | Yes |  |
| $\overline{\text{SIO2}}$ | Yes |  |
| $\overline{\text{SIO3}}$ | Yes |  |
| Cn | No | Inactive |

[^14]
## EXAMPLE (assumes a 32-bit configuration)

Perform a logical right single precision shift on data on the DA bus, and store the result in register 1.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ | SSF |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00100110 | XX XXXX | XX XXXX | 1 XX | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | XXX1 | 0000 | 1 |

Assume DA bus holds 2DA8C615 (Hex).

| Source | 00101101101010001100011000010101 | $\mathrm{R} \leftarrow$ DA bus |
| :---: | :---: | :---: |
| Intermediate Result | 00101101101010001100011000010101 | ALU Shifter $\leftarrow \mathrm{R}+\mathrm{Cn}$ |
| Destination | 00010110110101000110001100001010 | $\mathrm{RF}(1) \leftarrow$ ALU shift result |

## FUNCTION

Performs logical right shift on MQ register (LSH) and result of ALU operation (MSH) specified in lower nibble of instruction field.

## DESCRIPTION

The result of the ALU operation specified in instruction bits I3-I0 is used as the upper half of a double precision word, the contents of the MQ register as the lower half.

The ALU result is shifted one bit to the right. A zero is placed in the sign bit of the most significant byte unless the $\overline{\mathrm{SIO}}$ input for that word is programmed low; this will force the sign bit to one. Bit 0 of the least significant byte is passed to bit 7 of the most significant byte of the MQ shifter. Bit 0 of the least significant byte of the MQ shifter is dropped.

The shift may be made conditional on SSF. If SSF is high or floating, the shift result will be sent to the Y MUX and MQ register. If SSF is low, the ALU result and MQ register will not be altered.
*A list of ALU operations that can be used with this instruction is given in Table 15.

## Shift Operations

| ALU Shifter | MQ Shifter |
| :---: | :---: |
| Logical Right | Logical Right |

Available Destination Operands (ALU Shifter)

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | Yes | Passes shift result if high; passes ALU result and retains MQ register if low. |
| $\overline{\text{SIO0}}$ | Yes | Fills a zero in MSB if high or floating; fills a one in MSB if low. |
| $\overline{\text{SIO1}}$ | Yes |  |
| $\overline{\text{SIO2}}$ | Yes |  |
| $\overline{\text{SIO3}}$ | Yes |  |
| Cn | No | Affects arithmetic operation specified in bits I3-I0 of instruction field. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB of result = 1
      = 0 if MSB of result = 0
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out condition
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Perform a logical right double precision shift of the data in register 1 (MSH) and MQ (LSH), filling a one into the most significant bit, and store the result back in register 1 and the MQ register.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 00110110 | XX XXXX | 000001 | X 00 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 0 | 110 | 1110 | 0000 |

Assume register file 1 holds 2DA8C615 (Hex) and MQ register holds 50A99A0E (Hex).
MSH

| Source | 00101101101010001100011000010101 | $R \leftarrow R F(1)$ |
| ---: | :--- | ---: |
|  |  |  |
| Intermediate <br> Result | 00101101101010001100011000010101 | ALU Shifter $\leftarrow \mathrm{S}+\mathrm{Cn}$ |
|  |  |  |
| Destination | 10010110110101000110001100001010 | $R F(1) \leftarrow$ ALU shift result |

## LSH

Source 01010000101010011001101000001110 MQ shifter $\leftarrow$ MQ register

Destination 10101000010101001100110100000111 MQ register $\leftarrow$ MQ shift result
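The fill behaviour of the double precision logical shift can be reproduced with a few lines of code. This is a sketch with hypothetical names: `logical_right_double` models the ALU:MQ pair, and `sio_low=True` stands for the $\overline{\text{SIO}}$ input of the word being programmed low so that a one is filled into the MSB.

```python
def logical_right_double(alu, mq, sio_low=False, width=32):
    """Shift the ALU:MQ pair right by one bit with a selectable MSB fill.

    Hypothetical helper; returns (alu_shift_result, mq_shift_result).
    """
    mask = (1 << width) - 1
    top = width - 1
    alu &= mask
    mq &= mask
    fill = 1 if sio_low else 0                 # SIO low fills a one
    new_alu = (alu >> 1) | (fill << top)
    new_mq = (mq >> 1) | ((alu & 1) << top)    # ALU LSB passes to MQ MSB
    return new_alu, new_mq


alu, mq = logical_right_double(0x2DA8C615, 0x50A99A0E, sio_low=True)
print(f"{alu:08X} {mq:08X}")  # 96D4630A A854CD07
```

With `sio_low=False` the same call returns 16D4630A for the upper half, which is the single precision logical shift result for the same operand.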

## FUNCTION

Subtracts four-bit immediate data on A3-A0 with carry from S-bus data.

## DESCRIPTION

Immediate data in the range 0 to 15, supplied by the user at A3-A0, is inverted and added with carry to S.

Available R Bus Source Operands (Constant)

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | Yes | No | No |

## Available S Bus Source Operands

| RF <br> (B5-BO) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands Shift Operations

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |



## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Inactive |
| $\overline{\text{SIO1}}$ | No | Inactive |
| $\overline{\text{SIO2}}$ | No | Inactive |
| $\overline{\text{SIO3}}$ | No | Inactive |
| Cn | Yes | Two's complement subtraction if programmed high. |

## Status Signals

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
    OVR = 1 if arithmetic signed overflow
    C = 1 if carry-out
```


## EXAMPLE (assumes a 32-bit configuration)

Subtract the value 12 from data on the DB bus, and store the result into register file 1.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 01111000 | 001100 | XX XXXX | X 10 | 000001 | 0 | 0000 | 10 | X | X | XXXX | 0 | 1 | 110 |

Assume bits A3-A0 hold C (Hex) and DB bus holds 24000100 (Hex).

| Source | 00000000000000000000000000001100 | $\mathrm{R} \leftarrow$ A3-A0 |
| :---: | :---: | :---: |
| Source | 00100100000000000000000100000000 | $\mathrm{S} \leftarrow$ DB bus |
| Destination | 00100100000000000000000011110100 | $\mathrm{RF}(1) \leftarrow \mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn}$ |
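The operation $\mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn}$ with a 4-bit immediate can be verified numerically. The function name `sub_immediate` is hypothetical; the sketch zero-extends the immediate to the word width before inverting it, as the source pattern above suggests.

```python
def sub_immediate(s, imm4, cn=1, width=32):
    """Compute R' + S + Cn, where R is a zero-extended 4-bit immediate."""
    mask = (1 << width) - 1
    r = imm4 & 0xF                 # immediate supplied on A3-A0
    r_inverted = ~r & mask         # one's complement of R
    return (r_inverted + s + cn) & mask


print(f"{sub_immediate(0x24000100, 0xC):08X}")        # 240000F4
print(f"{sub_immediate(0x24000100, 0xC, cn=0):08X}")  # 240000F3
```

With Cn high the result is the two's complement difference S - R; with Cn low the result is one less.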

## SUBR

## FUNCTION

Subtracts data on the $R$ bus from $S$ with carry.

## DESCRIPTION

Data on the $R$ bus is subtracted with carry from data on the $S$ bus. The result appears at the ALU and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| SSF | No | Affect shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No |  |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Two's complement subtraction if programmed high. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
    N = 1 if MSB = 1
    OVR = 1 if signed arithmetic overflow
    C = 1 if carry-out
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Subtract data in register 1 from data on the DB bus, and store the result in the MQ register.

| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$, EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 11100010 | 000001 | XX XXXX | 0 10 | XX XXXX | 1 | XXXX | XX | X | X | XXXX | 0 | 1 | 110 |

Assume register file 1 holds 150084D0 (Hex) and DB bus holds 4900C350 (Hex).

| Source | 00010101000000001000010011010000 | $\mathrm{R} \leftarrow \mathrm{RF}(1)$ |
| :---: | :---: | :---: |
| Source | 01001001000000001100001101010000 | $\mathrm{S} \leftarrow$ DB bus |
| Destination | 00110100000000000011111010000000 | MQ register $\leftarrow \mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn}$ |
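The result and the four status signals for $\mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn}$ can be modeled together. This is a sketch: `subtract_r_from_s` is a hypothetical name, no shift is applied, and OVR is computed as ordinary signed overflow of the addition of R' and S.

```python
def subtract_r_from_s(r, s, cn=1, width=32):
    """Compute F = R' + S + Cn and the ZERO, N, OVR and C status values."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    r_inverted = ~r & mask
    total = r_inverted + (s & mask) + cn
    f = total & mask
    status = {
        "ZERO": int(f == 0),
        "N": f >> (width - 1),
        # Overflow when R' and S share a sign that the result does not.
        "OVR": int(((r_inverted ^ f) & (s ^ f) & sign_bit) != 0),
        "C": total >> width,
    }
    return f, status


f, status = subtract_r_from_s(0x150084D0, 0x4900C350)
print(f"{f:08X}", status)
# 34003E80 {'ZERO': 0, 'N': 0, 'OVR': 0, 'C': 1}
```

A carry-out of 1 here indicates that no borrow was needed, since S is larger than R in this example.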

## FUNCTION

Subtracts data on the $S$ bus from $R$ with carry.

## DESCRIPTION

Data on the $S$ bus is subtracted with carry from data on the $R$ bus. The result appears at the ALU and MQ shifters.
*The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Affect shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No |  |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Two's complement subtraction if programmed high. |

## Status Signals ${ }^{\dagger}$

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =1 \text { if MSB }=1 \\
\text { OVR } & =1 \text { if signed arithmetic overflow } \\
C & =1 \text { if carry-out }
\end{aligned}
$$

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Subtract data on the DB bus from data in register 1, and store the result in the MQ register.


Assume register file 1 holds 150084D0 (Hex) and DB bus holds 4900C350 (Hex).

| Source | 00010101000000001000010011010000 | $\mathrm{R} \leftarrow \mathrm{RF}(1)$ |
| :---: | :---: | :---: |
| Source | 01001001000000001100001101010000 | $\mathrm{S} \leftarrow$ DB bus |
| Destination | 11001011111111111100000110000000 | MQ register $\leftarrow \mathrm{R}+\mathrm{S}^{\prime}+\mathrm{Cn}$ |
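The destination pattern above is a negative two's complement number. The short sketch below, using the hypothetical helpers `subtract_s_from_r` and `to_signed`, computes $\mathrm{R}+\mathrm{S}^{\prime}+\mathrm{Cn}$ and interprets the 32-bit result as a signed value.

```python
def subtract_s_from_r(r, s, cn=1, width=32):
    """Compute R + S' + Cn, truncated to `width` bits."""
    mask = (1 << width) - 1
    return ((r & mask) + (~s & mask) + cn) & mask


def to_signed(value, width=32):
    """Interpret a `width`-bit pattern as a two's complement integer."""
    sign_bit = 1 << (width - 1)
    return (value ^ sign_bit) - sign_bit


f = subtract_s_from_r(0x150084D0, 0x4900C350)
print(f"{f:08X}")            # CBFFC180
print(hex(to_signed(f)))     # -0x34003e80
```

The magnitude 34003E80 (Hex) is the difference of the same two operands taken in the opposite order.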

## FUNCTION

Tests bits in selected bytes of S-bus data for zeros using mask in C3-C0::A3-A0.

## DESCRIPTION

The S bus is the source word for this instruction. The source word is passed to the ALU, where it is compared to an 8-bit mask, consisting of a concatenation of the C3-C0 and A3-A0 address ports (C3-C0::A3-A0). The mask is input via the R bus. The test will pass if the selected byte has zeros at all bit locations specified by the ones of the mask. Bytes are selected by programming the $\overline{\mathrm{SIO}}$ inputs low. Test results are indicated on the ZERO output, which goes to one if the test passes. Register write is internally disabled during this instruction.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> :: <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | Yes |

## Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | Yes | Byte Select |
| $\overline{\text{SIO1}}$ | Yes | Byte Select |
| $\overline{\text{SIO2}}$ | Yes | Byte Select |
| $\overline{\text{SIO3}}$ | Yes | Byte Select |
| Cn | No | Inactive |

## Status Signals

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result (selected bytes) }=\text { Pass } \\
N & =0 \\
\text { OVR } & =0 \\
C & =0
\end{aligned}
$$

## EXAMPLE (assumes a 32-bit configuration)

Test bits 7, 6 and 5 of bytes 0 and 2 of data in register 3 for zeroes.

|  |  |  |  |  | Destination Selects |  |  |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Instr <br> Code <br> I7-I0 | Mask <br> (LSH) <br> A3-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Mask <br> (MSH) <br> C3-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ |
| 00111000 | 0000 | 000011 | X00 | 1110 | X | XXXX | XX | X | X | XXXX | 0 | X | 110 | 1010 | 0000 |

Assume register file 3 holds 881CD003 (Hex).

Source $11100000111000001110000011100000 \quad R \leftarrow$ Mask (C3-C0::A3-A0)

Source $10001000000111001101000000000011 \quad S_n \leftarrow RF(3)_n{ }^{\dagger}$

${ }^{\dagger} n = n$th byte

Output ZERO $\leftarrow 1$
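The byte-select and mask logic of this example can be sketched in software as follows. This is a hedged model, not a description of the device internals: the function name `tb0` is hypothetical, and the sketch assumes that byte 0 is the least significant byte and that $\overline{\text{SIOn}}$ selects byte n when low, which is consistent with the example above.

```python
def tb0(s_word, mask8, sio_n):
    """Return the ZERO flag for a test-bits-for-zero operation.

    s_word : 32-bit S-bus word
    mask8  : 8-bit mask, C3-C0 (upper nibble) :: A3-A0 (lower nibble)
    sio_n  : 4-bit SIO3-SIO0 value; a 0 bit selects that byte
    """
    for byte in range(4):
        if (sio_n >> byte) & 1 == 0:           # byte selected (active low)
            data = (s_word >> (8 * byte)) & 0xFF
            if data & mask8:                   # a masked bit is one: test fails
                return 0
    return 1                                   # all masked bits were zero


mask = (0b1110 << 4) | 0b0000                  # C3-C0 = 1110, A3-A0 = 0000
print(tb0(0x881CD003, mask, 0b1010))           # bytes 0 and 2 selected
```

For the operands in the example, bytes 0 and 2 are 03 and 1C (Hex); bits 7, 6 and 5 are zero in both, so the model returns ZERO = 1.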

## FUNCTION

Tests bits in selected bytes of S-bus data for ones using mask in C3-C0::A3-A0.

## DESCRIPTION

The S bus is the source word for this instruction. The source word is passed to the ALU, where it is compared to an 8-bit mask, consisting of a concatenation of the C3-C0 and A3-A0 address ports (C3-C0::A3-A0). The mask is input via the R bus. The test will pass if the selected byte has ones at all bit locations specified by the ones of the mask. Bytes are selected by programming the $\overline{\mathrm{SIO}}$ inputs low. Test results are indicated on the ZERO output, which goes to one if the test passes. Register write is internally disabled for this instruction.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| No | No | No | Yes |

## Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | Yes | Byte Select |
| $\overline{\text{SIO1}}$ | Yes | Byte Select |
| $\overline{\text{SIO2}}$ | Yes | Byte Select |
| $\overline{\text{SIO3}}$ | Yes | Byte Select |
| Cn | No | Inactive |

## Status Signals

```
ZERO = 1 if result (selected bytes) = Pass
N    = 0
OVR  = 0
C    = 0
```


## EXAMPLE (assumes a 32-bit configuration)

Test bits 7, 6 and 5 of bytes 1 and 2 of data in register 3 for ones.

|  |  |  |  |  | Destination Selects |  |  |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Instr <br> Code <br> I7-I0 | Mask <br> (LSH) <br> A3-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Mask <br> (MSH) <br> C3-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 | $\overline{\text{SIO3}}$-$\overline{\text{SIO0}}$ | $\overline{\text{IESIO3}}$-$\overline{\text{IESIO0}}$ |
| 00101000 | 0000 | 000011 | X00 | 1110 | X | XXXX | XX | X | X | XXXX | 0 | X | 110 | 1001 | 0000 |

Assume register file 3 holds 881CF003 (Hex).

${ }^{\dagger} n=n$th byte

## FUNCTION

Performs one of $\mathrm{N}-2$ iterations of nonrestoring unsigned division by a test subtraction of the N -bit divisor from the 2 N -bit dividend. An algorithm using this instruction can be found in the "Other Arithmetic Instructions" section.

## DESCRIPTION

UDIVI performs a test subtraction of the divisor from the dividend to generate a quotient bit. The test subtraction may pass or fail and is corrected in the subsequent instruction if it fails. Similarly, a failed test from the previous instruction is corrected during evaluation of the current UDIVI instruction (see the "Other Arithmetic Instructions" section for more details).

The R bus must be loaded with the divisor, the S bus with the most significant half of the result of the previous instruction (UDIVI during iteration or UDIVIS at the beginning of iteration). The least significant half of the previous result is in the MQ register.

UDIVI checks the result of the previous pass/fail test and then evaluates:

$$
\begin{array}{ll}
F \leftarrow R+S & \text { if the test is failed } \\
F \leftarrow R^{\prime}+S+C n & \text { if the test is passed }
\end{array}
$$

A double precision left shift is performed; bit 7 of the most significant byte of the MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte of the ALU shifter is lost. The unfixed quotient bit is circulated into the least significant bit of the MQ shifter.
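The add-or-subtract decision and the double precision left shift can be illustrated with a generic software model of nonrestoring unsigned division. The sketch below is a textbook formulation under stated assumptions, not the microcode sequence from the "Other Arithmetic Instructions" section: the function name `nonrestoring_divide` is hypothetical, the partial remainder is held as a signed Python integer rather than in a register pair, and the final remainder correction is folded into the same routine.

```python
def nonrestoring_divide(dividend, divisor, n=32):
    """Divide a 2n-bit unsigned dividend by an n-bit unsigned divisor.

    Returns (quotient, remainder). The upper half of the dividend plays
    the role of the S-bus operand, the lower half the MQ register.
    """
    if divisor == 0 or (dividend >> n) >= divisor:
        raise ValueError("divide overflow: quotient does not fit in n bits")
    mask = (1 << n) - 1
    rem = dividend >> n          # partial remainder (may go negative)
    mq = dividend & mask         # low half; quotient bits shift in here
    for _ in range(n):
        # Double precision left shift: MSB of mq moves into LSB of rem.
        rem = (rem << 1) | (mq >> (n - 1))
        mq = (mq << 1) & mask
        # Previous test passed (rem >= 0): subtract. Failed: add back.
        if rem >= 0:
            rem -= divisor
        else:
            rem += divisor
        if rem >= 0:             # this test passed: quotient bit is 1
            mq |= 1
    if rem < 0:                  # final correction of a failed last test
        rem += divisor
    return mq, rem


print(nonrestoring_divide(100, 7, n=8))
```

Each loop pass corresponds loosely to one divide iteration: a failed test subtraction is not undone immediately but is compensated by adding the divisor on the next pass.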

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

## Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

## Control/Data Signals

| Signal | User Programmable | Use |
| :---: | :---: | :---: |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Passes internally generated end-fill bit. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Should be programmed high. |

## Status Signals

$$
\begin{aligned}
\text { ZERO } & =1 \text { if result }=0 \\
N & =0 \\
\text { OVR } & =0 \\
C & =1 \text { if carry-out }
\end{aligned}
$$

## FUNCTION

Computes the first quotient bit of nonrestoring unsigned division. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

UDIVIS computes the first quotient bit during nonrestoring unsigned division by subtracting the divisor from the dividend. The resulting remainder due to subtraction may be negative; the subsequent UDIVI instruction may have to restore the remainder during the next operation.

The $R$ bus must be loaded with the divisor and the $S$ bus with the most significant half of the remainder. The result on the $Y$ bus should be loaded back into the register file for use in the next instruction. The least significant half of the remainder is in the MQ register.

UDIVIS computes:

$$
\mathrm{F} \leftarrow \mathrm{R}^{\prime}+\mathrm{S}+\mathrm{Cn}
$$

A double precision left shift is performed; bit 7 of the most significant byte of the MQ shifter is transferred to bit 0 of the least significant byte of the ALU shifter. Bit 7 of the most significant byte of the ALU shifter is lost. The unfixed quotient bit is circulated into the least significant bit of the MQ shifter.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

## Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Shift Operations

| ALU | MQ |
| :---: | :---: |
| Left | Left |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Passes internally generated end-fill bit. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Should be programmed high. |

## Status Signals

```
ZERO = 1 if intermediate result = 0
N    = 0
OVR  = 1 if divide overflow
C    = 1 if carry-out
```


## FUNCTION

Solves the final quotient bit during nonrestoring unsigned division. An algorithm using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

UDIVIT performs the final subtraction of the divisor from the remainder during nonrestoring unsigned division. UDIVIT is preceded by $\mathrm{N}-1$ iterations of UDIVI, where $N$ is the number of bits in the dividend.

The R bus must be loaded with the divisor, and the S bus must be loaded with the most significant half of the result of the last UDIVI instruction. The least significant half lies in the MQ register. The $Y$ bus result must be loaded back into the register file for use in the subsequent DIVRF instruction.

UDIVIT checks the results of the previous pass/fail test and evaluates:

$$
\begin{array}{ll}
Y \leftarrow R+S & \text { if the test is failed } \\
Y \leftarrow R^{\prime}+S+C n & \text { if the test is passed }
\end{array}
$$

The contents of the MQ register are shifted one bit to the left; the unfixed quotient bit is circulated into the least significant bit.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

## Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Shift Operations

| ALU | MQ |
| :---: | :---: |
| None | Left |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Inactive |
| $\overline{\text{SIO0}}$ | No | Passes internally generated end-fill bit. |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Should be programmed high. |

## Status Signals

```
ZERO = 1 if intermediate result = 0
N    = 0
OVR  = 0
C    = 1 if carry-out
```


## FUNCTION

Performs one of N unsigned multiplication iterations for computing an N -bit by N -bit product. An algorithm for unsigned multiplication using this instruction is given in the "Other Arithmetic Instructions" section.

## DESCRIPTION

UMULI checks to determine whether the multiplicand should be added with the present partial product. The instruction evaluates:

$$
\begin{array}{ll}
\mathrm{F} \leftarrow \mathrm{R}+\mathrm{S}+\mathrm{Cn} & \text { if the addition is required } \\
\mathrm{F} \leftarrow \mathrm{~S} & \text { if no addition is required }
\end{array}
$$

A double precision right shift is performed. Bit 0 of the least significant byte of the ALU shifter is passed to bit 7 of the most significant byte of the MQ shifter; carry-out is passed to the most significant bit of the ALU shifter.

The $S$ bus should be loaded with the contents of an accumulator and the $R$ bus with the multiplicand. The Y bus result should be written back to the accumulator after each iteration of UMULI. The accumulator should be cleared and the MQ register loaded with the multiplier before the first iteration.
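The iteration described above is the classic shift-and-add scheme, which can be sketched in software. The model below is illustrative only: the function name `umul` is hypothetical, Cn is assumed low so that no extra one is added, and the accumulator and MQ register are ordinary Python integers.

```python
def umul(multiplicand, multiplier, n=32):
    """Unsigned n-bit by n-bit multiply by n shift-and-add iterations.

    acc models the accumulator (S-bus operand, rewritten from the Y bus),
    mq models the MQ register, initially holding the multiplier.
    Returns the 2n-bit product as (acc :: mq).
    """
    mask = (1 << n) - 1
    acc = 0                                   # accumulator cleared first
    mq = multiplier & mask                    # MQ loaded with the multiplier
    for _ in range(n):
        # LSB of MQ decides whether the multiplicand is added.
        f = acc + (multiplicand & mask) if mq & 1 else acc
        # Double precision right shift: LSB of F enters MSB of MQ,
        # and the carry-out (bit n of f) enters the MSB of the ALU result.
        mq = (mq >> 1) | ((f & 1) << (n - 1))
        acc = f >> 1
    return (acc << n) | mq


print(hex(umul(0xFFFFFFFF, 0xFFFFFFFF)))
```

After N iterations the upper half of the product is in the accumulator and the lower half has replaced the multiplier in MQ.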

## R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Recommended S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | No |

## Recommended Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port |
| :---: | :---: | :---: |
| Yes | No | Yes |

## Shift Operations

| ALU | MQ |
| :---: | :---: |
| Right | Right |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Holds LSB of MQ. |
| $\overline{\text{SIO0}}$ | No | Passes internal input (shifted bit). |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | Yes | Should be programmed low. |

## Status Signals ${ }^{\dagger}$

```
ZERO = 1 if result = 0
N    = 1 if MSB = 1
OVR  = 0
C    = 1 if carry-out
```

${ }^{\dagger}$ Valid only on final execution of multiply iteration

## FUNCTION

Evaluates the logical expression R XOR S.

## DESCRIPTION

Data on the R bus is exclusive ORed with data on the S bus. The result appears at the ALU and MQ shifters.

The result of this instruction can be shifted in the same microcycle by specifying a shift instruction in the upper nibble (I7-I4) of the instruction field. The result may also be passed without shift. Possible instructions are listed in Table 15.

## Available R Bus Source Operands

| RF <br> (A5-A0) | A3-A0 <br> Immed | DA-Port | C3-C0 <br> $::$ <br> A3-A0 <br> Mask |
| :---: | :---: | :---: | :---: |
| Yes | No | Yes | No |

## Available S Bus Source Operands

| RF <br> (B5-B0) | DB-Port | MQ <br> Register |
| :---: | :---: | :---: |
| Yes | Yes | Yes |

## Available Destination Operands

| RF <br> (C5-C0) | RF <br> (B5-B0) | Y-Port | ALU <br> Shifter | MQ <br> Shifter |
| :---: | :---: | :---: | :---: | :---: |
| Yes | No | Yes | Yes | Yes |

## Control/Data Signals

| Signal | User <br> Programmable | Use |
| :--- | :--- | :--- |
| $\overline{\text{SSF}}$ | No | Affect shift instructions programmed in bits I7-I4 of instruction field. |
| $\overline{\text{SIO0}}$ | No |  |
| $\overline{\text{SIO1}}$ | No |  |
| $\overline{\text{SIO2}}$ | No |  |
| $\overline{\text{SIO3}}$ | No |  |
| Cn | No | Inactive |

## Status Signals ${ }^{\dagger}$

```
ZERO \(=1\) if result \(=0\)
    \(N=1\) if MSB \(=1\)
    \(O V R=0\)
    \(C=0\)
```

${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

## EXAMPLE (assumes a 32-bit configuration)

Exclusive OR the contents of register 3 and register 5, and store the result in register 5.

|  |  |  |  |  | Destination Selects |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Instr <br> Code <br> I7-I0 | Oprd <br> Addr <br> A5-A0 | Oprd <br> Addr <br> B5-B0 | Oprd Sel <br> $\overline{\text{EA}}$ EB1-EB0 | Dest <br> Addr <br> C5-C0 | SELMQ | $\overline{\text{WE3}}$-$\overline{\text{WE0}}$ | SELRF1-SELRF0 | $\overline{\text{OEA}}$ | $\overline{\text{OEB}}$ | $\overline{\text{OEY3}}$-$\overline{\text{OEY0}}$ | $\overline{\text{OES}}$ | Cn | CF2-CF0 |
| 11111001 | 000011 | 000101 | 000 | 000101 | 0 | 0000 | 10 | X | X | XXXX | 0 | X | 110 |

Assume register file 3 holds 33F6D840 (Hex) and register file 5 holds 90F6D842 (Hex).

Source $00110011111101101101100001000000 \quad R \leftarrow RF(3)$

Source $10010000111101101101100001000010 \quad S \leftarrow RF(5)$

Destination $10100011000000000000000000000010 \quad RF(5) \leftarrow R \text{ XOR } S$


## SN74ACT8836 32-Bit by 32-Bit Multiplier/Accumulator

The SN74ACT8836 is a 32-bit integer multiplier/accumulator (MAC) that accepts two 32-bit inputs and computes a 64-bit product. An on-board adder is provided to add or subtract the product or the complement of the product from the accumulator.

To speed up calculations, many modern systems off-load frequently performed multiply/accumulate operations to a dedicated single-cycle MAC. In such an arrangement, the 'ACT8836 MAC can accelerate 32-bit microprocessors, building block processors, or custom CPUs. The 'ACT8836 is well suited for digital signal processing applications, including fast Fourier transforms, digital filtering, power series expansion, and correlation.


## SN74ACT8836 <br> 32-BIT BY 32-BIT MULTIPLIER/ACCUMULATOR

- Performs Full 32-Bit by 32-Bit Multiply/Accumulate in Flow-Through Mode in 60 ns (Max)
- Can be Pipelined for 36 ns (Max) Operation
- Performs 64-Bit by 64-Bit Multiplication in Five Cycles
- Supports Division Using Newton-Raphson Approximation
- Signed, Unsigned, or Mixed-Mode Multiply Operations
- EPIC™ (Enhanced-Performance Implanted CMOS) $1-\mu \mathrm{m}$ Process
- Multiplier, Multiplicand, and Product Can be Complemented
- Accumulator Bypass Option
- TTL I/O Voltage Compatibility
- Three Independent 32-Bit Buses for Multiplicand, Multiplier, and Product
- Parity Generation/Checking
- Master/Slave Fault Detection
- Single 5-V Power Supply
- Integer or Fractional Rounding


## description

The 'ACT8836 is a 32-bit by 32-bit parallel multiplier/accumulator suitable for low-power, high-speed operations in applications such as digital signal processing, array processing, and numeric data processing. High speed is achieved through the use of a Booth and Wallace Tree architecture.

Data is input to the chip through two registered 32-bit DA and DB input ports and output through a registered 32-bit Y output port. These registers have independent clock enable signals and can be made transparent for flowthrough operations.

The device can perform two's complement, unsigned, and mixed-data arithmetic. It can also operate as a 64-bit by 64-bit multiplier. Five clock cycles are required to perform a 64-bit by 64-bit multiplication and multiplex the 128-bit result. Division is supported using Newton-Raphson approximation.

A multiply/accumulate mode is provided to add or subtract the accumulator from the product or the complement of the product. The accumulator is 67 bits wide to accommodate possible overflow. A warning flag (ETPERR) indicates whether overflow has occurred.

A rounding feature in the 'ACT8836 allows the result to be truncated or rounded to the nearest 32 bits. To ensure data integrity, byte parity checking is provided at the input ports, and a parity generator and master/slave error detection comparator are provided at the output port.

The SN74ACT8836 is characterized for operation from $0^{\circ} \mathrm{C}$ to $70^{\circ} \mathrm{C}$.

logic symbol

functional block diagram (positive logic)


GB PACKAGE PIN ASSIGNMENTS

| PIN |  | PIN |  | PIN |  | PIN |  | PIN |  | PIN |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| NO. | NAME | NO. | NAME | NO. | NAME | NO. | NAME | NO. | NAME | NO. | NAME |
| A1 | Y8 | B12 | YETP1 | D14 | PY0 | H12 | COMPL | M3 | DB18 | P5 | DB25 |
| A2 | Y10 | B13 | YETP0 | D15 | ETPERR | H13 | FT0 | M7 | PB1 | P6 | DB29 |
| A3 | Y11 | B14 | YETP2 | E1 | SELREG | H14 | EA | M8 | PA0 | P7 | DB31 |
| A4 | Y13 | B15 | PY3 | E2 | Y3 | H15 | $\overline{\text{CKEA}}$ | M10 | DA6 | P8 | PERRA |
| A5 | Y14 | C1 | Y0 | E3 | GND | J1 | DB2 | M13 | DA16 | P9 | PA2 |
| A6 | Y16 | C2 | Y4 | E13 | GND | J2 | DB3 | M14 | DA17 | P10 | DA2 |
| A7 | Y18 | C3 | EB | E14 | PY2 | J3 | DB5 | M15 | DA25 | P11 | DA8 |
| A8 | Y19 | C4 | Y5 | E15 | RND1 | J4 | DB7 | N1 | DB10 | P12 | DA12 |
| A9 | Y21 | C5 | $V_{CC}$ | F1 | SFT0 | J12 | DA26 | N2 | DB19 | P13 | DA14 |
| A10 | Y23 | C6 | GND | F2 | Y1 | J13 | DA24 | N3 | DB20 | P14 | DA11 |
| A11 | Y25 | C7 | Y15 | F3 | GND | J14 | DA30 | N4 | DB21 | P15 | DA21 |
| A12 | Y27 | C8 | GND | F13 | GND | J15 | DA31 | N5 | DB23 | R1 | DB14 |
| A13 | Y28 | C9 | Y22 | F14 | MSERR | K1 | DB4 | N6 | DB27 | R2 | DB26 |
| A14 | Y30 | C10 | GND | F15 | DASGN | K2 | DB9 | N7 | $V_{CC}$ | R3 | DB28 |
| A15 | PY1 | C11 | $V_{CC}$ | G1 | SELD | K3 | DB11 | N8 | GND | R4 | DB30 |
| B1 | Y2 | C12 | CKEY | G2 | SGNEXT | K13 | DA22 | N9 | DA0 | R5 | PB0 |
| B2 | Y6 | C13 | $\overline{\text{OEY}}$ | G3 | WELS | K14 | DA28 | N10 | DA4 | R6 | PB2 |
| B3 | SELY | C14 | ACC0 | G4 | SFT1 | K15 | DA29 | N11 | DA10 | R7 | PB3 |
| B4 | Y7 | C15 | PERRY | G12 | RND0 | L1 | DB6 | N12 | DA13 | R8 | PERRB |
| B5 | Y9 | D1 | WEMS | G13 | DBSGN | L2 | DB15 | N13 | DA15 | R9 | PA1 |
| B6 | Y12 | D2 | TP1 | G14 | $\overline{\text{CKEI}}$ | L3 | DB13 | N14 | DA19 | R10 | PA3 |
| B7 | Y17 | D3 | TP0 | G15 | FT1 | L13 | DA18 | N15 | DA23 | R11 | DA1 |
| B8 | Y20 | D7 | GND | H1 | CLK | L14 | DA20 | P1 | DB12 | R12 | DA3 |
| B9 | Y26 | D8 | $V_{CC}$ | H2 | $\overline{\text{CKEB}}$ | L15 | DA27 | P2 | DB16 | R13 | DA5 |
| B10 | Y29 | D9 | Y24 | H3 | DB0 | M1 | DB8 | P3 | DB24 | R14 | DA7 |
| B11 | Y31 | D13 | ACC1 | H4 | DB1 | M2 | DB17 | P4 | DB22 | R15 | DA9 |


| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| ACC0 | C14 | I | Accumulate mode opcode (see Table 2) |
| ACC1 | D13 | I | Accumulate mode opcode (see Table 2) |
| CLK | H1 | I | System clock |
| $\overline{\text{CKEA}}$ | H15 | I | Clock enable for A register, active low |
| $\overline{\text{CKEB}}$ | H2 | I | Clock enable for B register, active low |
| $\overline{\text{CKEI}}$ | G14 | I | Clock enable for I register, active low |
| $\overline{\text{CKEY}}$ | C12 | I | Clock enable for Y register, active low |
| COMPL | H12 | I | Product complement control; high complements multiplier result, low passes multiplier result unaltered to accumulator. |
| DA0 | N9 |  |  |
| DA1 | R11 |  |  |
| DA2 | P10 |  |  |
| DA3 | R12 |  |  |
| DA4 | N10 |  |  |
| DA5 | R13 |  |  |
| DA6 | M10 |  |  |
| DA7 | R14 |  |  |
| DA8 | P11 |  |  |
| DA9 | R15 |  |  |
| DA10 | N11 |  |  |
| DA11 | P14 |  |  |
| DA12 | P12 |  |  |
| DA13 | N12 |  |  |
| DA14 | P13 |  |  |
| DA15 | N13 | I | DA port input data bits 0 through 31 |
| DA16 | M13 |  |  |
| DA17 | M14 |  |  |
| DA18 | L13 |  |  |
| DA19 | N14 |  |  |
| DA20 | L14 |  |  |
| DA21 | P15 |  |  |
| DA22 | K13 |  |  |
| DA23 | N15 |  |  |
| DA24 | J13 |  |  |
| DA25 | M15 |  |  |
| DA26 | J12 |  |  |
| DA27 | L15 |  |  |
| DA28 | K14 |  |  |
| DA29 | K15 |  |  |
| DA30 | J14 |  |  |
| DA31 | J15 |  |  |
| DASGN | F15 | I | Sign magnitude control; high identifies DA input data as two's complement, low identifies DA input data as unsigned. |


| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| DB0 | H3 | I | DB port input data bits 0 through 31 |
| DB1 | H4 |  |  |
| DB2 | J1 |  |  |
| DB3 | J2 |  |  |
| DB4 | K1 |  |  |
| DB5 | J3 |  |  |
| DB6 | L1 |  |  |
| DB7 | J4 |  |  |
| DB8 | M1 |  |  |
| DB9 | K2 |  |  |
| DB10 | N1 |  |  |
| DB11 | K3 |  |  |
| DB12 | P1 |  |  |
| DB13 | L3 |  |  |
| DB14 | R1 |  |  |
| DB15 | L2 |  |  |
| DB16 | P2 |  |  |
| DB17 | M2 |  |  |
| DB18 | M3 |  |  |
| DB19 | N2 |  |  |
| DB20 | N3 |  |  |
| DB21 | N4 |  |  |
| DB22 | P4 |  |  |
| DB23 | N5 |  |  |
| DB24 | P3 |  |  |
| DB25 | P5 |  |  |
| DB26 | R2 |  |  |
| DB27 | N6 |  |  |
| DB28 | R3 |  |  |
| DB29 | P6 |  |  |
| DB30 | R4 |  |  |
| DB31 | P7 |  |  |
| DBSGN | G13 | I | Sign magnitude control; high identifies DB input data as two's complement, low identifies DB input data as unsigned. |
| $\overline{\text{EA}}$ | H14 | I | Core multiplier operand select. A high on this signal selects DA register for input on the R bus; a low selects the swap MUX. |
| $\overline{\text{EB}}$ | C3 | I | Core multiplier operand select. A high on this signal selects DB register for input on the S bus; a low selects the swap MUX. |
| ETPERR | D15 | O | Equality check result. A low on this signal indicates that bits 67 through 64 of the core multiplier results are equal to bit 63. |
| FT0 <br> FT1 | H13 <br> G15 | I | Feedthrough control signals for A, B, I, Pipeline and Y registers (see Table 4). |


| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| GND | C6 |  | Ground pins. All ground pins should be used and connected. |
| GND | C8 |  |  |
| GND | C10 |  |  |
| GND | D7 |  |  |
| GND | E3 |  |  |
| GND | E13 |  |  |
| GND | F3 |  |  |
| GND | F13 |  |  |
| GND | N8 |  |  |
| MSERR | F14 | O | Master/slave error flag. This signal goes high when the contents of the Y output multiplexer and the value at the external port are not equal. |
| $\overline{\text{OEY}}$ | C13 | I | Y, YETP2-YETP0, and PY3-PY0 output enable, active low. |
| PA0 | M8 | I | Parity input data bus for DA input data |
| PA1 | R9 |  |  |
| PA2 | P9 |  |  |
| PA3 | R10 |  |  |
| PB0 | R5 | I | Parity input data bus for DB input data |
| PB1 | M7 |  |  |
| PB2 | R6 |  |  |
| PB3 | R7 |  |  |
| PY0 | D14 | I/O | Y output parity data bus. Outputs data from parity generator ($\overline{\text{OEY}}$ = L) or inputs external parity data ($\overline{\text{OEY}}$ = H). |
| PY1 | A15 |  |  |
| PY2 | E14 |  |  |
| PY3 | B15 |  |  |
| PERRA | P8 | O | DA port parity status pin. Goes high if even-parity test on any byte fails. |
| PERRB | R8 | O | DB port parity status pin. Goes high if even-parity test on any byte fails. |
| PERRY | C15 | O | Y port parity status pin. Goes high if even-parity test on any byte fails. |
| RND0 | G12 | I | Multiplier/accumulator rounding control; high rounds integer result; low leaves result unaltered. |
| RND1 | E15 | I | Multiplier/accumulator rounding control; high rounds fractional result; low leaves result unaltered. |
| SELD | G1 | I | D multiplexer select. High selects DA and DB ports; low selects multiplier core output. |
| SELREG | E1 | I | Write enable for temporary register and accumulator. High enables the temporary register; low enables the accumulator. |
| SELY | B3 | I | Y multiplexer select. High selects most significant 32 bits of Y register output; low selects least significant 32 bits. |
| SGNEXT | G2 | I | Sign extend control for multiplexer. A low fills shift matrix bits 66-64 with zeros; a high fills bits 66-64 with DA31. |
| SFT0 | F1 | I | Shift multiplexer control (see Table 3). |
| SFT1 | G4 | I | Shift multiplexer control (see Table 3). |
| TP0 | D3 | I | Test pins (see Table 5) |
| TP1 | D2 | I | Test pins (see Table 5) |
| $V_{CC}$ | C5 |  | Supply voltage (5 V) |
| $V_{CC}$ | C11 |  |  |
| $V_{CC}$ | D8 |  |  |
| $V_{CC}$ | N7 |  |  |
| WEMS | D1 | I | Write enable for most significant 32 bits of temporary register and accumulator, active low. |
| WELS | G3 | I | Write enable for least significant 32 bits of temporary register and accumulator, active low. |


| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| Y0 | C1 | I/O | Y port data bus. Outputs data from Y register ($\overline{\text{OEY}}$ = L); inputs data to master/slave comparator ($\overline{\text{OEY}}$ = H). |
| Y1 | F2 |  |  |
| Y2 | B1 |  |  |
| Y3 | E2 |  |  |
| Y4 | C2 |  |  |
| Y5 | C4 |  |  |
| Y6 | B2 |  |  |
| Y7 | B4 |  |  |
| Y8 | A1 |  |  |
| Y9 | B5 |  |  |
| Y10 | A2 |  |  |
| Y11 | A3 |  |  |
| Y12 | B6 |  |  |
| Y13 | A4 |  |  |
| Y14 | A5 |  |  |
| Y15 | C7 |  |  |
| Y16 | A6 |  |  |
| Y17 | B7 |  |  |
| Y18 | A7 |  |  |
| Y19 | A8 |  |  |
| Y20 | B8 |  |  |
| Y21 | A9 |  |  |
| Y22 | C9 |  |  |
| Y23 | A10 |  |  |
| Y24 | D9 |  |  |
| Y25 | A11 |  |  |
| Y26 | B9 |  |  |
| Y27 | A12 |  |  |
| Y28 | A13 |  |  |
| Y29 | B10 |  |  |
| Y30 | A14 |  |  |
| Y31 | B11 |  |  |
| YETP0 | B13 | I/O | Data bus for extended precision product. Outputs three most significant bits of the 67-bit multiplier core result; inputs external data to master/slave comparator. |
| YETP1 | B12 |  |  |
| YETP2 | B14 |  |  |

TABLE 1. INSTRUCTION INPUTS

| Signal | High | Low |
| :--- | :--- | :--- |
| DASGN | Identifies DA input data as two's complement | Identifies DA input data as unsigned |
| DBSGN | Identifies DB input data as two's complement | Identifies DB input data as unsigned |
| RND0 | Rounds integer result | Leaves integer result unaltered |
| RND1 | Rounds fractional result | Leaves fractional result unaltered |
| COMPL | Complements the product from the multiplier <br> before passing it to the accumulator | Passes the product from the multiplier to the <br> accumulator unaltered |
| ACC0 | See Table 2 | See Table 2 |
| ACC1 | See Table 2 | See Table 2 |


TABLE 2. MULTIPLIER/ADDER CONTROL INPUTS

| ACC1 | ACC0 | $\overline{\mathrm{EA}}$ | $\overline{\mathrm{EB}}$ | Operation |
| :---: | :---: | :---: | :---: | :--- |
| 0 | 0 | X | X | $\pm(\mathrm{R} \times \mathrm{S})+0$ |
| 0 | 1 | $X$ | $X$ | $\pm(\mathrm{R} \times \mathrm{S})+\mathrm{ACC}$ |
| 1 | 0 | $X$ | $X$ | $\pm(\mathrm{R} \times \mathrm{S})-\mathrm{ACC}$ |
| 1 | 1 | 0 | 0 | $\pm 1 \times 1+0$ |
| 1 | 1 | 0 | 1 | $\pm 1 \times \mathrm{DB}+0$ |
| 1 | 1 | 1 | 0 | $\pm \mathrm{DA} \times 1+0$ |
| 1 | 1 | 1 | 1 | $\pm \mathrm{DA} \times \mathrm{DB}+0$ |

ACC is the data stored in the accumulator
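Table 2 can be read as a small decode function. The sketch below simply restates the table in code; the function name `mac_operation` and its argument names are hypothetical, and the ± sign is left symbolic because it depends on the COMPL input described in Table 1.

```python
def mac_operation(acc1, acc0, ea_n=0, eb_n=0):
    """Return the Table 2 operation string for the given control inputs.

    ea_n and eb_n are the levels on the EA and EB pins; they only matter
    when ACC1 = ACC0 = 1, where a low level replaces the operand by 1.
    """
    if (acc1, acc0) == (0, 0):
        return "±(R × S) + 0"
    if (acc1, acc0) == (0, 1):
        return "±(R × S) + ACC"
    if (acc1, acc0) == (1, 0):
        return "±(R × S) - ACC"
    r = "DA" if ea_n else "1"     # EA high selects DA, low substitutes 1
    s = "DB" if eb_n else "1"     # EB high selects DB, low substitutes 1
    return f"±{r} × {s} + 0"


print(mac_operation(1, 1, ea_n=0, eb_n=1))
```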

TABLE 3. SHIFTER CONTROL INPUTS

| SFT1 | SFT0 | Shifter Operation |
| :---: | :---: | :--- |
| L | L | Pass data without shift |
| L | H | Shift one bit left; fill with zero |
| H | L | Swap upper and lower halves of temporary register |
| H | H | Shift 32 bits right; fill with sign bit |

TABLE 4. FLOWTHROUGH CONTROL INPUTS

| Control Inputs |  | Registers Bypassed |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| FT1 | FT0 | Pipeline | Y | I | A | B |
| L | L | Yes | Yes | Yes | Yes | Yes |
| L | H | Yes | No | No | No | No |
| H | L | Yes | Yes | No | No | No |
| H | H | No | No | No | No | No |

TABLE 5. TEST PIN CONTROL INPUTS

| TP1 | TP0 | Operation |
| :---: | :---: | :--- |
| L | L | All outputs and I/Os forced low |
| L | H | All outputs and I/Os forced high |
| H | L | All outputs placed in a high impedance state |
| H | H | Normal operation (default state) |

## architectural elements

Included in the functional block diagram of the 'ACT8836 are the following blocks.

1. Two 32-bit registered input data ports DA and DB
2. A parity checker at the DA and DB inputs
3. An instruction decoder (I register)
4. A flowthrough decoder that permits selected registers to be bypassed to support up to three levels of pipelining
5. R and S multiplexers to select operands for the multiplier/adder from DA and DB inputs, registers A and B, or temporary register
6. A D multiplexer that selects the operand for the shifter from the 67-bit sign-extended DA and DB inputs or the multiplier/adder output
7. A shifter block that operates on DA/DB input data or on multiplier/adder outputs for scaling or Newton-Raphson division
8. A Y output multiplexer that selects the most significant half or the least significant half of the multiplier/adder result for output at the registered Y port
9. An extended precision error check that tests for overflow
10. A master/slave comparator and parity generator/comparator at the Y output port for master/slave and parity checking
11. Registers at the external data and instruction input ports and the shifter and multiplier/adder output port to support pipelining

## input data parity checker

An even-parity check is performed on each byte of input data at the DA, DB and $Y$ ports. If the parity test fails for any byte, a high appears at the parity error output pin (PERRA for DA data, PERRB for DB data, PERRY for $Y$ data).
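The per-byte check described above can be sketched in a few lines of Python. This is only an illustration: the function name and the assumption that parity bit k covers data bits 8k+7 through 8k are not taken from this manual.

```python
def parity_error(data: int, parity: int) -> int:
    """Return 1 if any byte of a 32-bit word fails an even-parity check.

    Assumption (hypothetical mapping): parity bit k covers data bits
    8k+7..8k. Even parity means each byte plus its parity bit holds an
    even number of ones.
    """
    for k in range(4):
        byte = (data >> (8 * k)) & 0xFF
        pbit = (parity >> k) & 1
        if (bin(byte).count("1") + pbit) % 2 != 0:
            return 1  # models a high on PERRA, PERRB, or PERRY
    return 0
```

A word such as `0x00000001` carries one set bit in its lowest byte, so it passes only when the parity bit for that byte is also set.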

## $A$ and $B$ registers

Register A can be loaded with data from the DA bus, which normally holds a 32-bit multiplicand. Register $B$ is loaded from the DB bus which holds a 32-bit multiplier. Separate clock enables, $\overline{C K E A}$ and $\overline{C K E B}$, allow the registers to be loaded separately. This is useful when performing double precision multiplication or using the temporary register as an input to the multiplier/adder. The registers can be made transparent using the FT inputs (see Table 4).

## instruction register

Instruction inputs to the device are shown in Table 1. These signals control signed, unsigned, and mixed multiplication modes, fractional and integer rounding, accumulator operations and complementing of products. They can be latched into instruction register I when clock enable $\overline{\mathrm{CKEI}}$ is low.
Sign control inputs DASGN and DBSGN identify DA and DB input data as signed (high) or unsigned (low).
Rounding inputs RND0 and RND1 control rounding operations in the multiplier/adder. A low on these inputs passes the results unaltered. If a high appears on RND1, the result will be rounded by adding a one to bit 30. RND1 should be set high if the multiplier/adder result is to be shifted in order to maintain precision of the least significant bit following the shift operation. If a high appears on RND0, the result will be rounded by adding a one to bit 31. This code should be used when the adder result will not be shifted.

A complement control, COMPL, is used to complement the product from the multiplier before passing it to the accumulator. The complement will occur if COMPL is high; the product will be passed unaltered if COMPL is low.
ACC1-ACC0 control the operation of the multiplier/adder. Possible operations are shown in Table 2.


INPUT REGISTERS AND PARITY CHECK

## R, $S$, and swap multiplexers

The $R$ and S multiplexers select the multiplier/adder operands from external data or from the temporary register.
When $\overline{\mathrm{EA}}$ is low, the R multiplexer selects data from the swap multiplexer. When $\overline{\mathrm{EA}}$ is high, the R multiplexer selects data from DA or the A register, depending on the state of the flowthrough control inputs (see Table 4). When $\overline{E B}$ is low, the $S$ multiplexer selects data from the swap multiplexer. When $\overline{E B}$ is high, the $S$ multiplexer switches data from DB or the B register, depending on the state of the flowthrough control inputs.
$\overline{E A}$ and $\overline{E B}$ are also used in conjunction with the multiplier/adder control inputs to force a numeric one on the R or S inputs (see Table 2).

The swap multiplexers are controlled by the shifter control inputs. When SFT1 is high and SFT0 is low, the most significant half of the temporary register is available to the S multiplexer, and the least significant half is available to the R multiplexer. When SFT1-SFT0 are set to other values, the most significant half of the temporary register is available to the R multiplexer, and the least significant half is available to the S multiplexer.
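The routing of the temporary-register halves can be sketched as follows. The function name and the tuple return order are illustrative choices, and the temporary register is modeled as a plain 64-bit integer.

```python
MASK32 = 0xFFFFFFFF

def swap_mux(temp: int, sft1: int, sft0: int):
    """Return (to_R, to_S): the temporary-register halves offered to the
    R and S multiplexers for a given shifter control setting."""
    ms_half = (temp >> 32) & MASK32
    ls_half = temp & MASK32
    if (sft1, sft0) == (1, 0):
        # Swap: least significant half to R, most significant half to S.
        return ls_half, ms_half
    # All other settings: most significant half to R, least significant to S.
    return ms_half, ls_half
```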

## multiplier/adder

The multiplier performs 32-bit multiplication and generates a 67-bit product. The product can be latched in the pipeline to increase cycle speed. The product is complemented when COMPL is set high as shown in Table 1. The adder computes the sum or the difference of the accumulator and the product and gives a 67-bit sum. Bits 66-64 are used for overflow and sign extension.

## D multiplexer

The D multiplexer selects input data for the shifter. Two sources are available to the multiplexer: a 64-bit word formed by concatenating DA and DB bus data, and the 67-bit sum from the multiplier/adder. If SELD is high, external DA/DB data is selected; if SELD is low, the sum is selected.

If the 64-bit word is selected for input to the shifter, three bits are added to the word based on the state of the sign extend signal (SGNEXT). If SGNEXT is low, bits 66-64 are zero-filled; if SGNEXT is high, bits 66-64 are filled with the value on DA31.
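A minimal Python sketch of the D multiplexer selection and sign extension follows. It assumes that DA forms the upper 32 bits of the concatenated word (suggested by the use of DA31 as the sign source, but not stated explicitly here); the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK67 = (1 << 67) - 1

def d_mux(da: int, db: int, adder_sum: int, seld: int, sgnext: int) -> int:
    """Return the 67-bit shifter input selected by SELD and SGNEXT."""
    if not seld:
        return adder_sum & MASK67            # SELD low: multiplier/adder sum
    # SELD high: external data (assumed order: DA upper half, DB lower half)
    word = ((da & MASK32) << 32) | (db & MASK32)
    if sgnext and (da >> 31) & 1:
        word |= 0b111 << 64                  # bits 66-64 filled with DA31
    return word                              # otherwise bits 66-64 stay zero
```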

## temporary register and accumulator (Figure 1)

Output from the shifter will be stored in the temporary register if SELREG is high and in the accumulator register if SELREG is low. The 64-bit temporary register can be used to store temporary data, constants and scaled binary fractions.
Separate clock controls, $\overline{\text { WELS }}$ and $\overline{\text { WEMS }}$, allow the most significant and least significant halves of the shifter output to be loaded separately. The 32 least significant bits of the selected register are loaded when $\overline{W E L S}$ is low; the most significant bits when $\overline{W E M S}$ is low. When $\overline{W E L S}$ and $\overline{W E M S}$ are both low, the entire word from the shifter is loaded into the selected register.
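The half-word write enables can be modeled as below. This sketch treats the selected register as a non-negative Python integer and treats everything above bit 31 as the "most significant" part; how the hardware handles accumulator bits 66-64 is not modeled, and the function name is hypothetical.

```python
LOW32 = (1 << 32) - 1

def load_selected(reg: int, shifter_out: int, wels_n: int, wems_n: int) -> int:
    """Return the new register contents after one clock edge.

    wels_n and wems_n are the active-low write enables (0 = load).
    """
    if wels_n == 0:
        # Replace the 32 least significant bits, keep the upper part.
        reg = (reg & ~LOW32) | (shifter_out & LOW32)
    if wems_n == 0:
        # Replace the upper part, keep the 32 least significant bits.
        reg = (reg & LOW32) | (shifter_out & ~LOW32)
    return reg
```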


FIGURE 1. TEMPORARY REGISTER AND ACCUMULATOR

## shifter

The shifter can be used to multiply by two for Newton-Raphson operations or perform a 32-bit shift for double precision multiplication. The shifter is controlled by two SFT inputs, as shown in Table 3.
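The shift operations of Table 3 can be sketched on a 67-bit value as follows. This follows the table wording only. It assumes the sign bit is bit 66 and that the swap setting (SFT1 high, SFT0 low) leaves the shifter data unchanged, since the swap acts on the temporary-register halves; both points are assumptions, and the function name is hypothetical.

```python
WIDTH = 67
MASK = (1 << WIDTH) - 1

def shifter(value: int, sft1: int, sft0: int) -> int:
    """Model the shifter on a 67-bit word held as a non-negative integer."""
    value &= MASK
    if (sft1, sft0) == (0, 1):
        return (value << 1) & MASK               # one bit left, zero fill
    if (sft1, sft0) == (1, 1):
        sign = value >> (WIDTH - 1)              # assumed sign position: bit 66
        shifted = value >> 32                    # 32 bits right
        if sign:
            # Fill the 32 vacated upper bits (66..35) with the sign bit.
            shifted |= MASK & ~((1 << (WIDTH - 32)) - 1)
        return shifted
    return value                                 # pass (and swap setting)
```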

## Y register

Final or intermediate multiplier/adder results will be clocked into $Y$ register when $\overline{\text { CKEY }}$ is low.
Results can be passed directly to the $Y$ output multiplexer using flowthrough decoder signals to bypass the register (see Table 4).

## $\mathbf{Y}$ multiplexer and $\mathbf{Y}$ output multiplexer

The Y multiplexer allows the 64-bit result or the contents of the Y register to be switched to the Y bus, depending upon the state of the flowthrough control inputs. The upper 32 bits are selected for output when the Y output multiplexer control SELY is high; the lower 32 bits are selected for output when SELY is low. Note that the $Y$ output multiplexer can be switched at twice the clock rate so that the 64-bit result can be output in one clock cycle.

## flowthrough decoder

To enable the device to operate in pipelined or flowthrough modes, on-chip registers can be bypassed using flowthrough control signals FT1 and FT0. Up to three levels of pipeline can be supported, as shown in Table 4.
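As an illustration, the flowthrough decoding of Table 4 can be written as a lookup. The encoding (0 for L, 1 for H) and the names below are choices made for this sketch only.

```python
ALL_REGISTERS = {"pipeline", "Y", "I", "A", "B"}

# Registers bypassed for each (FT1, FT0) setting, as listed in Table 4.
BYPASSED = {
    (0, 0): {"pipeline", "Y", "I", "A", "B"},
    (0, 1): {"pipeline"},
    (1, 0): {"pipeline", "Y"},
    (1, 1): set(),
}

def clocked_registers(ft1: int, ft0: int) -> set:
    """Return the registers that remain clocked (not bypassed)."""
    return ALL_REGISTERS - BYPASSED[(ft1, ft0)]
```

With FT1-FT0 both high, every register stays in the data path, which corresponds to the fully pipelined configuration.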





FIGURE 3. OUTPUT ERROR CONTROL

## extended precision check

Three extended product outputs, YETP2-YETP0, are provided to recover three bits of precision during overflow. An extended precision check error signal (ETPERR) goes high whenever overflow occurs. If sign controls DASGN and DBSGN are both low, indicating an unsigned operation, the extended precision bits 66-64 are compared for equality. Under all other sign control conditions, bits 66-63 are compared for equality.
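The comparison described above can be sketched as follows. The sketch reads the text literally and assumes that ETPERR is raised when the compared bits are not all equal; the function name is hypothetical.

```python
def etperr(result: int, dasgn: int, dbsgn: int) -> int:
    """Return 1 when the compared upper bits of a 67-bit result differ."""
    # Unsigned operation: compare bits 66-64; otherwise compare bits 66-63.
    low_bit = 64 if (dasgn == 0 and dbsgn == 0) else 63
    width = 67 - low_bit
    field = (result >> low_bit) & ((1 << width) - 1)
    all_ones = (1 << width) - 1
    # Equal bits means the field is all zeros or all ones.
    return 0 if field in (0, all_ones) else 1
```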

## master slave comparator

A master/slave comparator is provided to compare data bytes from the Y output multiplexer with data bytes on the external Y port when $\overline{\mathrm{OEY}}$ is high. A comparison of the three extended precision bits of the multiplier/adder result or Y register output with external data in the YETP2-YETP0 port is performed simultaneously. If the data is not equal, a high signal is generated on the master slave error output pin (MSERR). A similar comparison is performed for parity using the PY3-PY0 inputs. This feature is useful in fault-tolerant design where several devices vote to ensure hardware integrity.

## test pins

Two pins, TP1-TP0, support system testing. These may be used, for example, to place all outputs in a high-impedance state, isolating the chip from the rest of the system (see Table 5).

## data formats

The 'ACT8836 performs single-precision and double-precision multiplication in two's complement, unsigned magnitude, and mixed formats for both integer and fractional numbers.

Input formats for the multiplicand (R) and multiplier (S) are given below, followed by output formats for the fully extended product. The fully extended product (PRDT) is 67 bits wide. It includes the extended product (XTP) bits YETP2-YETP0, the most significant product (MSP) bits Y63-Y32, and the least significant product (LSP) bits Y31-Y0.

This can be represented in notational form as follows:

```
    PRDT = XTP :: MSP :: LSP
or
    PRDT = YETP2-YETP0 :: Y63-Y0
```
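The concatenation above can be illustrated by splitting a 67-bit product, held as a non-negative integer, into its three fields. The function name is hypothetical.

```python
def split_prdt(prdt: int):
    """Split a 67-bit fully extended product into (XTP, MSP, LSP)."""
    xtp = (prdt >> 64) & 0x7          # YETP2-YETP0
    msp = (prdt >> 32) & 0xFFFFFFFF   # Y63-Y32
    lsp = prdt & 0xFFFFFFFF           # Y31-Y0
    return xtp, msp, lsp
```

For example, a two's complement product of -1 occupies all 67 bits, so every field reads back as all ones.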

Table 6 shows the output formats generated by two's complement, unsigned and mixed-mode multiplications.

TABLE 6. GENERATED OUTPUT FORMATS

|  | Two's Complement | Unsigned Magnitude |
| :--- | :--- | :--- |
| Two's Complement | Two's Complement | Two's Complement |
| Unsigned Magnitude | Two's Complement | Unsigned Magnitude |

## examples

Representative examples of single-precision multiplication, double-precision multiplication, and division using Newton-Raphson binary division algorithm are given below.

## single-precision multiplication

Microcode for the multiplication of two signed numbers is shown in Figure 4. In this example, the result is rounded and the 32 most significant bits are output on the $Y$ bus. A second instruction (SELY = 0) would be required to output the least significant half if rounding were not used.
Unsigned and mixed mode single-precision multiplication are executed using the same code. (The sign controls must be modified accordingly.) Following are the input and output formats for signed, unsigned, and mixed mode operations.

## Two's Complement Integer Inputs

Input Operand A

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $-2^{31}$ <br> (Sign) | $2^{30}$ | $2^{29}$ | $\ldots$ | $2^{2}$ | $2^{1}$ | $2^{0}$ |

Input Operand B

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $-2^{31}$ <br> (Sign) | $2^{30}$ | $2^{29}$ | $\ldots$ | $2^{2}$ | $2^{1}$ | $2^{0}$ |

## Unsigned Integer Inputs

Input Operand A

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $2^{31}$ | $2^{30}$ | $2^{29}$ | $\ldots$ | $2^{2}$ | $2^{1}$ | $2^{0}$ |

Input Operand B

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $2^{31}$ | $2^{30}$ | $2^{29}$ | $\ldots$ | $2^{2}$ | $2^{1}$ | $2^{0}$ |

## Two's Complement Fractional Inputs

Input Operand A

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $-2^{0}$ <br> (Sign) | $2^{-1}$ | $2^{-2}$ | $\ldots$ | $2^{-29}$ | $2^{-30}$ | $2^{-31}$ |

Input Operand B





## Two's Complement Integer Outputs

## Unsigned Integer Outputs

Extended Product (YETP2-YETP0)

| 66 | 65 | 64 |
| :---: | :---: | :---: |
| $2^{66}$ | $2^{65}$ | $2^{64}$ |

Most Significant Product (Y63-Y32)

| 63 | 62 | 61 | $\ldots$ | 34 | 33 | 32 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $2^{63}$ | $2^{62}$ | $2^{61}$ | $\ldots$ | $2^{34}$ | $2^{33}$ | $2^{32}$ |

Least Significant Product (Y31-Y0)

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $2^{31}$ | $2^{30}$ | $2^{29}$ | $\ldots$ | $2^{2}$ | $2^{1}$ | $2^{0}$ |

## Two's Complement Fractional Outputs

Extended Product (YETP2-YETP0)

## Unsigned Fractional Outputs

Extended Product (YETP2-YETP0)

Most Significant Product (Y63-Y32)

| 63 | 62 | 61 | $\ldots$ | 34 | 33 | 32 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $2^{-1}$ | $2^{-2}$ | $2^{-3}$ | $\ldots$ | $2^{-30}$ | $2^{-31}$ | $2^{-32}$ |

Least Significant Product (Y31-Y0)

| 31 | 30 | 29 | $\ldots$ | 2 | 1 | 0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| $2^{-33}$ | $2^{-34}$ | $2^{-35}$ | $\ldots$ | $2^{-62}$ | $2^{-63}$ | $2^{-64}$ |

## double-precision multiplication

To simplify discussion of double-precision multiplication, the following example implements an algorithm using one 'ACT8836 device. It should be noted that even higher speeds can be achieved through the use of two 'ACT8836s to implement a parallel multiplier.

The example is based on the following algorithm, where $A$ and $B$ are 64-bit signed numbers.
Let
$A_{m}=a_{s}, a_{62}, a_{61}, \ldots, a_{32}$
and
$A_{l}=a_{31}, a_{30}, a_{29}, \ldots, a_{0}$ ($a_{0}$ = LSB)
Therefore:

$$
A=\left(A_{m} \times 2^{32}\right)+A_{l}
$$

Likewise:

$$
B=\left(B_{m} \times 2^{32}\right)+B_{l}
$$

Thus:

$$
\begin{aligned}
A \times B & =\left[\left(A_{m} \times 2^{32}\right)+A_{l}\right] \times\left[\left(B_{m} \times 2^{32}\right)+B_{l}\right] \\
& =\left(A_{m} \times B_{m}\right) 2^{64}+\left(A_{m} \times B_{l}+A_{l} \times B_{m}\right) 2^{32}+A_{l} \times B_{l}
\end{aligned}
$$

Therefore, four products and three summations with rank adjustments are required.
Basic implementation of this algorithm uses a single 'ACT8836. The result is a two's complement 128-bit product. Microcode signals to implement the algorithm are shown in Figure 4.
The first instruction cycle computes the first product, $A_{l} \times B_{l}$. The least significant half of the result is output through the Y port for storage in an external RAM or some other 32-bit register; this will be the least significant 32-bit portion of the final result.

The instruction also uses the shifter to shift the $A_{l} \times B_{l}$ product 32 bits to the right in order to adjust for ranking in the next multiplication-addition sequence. The least significant half of the shift result is stored in the lower 32-bit portion of the accumulator; the upper 32 bits contain the zero fill.

The second instruction produces the second product, $A_{l} \times B_{m}$, adds it to the contents of the accumulator, and stores the result in the accumulator for use in the third instruction.

Instruction 3 computes $A_{m} \times B_{l}$, adds the result to the accumulator, and outputs the least significant 32 bits of the addition for use as bits 63-32 of the final product.

This instruction also shifts the result 32 bits to the right to provide the necessary rank adjustment and stores the shift result (the most significant half of the addition result) in the lower 32 bits of the accumulator. Bits ACC63-ACC32 are filled with zeros; the sign is extended into the three upper bits (ACC66-ACC64).

Instruction 4 computes the fourth product ($A_{m} \times B_{m}$), adds it to the accumulator, and outputs the least significant half at the Y port for use as bits 95-64 of the final product.
This example assumes that the chip is operating in feed-through mode. A fifth instruction is therefore required to perform the fourth iteration again so that bits 127-96 of the final product can be output.
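The arithmetic of the four-product sequence can be checked with a short Python sketch. It mirrors the order of the instructions above (product, accumulate, output the low word, shift right 32), but it uses unbounded Python integers in place of the 67-bit accumulator and does not model control signals or timing. The function names are hypothetical. The upper halves are treated as signed and the lower halves as unsigned, which corresponds to mixed-mode multiplication for the cross products.

```python
MASK32 = 0xFFFFFFFF

def split64(x: int):
    """Split a signed 64-bit value into (signed upper half, unsigned lower half)."""
    return x >> 32, x & MASK32   # arithmetic shift preserves the sign

def mul64(a: int, b: int):
    """Return the 128-bit product as four 32-bit words, least significant first."""
    am, al = split64(a)
    bm, bl = split64(b)
    acc = al * bl                      # instruction 1: A_l x B_l
    words = [acc & MASK32]             # bits 31-0 of the final product
    acc >>= 32                         # rank adjustment
    acc += al * bm                     # instruction 2: A_l x B_m
    acc += am * bl                     # instruction 3: A_m x B_l
    words.append(acc & MASK32)         # bits 63-32
    acc >>= 32                         # rank adjustment
    acc += am * bm                     # instruction 4: A_m x B_m
    words.append(acc & MASK32)         # bits 95-64
    words.append((acc >> 32) & MASK32) # instruction 5: bits 127-96
    return words

def to_int(words) -> int:
    """Reassemble the words into a single 128-bit unsigned integer."""
    return sum(w << (32 * i) for i, w in enumerate(words))
```

Comparing `to_int(mul64(a, b))` with `(a * b) & (2**128 - 1)` for a few signed inputs is a quick way to confirm the rank adjustments.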

Example 1. Single-Precision Multiply, 32-Bit Result

Example 2. Double-Precision Multiply, 64-Bit Result

Example 3. Newton-Raphson Division

\* $N=\frac{32}{2 m+1}$, where $m$ = number of bits in the seed (assuming 32 bits of precision)

FIGURE 4. MICROCODED EXAMPLES

## Newton-Raphson binary division algorithm

The following explanation illustrates how to implement the Newton-Raphson binary division algorithm using the 'ACT8836 multiplier/accumulator. The Newton-Raphson algorithm is an iterative procedure that generates the reciprocal of the divisor through a convergence method.
Consider the equation $Q=A / B$. This equation can be rewritten as $Q=A \times(1 / B)$. Therefore, the quotient $Q$ can be computed by simply multiplying the dividend $A$ by the reciprocal of the divisor ($B$). Finding the divisor reciprocal $1 / B$ is the objective of the Newton-Raphson algorithm.

To calculate $1 / B$, the Newton-Raphson equation $X_{i+1}=X_{i}\left(2-B X_{i}\right)$ is calculated in an iterative process. In the equation, $B$ represents the divisor and $X$ represents successively closer approximations to the reciprocal $1 / B$. The following sequence of computation illustrates the iterative nature of the Newton-Raphson algorithm.

$$
\begin{array}{ll}
\text { Step } 1 & X_{1}=X_{0}\left(2-B X_{0}\right) \\
\text { Step } 2 & X_{2}=X_{1}\left(2-B X_{1}\right) \\
\text { Step } 3 & X_{3}=X_{2}\left(2-B X_{2}\right) \\
\text { Step } n & X_{n}=X_{n-1}\left(2-B X_{n-1}\right)
\end{array}
$$

The successive approximation $X_{i}$ approaches the reciprocal $1 / B$ as the number of iterations increases; that is

$$
\lim _{i \rightarrow \infty} X_{i}=1 / B
$$

The iterative operation is executed until the desired tolerance or error is reached. The required accuracy for $1 / B$ can be determined by subtracting each $X_{i}$ from its corresponding $X_{i+1}$. If the difference $\left|X_{i+1}-X_{i}\right|$ is less than or equal to a predetermined round-off error, then the process is terminated. The desired tolerance can also be achieved by executing a fixed number of iterations based on the accuracy of the initial guess of $1 / B$ stored in RAM or PROM.
The initial guess, $X_{0}$, is called the seed approximation. The seed must be supplied to the Newton-Raphson process externally and must fall within the range of $0<X_{0}<2 / B$ if $B$ is greater than 0 or $2 / B<X_{0}<0$ if $B$ is less than 0.

To perform the Newton-Raphson binary division algorithm using the 'ACT8836, the divisor, $B$, must be a positive fraction. As a positive fraction, $B$ is limited within the range of $1 / 2 \leq B<1$.
Since $X_{i}$ from Newton-Raphson must lie in the range $0<X_{i}<2 / B$ and since the range of the positive fraction $B$ is $1 / 2 \leq B<1$, the limits of $X_{i}$ become $1 \leq X_{i}<2$.
The range of $-B X_{i}$ will therefore be $-2 \leq-B X_{i} \leq-1 / 2$.
The limits of $-B X_{i}$ are shown in Table 7 as they would appear in the 'ACT8836 extended bit, binary fraction format.

TABLE 7. LIMITS OF $-B X_{i}$ IN 'ACT8836 EXTENDED BIT FORMAT

|  | Extended Bits |  |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | 66 | 65 | 64 | 63 | 62 | 61 | $\ldots$ | 2 | 1 | 0 |
| $-2$ | 1 | 1 | 1 | 0 | 0 | 0 | $\ldots$ | 0 | 0 | 0 |
| $-1 / 2$ | 1 | 1 | 1 | 1 | 1 | 0 | $\ldots$ | 0 | 0 | 0 |

The diagram indicates that $-B X_{i}$ is always of the form:

$$
111\, d_{0} . d_{1} d_{2} \ldots d_{n-2} d_{n-1}
$$

The next step in Newton-Raphson is to complete the $2-B X_{i}$ equation. The fractional representation of 2 is:

$$
0010.00 \ldots 00
$$

Completion of the $2-B X_{i}$ equation is shown in Table 8.

TABLE 8. COMPLETION OF $2-B X_{i}$ EQUATION

|  | Extended Bits |  |  |  |  |  |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | 66 | 65 | 64 | 63 | 62 | 61 | $\ldots$ | 1 | 0 |
|  | 1 | 1 | 1 | $d_{0}$ | $d_{1}$ | $d_{2}$ | $\ldots$ | $d_{n-2}$ | $d_{n-1}$ |
| + | 0 | 0 | 1 | 0 | 0 | 0 | $\ldots$ | 0 | 0 |
| = | 0 | 0 | 0 | $d_{0}$ | $d_{1}$ | $d_{2}$ | $\ldots$ | $d_{n-2}$ | $d_{n-1}$ |

Since this step only affects the extended bits (66-64) on the 'ACT8836, this step can be skipped. The following algorithm can therefore be used to perform Newton-Raphson binary division with the 'ACT8836.

Assuming $B$ is on the DB bus (or stored in the B register) and $X_{i}$ is stored in the temporary register:

Step 1: Accumulator $\leftarrow-$ (DB $\times$ temporary register) $=2-B X_{i}$

Step 2: Temporary register $\leftarrow$ left shift one bit of (accumulator $\times$ temporary register) $=X_{i}\left(2-B X_{i}\right)=X_{i+1}$

Step 3: Repeat Steps 1 and 2 until $\left|X_{i+1}-X_{i}\right| \leq$ a predetermined round-off error

Two cycles are required for each iteration. The left shift that is performed in Step 2 is required to realign $X_{i}$ after the signed fraction multiply. Microcode for this example is shown in Figure 4.
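The convergence of the iteration can be illustrated numerically. The sketch below uses Python floating point rather than the device's fixed-point fraction format, so the one-bit realignment shift of Step 2 and the extended-bit handling are not modeled; the function name, tolerance, and iteration limit are assumptions chosen for illustration.

```python
def reciprocal(b: float, x0: float, tol: float = 2.0 ** -31, max_iter: int = 32) -> float:
    """Approximate 1/b with the Newton-Raphson iteration X <- X(2 - bX)."""
    if not 0.5 <= b < 1.0:
        raise ValueError("divisor must be a positive fraction, 1/2 <= b < 1")
    x = x0
    for _ in range(max_iter):
        acc = 2.0 - b * x          # Step 1: accumulator holds 2 - B*Xi
        x_next = x * acc           # Step 2: Xi+1 = Xi(2 - B*Xi)
        if abs(x_next - x) <= tol: # Step 3: stop at the round-off tolerance
            return x_next
        x = x_next
    return x

# Quotient Q = A x (1/B), using a seed of 1.0 as a hypothetical starting value.
quotient = 0.3 * reciprocal(0.75, 1.0)
```

Because the error roughly squares on each pass, a seed accurate to a few bits reaches 32-bit precision in a handful of iterations, which is why a fixed iteration count based on seed accuracy is a practical alternative to the tolerance test.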

absolute maximum ratings over operating free-air temperature range (unless otherwise noted)$^{\dagger}$

| Parameter | Rating |
| :--- | :---: |
| Supply voltage, $V_{CC}$ | $-0.5$ V to 6 V |
| Input clamp current, $I_{IK}$ ($V_{I}<0$ or $V_{I}>V_{CC}$) | $\pm 20$ mA |
| Output clamp current, $I_{OK}$ ($V_{O}<0$ or $V_{O}>V_{CC}$) | $\pm 50$ mA |
| Continuous output current, $I_{O}$ ($V_{O}=0$ to $V_{CC}$) | $\pm 50$ mA |
| Continuous current through $V_{CC}$ or GND pins | $\pm 100$ mA |
| Operating free-air temperature range | $0^{\circ}$C to $70^{\circ}$C |
| Storage temperature range | $-65^{\circ}$C to $150^{\circ}$C |

$\dagger$ Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

recommended operating conditions

|  |  | MIN | NOM | MAX | UNIT |
| :--- | :--- | :---: | :---: | :---: | :---: |
| $V_{CC}$ | Supply voltage | 4.5 | 5 | 5.5 | V |
| $V_{IH}$ | High-level input voltage | 2 |  | $V_{CC}$ | V |
| $V_{IL}$ | Low-level input voltage |  |  | 0.8 | V |
| $I_{OH}$ | High-level output current |  |  | $-8$ | mA |
| $I_{OL}$ | Low-level output current |  |  |  | mA |
| $V_{I}$ | Input voltage | 0 |  |  | V |
| $V_{O}$ | Output voltage |  |  |  |  |
| dt/dv | Input transition rise or fall rate |  |  |  |  |
| $T_{A}$ | Operating free-air temperature | 0 |  | 70 | $^{\circ}$C |

electrical characteristics over recommended operating free-air temperature range (unless otherwise noted)

| PARAMETER | TEST CONDITIONS | $V_{CC}$ | $T_{A}=25^{\circ}$C |  |  | $T_{A}=0^{\circ}$C to $70^{\circ}$C |  | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | MIN | TYP | MAX | MIN | MAX |  |
| $V_{OH}$ | $I_{OH}=-20\ \mu$A | 4.5 V | 4.4 |  |  | 4.4 |  | V |
|  |  | 5.5 V | 5.4 |  |  | 5.4 |  |  |
|  | $I_{OH}=-8$ mA | 4.5 V | 3.8 |  |  | 3.7 |  |  |
|  |  | 5.5 V | 4.8 |  |  | 4.7 |  |  |
| $V_{OL}$ | $I_{OL}=20\ \mu$A | 4.5 V |  |  | 0.1 |  | 0.1 | V |
|  |  | 5.5 V |  |  | 0.1 |  | 0.1 |  |
|  | $I_{OL}=8$ mA | 4.5 V |  |  | 0.32 |  | 0.4 |  |
|  |  | 5.5 V |  |  | 0.32 |  | 0.4 |  |
| $I_{I}$ | $V_{I}=V_{CC}$ or 0 | 5.5 V |  |  | 0.1 |  | $\pm 1.0$ | $\mu$A |
| $I_{CC}$ | $V_{I}=V_{CC}$ or 0, $I_{O}=0$ | 5.5 V |  |  | 50 |  | 100 | $\mu$A |
| $C_{i}$ | $V_{I}=V_{CC}$ or 0 | 5 V |  | 5 | 10 |  | 10 | pF |
| $\Delta I_{CC}{}^{\ddagger}$ | One input at 3.4 V, <br> other inputs at 0 or $V_{CC}$ | 5.5 V |  |  | 1 |  | 1 | mA |
| $I_{OZH}$ | $V_{I}=V_{CC}$ or 0 | 5 V |  |  | 0.5 |  | 5 | $\mu$A |
| $I_{OZL}$ | $V_{I}=V_{CC}$ or 0 | 5 V | $-0.5$ |  |  | $-5$ |  | $\mu$A |

[^15]

## setup and hold times

|  | PARAMETER | MIN | MAX | UNIT |
| :---: | :--- | :---: | :---: | :---: |
| $t_{su1}$ | Instruction before CLK$\uparrow$ | 14 |  | ns |
| $t_{su2}$ | Data before CLK$\uparrow$ | 12 |  | ns |
| $t_{su3}$ | $\overline{\text{CKEA}}$ before CLK$\uparrow$ | 14 |  | ns |
| $t_{su4}$ | $\overline{\text{CKEB}}$ before CLK$\uparrow$ | 14 |  | ns |
| $t_{su5}$ | $\overline{\text{CKEI}}$ before CLK$\uparrow$ | 10 |  | ns |
| $t_{su6}$ | $\overline{\text{CKEY}}$ before CLK$\uparrow$ | 19 |  | ns |
| $t_{su7}$ | SELREG before CLK$\uparrow$ | 12 |  | ns |
| $t_{su8}$ | $\overline{\text{WEMS}}$ before CLK$\uparrow$ | 11 |  | ns |
| $t_{su9}$ | $\overline{\text{WELS}}$ before CLK$\uparrow$ | 11 |  | ns |
| $t_{h1}$ | Instruction after CLK$\uparrow$ | 0 |  | ns |
| $t_{h2}$ | Data after CLK$\uparrow$ | 0 |  | ns |
| $t_{h3}$ | $\overline{\text{CKEA}}$ after CLK$\uparrow$ | 0 |  | ns |
| $t_{h4}$ | $\overline{\text{CKEB}}$ after CLK$\uparrow$ | 0 |  | ns |
| $t_{h5}$ | $\overline{\text{CKEI}}$ after CLK$\uparrow$ | 0 |  | ns |
| $t_{h6}$ | $\overline{\text{CKEY}}$ after CLK$\uparrow$ | 0 |  | ns |
| $t_{h7}$ | SELREG after CLK$\uparrow$ | 0 |  | ns |
| $t_{h8}$ | $\overline{\text{WEMS}}$ after CLK$\uparrow$ | 0 |  | ns |
| $t_{h9}$ | $\overline{\text{WELS}}$ after CLK$\uparrow$ | 0 |  | ns |

switching characteristics over recommended ranges of supply voltage and free-air temperature (see Figure 2 for load circuit and voltage waveforms)

| PARAMETER | FROM (INPUT) | TO (OUTPUT) | FT MODE (FT1-FT0) | MIN TYP MAX | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: |
| $t_{pd1}{}^{\dagger}$ | CLK | PIPE | 11 | 36 | ns |
| $t_{pd2}{}^{\dagger}$ | PIPE | Y REG | 11 | 36 | ns |
| $t_{pd3}{}^{\dagger}$ | PIPE | ACCUM | 11 | 36 | ns |
| $t_{pd4}{}^{\dagger}$ | Y REG | Y | All modes | 18 | ns |
| $t_{pd5}$ | SELY | Y | All modes | 18 | ns |
| $t_{pd6}{}^{\dagger}$ | CLK | Y REG | 01 | 54 | ns |
| $t_{pd7}{}^{\dagger}$ | CLK | ACCUM | 10 or 01 | 67 | ns |
| $t_{pd8}$ | CLK | Y | 10 | 67 | ns |
| $t_{pd9}$ | DATA | Y | 00 | 60 | ns |
| $t_{pd10}{}^{\dagger}$ | DATA | ACCUM | 00 | 56 | ns |
| $t_{pd11}$ | CLK | YETP | 11 or 10 | 18 | ns |
| $t_{pd12}$ | CLK | ETPERR | 11 or 10 | 18 | ns |
| $t_{pd13}$ | CLK | YETP | 00 | 67 | ns |
| $t_{pd14}$ | CLK | ETPERR | 01 | 67 | ns |
| $t_{pd15}$ | DATA | YETP | 00 | 60 | ns |
| $t_{pd16}$ | DATA | ETPERR | 00 | 60 | ns |
| $t_{pd17}$ | PA | PERRA | All modes | 20 | ns |
| $t_{pd18}$ | DA | PERRA | All modes | 20 | ns |
| $t_{pd19}$ | PB | PERRB | All modes | 20 | ns |
| $t_{pd20}$ | DB | PERRB | All modes | 20 | ns |
| $t_{pd21}$ | PY | PERRY | All modes | 20 | ns |
| $t_{pd22}$ | Y | MSERR | All modes | 22 | ns |
| $t_{pd23}$ | YETP | MSERR | All modes | 22 | ns |
| $t_{en2}$ | $\overline{\text{OEY}}$ | YETP | All modes | 20 | ns |
| $t_{en1}$ | $\overline{\text{OEY}}$ | Y | All modes | 20 | ns |
| $t_{dis1}$ | $\overline{\text{OEY}}$ | YETP | All modes | 15 | ns |
| $t_{dis2}$ | $\overline{\text{OEY}}$ | Y | All modes | 15 | ns |

clock requirements

|  |  | SN74ACT8836 |  | UNIT |
| :---: | :---: | :---: | :---: | :---: |
|  |  | MIN | MAX |  |
| $t_{w1}$ | CLK high | 5 |  | ns |
| $t_{w2}$ | CLK low | 20 |  | ns |

$\dagger$ These parameters cannot be measured but can be inferred from device operation and other measurable parameters.


## PARAMETER MEASUREMENT INFORMATION

FIGURE 6. FULL FLOWTHROUGH MODE, ACCUMULATOR MODE (FT = 00)

FIGURE 8. FLOWTHROUGH PIPE ONLY, ACCUMULATOR MODE (FT = 01)

FIGURE 9. FLOWTHROUGH PIPE AND Y ONLY (FT = 10)

FIGURE 10. FLOWTHROUGH PIPE AND Y ONLY, ACCUMULATOR MODE (FT = 10)

FIGURE 11. ALL REGISTERS ENABLED (FT = 11)


| PARAMETER |  | $R_{L}$ | $C_{L}{}^{\dagger}$ | $S_{1}$ | $S_{2}$ |
| :---: | :---: | :---: | :---: | :---: | :---: |
| $t_{en}$ | $t_{PZH}$ | $1\ \mathrm{k}\Omega$ | 50 pF | OPEN | CLOSED |
|  | $t_{PZL}$ |  |  | CLOSED | OPEN |
| $t_{dis}$ | $t_{PHZ}$ | $1\ \mathrm{k}\Omega$ | 50 pF | OPEN | CLOSED |
|  | $t_{PLZ}$ |  |  | CLOSED | OPEN |
| $t_{pd}$ |  | - | 50 pF | OPEN | OPEN |

${ }^{\dagger} C_{L}$ includes probe and test fixture capacitance.

All input pulses are supplied by generators having the following characteristics: PRR $\leq 1$ MHz, $Z_{out}=50\ \Omega$, $t_{r}=6$ ns, $t_{f}=6$ ns.

FIGURE 13. LOAD CIRCUIT


# SN74ACT8837 64-Bit Floating Point Unit 

- Multiplier and ALU in One Chip
- 65-ns Pipelined Performance
- Low-Power EPIC ${ }^{\text {™ }}$ CMOS
- Meets IEEE Standard for 32- and 64-Bit Multiply, Add, and Subtract
- Three-Port Architecture, 64-Bit Internal Bus
- Pipelined or Flowthrough Operation
- Floating Point-to-Integer and Integer-to-Floating Point Conversions
- Supports Division Using Newton-Raphson Algorithm
- Parity Generation/Checking

The SN74ACT8837 single-chip floating point processor performs high-speed 32- and 64-bit floating point operations. More than just a coprocessor, the 'ACT8837 integrates two double-precision floating point functions, an ALU and a multiplier, on one chip.

The wide dynamic range and high precision of floating point format minimize the need for scaling and overflow detection. Computationally intense applications, such as high-end graphics and digital signal processing, need double-precision floating point accuracy to maintain data integrity. Floating point processors in general-purpose computing must often support double-precision formats to match existing software.

By integrating its two functions on one chip, the 'ACT8837 reduces data routing problems and processing overhead. Its three data ports and 64-bit internal bus structure let the user load two operands and take a result in a single clock cycle.

[^16]

## Contents

Introduction
Understanding the 'ACT8837 Floating Point Unit
Microprogramming the 'ACT8837
Support Tools
Design Support
Systems Expertise
'ACT8837 Logic Symbol
'ACT8837 Pin Descriptions
'ACT8837 Specification Tables
SN74ACT8837 Floating Point Unit
Data Flow
Input Data Parity Check
Temporary Input Register
RA and RB Input Registers
Multiplier/ALU Multiplexers
Pipelined ALU
Pipelined Multiplier
Product, Sum, and C Registers
Parity Generators
Master/Slave Comparator
Status and Exception Generator/Register
Flowthrough Mode
Fast and IEEE Modes
Rounding Mode
Test Pins
Summary of Control Inputs

Instruction Set
Loading External Data Operands
Configuration Controls (CONFIG1-CONFIG0)
CLKMODE Settings
Internal Register Operations
Data Register Controls (PIPES2-PIPES0)
C Register Controls (SRCC, CLKC)
Operand Selection (SELOP7-SELOP0)
Rounding Controls (RND1-RND0)
Status Exceptions
Handling of Denormalized Numbers (FAST)
Data Output Controls (SELMS/$\overline{\mathrm{LS}}$, $\overline{\mathrm{OEY}}$)
Status Output Controls (SELST1-SELST0, $\overline{\mathrm{OES}}$, $\overline{\mathrm{OEC}}$)
Stalling the Device (HALT)
Instruction Inputs (I9-I0)
Independent ALU Operations
Independent Multiplier Operations
Chained Multiplier/ALU Operations
Microprogramming the 'ACT8837
Single-Precision Operations
Single-Precision ALU Operations
Single-Precision Multiplier Operations
Sample Single-Precision Microinstructions
Double-Precision Operations
Double-Precision ALU Operations
Double-Precision ALU Operations with CLKMODE $=0$
Double-Precision ALU Operations with CLKMODE $=1$
Double-Precision Multiplier Operations
Double-Precision Multiplication with CLKMODE $=0$
Double-Precision Multiplication with CLKMODE $=1$
Chained Multiplier/ALU Operations
Fully Pipelined Double-Precision Operations
Mixed Operations and Operands
Matrix Operations
Representation of Variables
Sample Matrix Transformation
Microinstructions for Sample Matrix Manipulation

## Contents (Concluded)

Page
Sample Microprograms for Binary Division and Square Root ..... 5-105
Binary Division Using the Newton-Raphson Algorithm ..... 5-105
Single-Precision Newton-Raphson Binary Division ..... 5-108
Double-Precision Newton-Raphson Binary Division ..... 5-111
Binary Square Root Using the Newton-Raphson Algorithm ..... 5-114
Single-Precision Square Root Using a Double-Precision Seed ROM ..... 5-114
Double-Precision Square Root ..... 5-117
Glossary ..... 5-123
Implementing a Double-Precision Seed ROM ..... 5-124

Lع88 $10 \forall \succ L N S$

## List of Illustrations

- Figure 1. 'ACT8837 Floating Point Unit
- Figure 2. Single-Precision Operation, All Registers Disabled (PIPES = 111, CLKMODE = 0)
- Figure 3. Single-Precision Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 0)
- Figure 4. Single-Precision Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 0)
- Figure 5. Single-Precision Operation, All Registers Enabled (PIPES = 000, CLKMODE = 0)
- Figure 6. Double-Precision ALU Operation, All Registers Disabled (PIPES = 111, CLKMODE = 0)
- Figure 7. Double-Precision ALU Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 0)
- Figure 8. Double-Precision ALU Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 0)
- Figure 9. Double-Precision ALU Operation, All Registers Enabled (PIPES = 000, CLKMODE = 0)
- Figure 10. Double-Precision ALU Operation, All Registers Disabled (PIPES = 111, CLKMODE = 1)
- Figure 11. Double-Precision ALU Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 1)
- Figure 12. Double-Precision ALU Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 1)
- Figure 13. Double-Precision ALU Operation, All Registers Enabled (PIPES = 000, CLKMODE = 1)
- Figure 14. Double-Precision Multiplier Operation, All Registers Disabled (PIPES = 111, CLKMODE = 0)
- Figure 15. Double-Precision Multiplier Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 0)
- Figure 16. Double-Precision Multiplier Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 0)
- Figure 17. Double-Precision Multiplier Operation, All Registers Enabled (PIPES = 000, CLKMODE = 0)
- Figure 18. Double-Precision Multiplier Operation, All Registers Disabled (PIPES = 111, CLKMODE = 1)
- Figure 19. Double-Precision Multiplier Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 1)
- Figure 20. Double-Precision Multiplier Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 1)
- Figure 21. Double-Precision Multiplier Operation, All Registers Enabled (PIPES = 000, CLKMODE = 1)
- Figure 22. Mixed Operations and Operands (PIPES2-PIPES0 = 110, CLKMODE = 0)
- Figure 23. Mixed Operations and Operands (PIPES2-PIPES0 = 000, CLKMODE = 1)
- Figure 24. Sequence of Matrix Operations
- Figure 25. Resultant Matrix Transformation
- Figure 26. IEEE Double-Precision Seed ROM for Newton-Raphson Division and Square Root

## List of Tables

- Table 1. 'ACT8837 Pin Grid Allocations
- Table 2. 'ACT8837 Pin Functional Description
- Table 3. Double-Precision Input Data Configuration Modes
- Table 4. Single-Precision Input Data Configuration Mode
- Table 5. Double-Precision Input Data Register Sources
- Table 6. Multiplier Input Selection
- Table 7. ALU Input Selection
- Table 8. Independent ALU Operations, Single Operand (I9 = 0, I6 = 0)
- Table 9. Independent ALU Operations, Two Operands (I9 = 0, I5 = 0)
- Table 10. Independent Multiplier Operations (I9 = 0, I6 = 1)
- Table 11. Independent Multiplier Operations Selected by I4-I2 (I9 = 0, I6 = 1)
- Table 12. Operations Selected by I8-I7 (I9 = 0, I6 = 1)
- Table 13. Chained Multiplier/ALU Operations (I9 = 1)
- Table 14. Comparison Status Outputs
- Table 15. Status Outputs
- Table 16. Status Output Selection (Chain Mode)
- Table 17. Pipeline Controls (PIPES2-PIPES0)
- Table 18. Rounding Modes
- Table 19. Test Pin Control Inputs
- Table 20. Control Inputs
- Table 21. IEEE Floating-Point Representations
- Table 22. Handling Wrapped Multiplier Outputs
- Table 23. Independent ALU Operations with One Operand
- Table 24. Independent ALU Operations with Two Operands
- Table 25. Independent Multiplier Operations
- Table 26. Chained Multiplier/ALU Operations
- Table 27. Single-Precision Sum of Products (PIPES2-PIPES0 = 010)
- Table 28. Sample Microinstructions for Single-Precision Sum of Products
- Table 29. Pseudocode for Fully Pipelined Double-Precision Sum of Products (CLKM = 0, CONFIG = 10, PIPES = 000, CLKC $\longrightarrow$ SYSCLK)
- Table 30. Pseudocode for Fully Pipelined Double-Precision Product of Sums (CLKM = 0, CONFIG = 10, PIPES = 000, CLKC $\longrightarrow$ SYSCLK)
- Table 31. Microinstructions for Sample Matrix Manipulation
- Table 32. Single-Precision Matrix Multiplication (PIPES2-PIPES0 = 010)
- Table 33. Fully Pipelined Sum of Products (PIPES2-PIPES0 = 000)
- Table 34. Sample Data Values and Representations
- Table 35. Binary Division Using the Newton-Raphson Algorithm
- Table 36. Single-Precision Newton-Raphson Binary Division
- Table 37. Double-Precision Newton-Raphson Binary Division
- Table 38. Single-Precision Binary Square Root
- Table 39. Double-Precision Binary Square Root

## Introduction

The SN74ACT8837 floating point unit (FPU) combines a multiplier and an arithmetic-logic unit in a single microprogrammable VLSI device. The 'ACT8837 is implemented in Texas Instruments' one-micron CMOS technology to offer high speed and low power consumption in an FPU with exceptional flexibility and functional integration. The FPU can be microprogrammed to operate in multiple modes to support a variety of floating point applications.

The 'ACT8837 is fully compatible with the IEEE standard for binary floating point arithmetic, STD 754-1985. This FPU performs both single- and double-precision operations, including division and square-root using the Newton-Raphson algorithm.

## Understanding the 'ACT8837 Floating Point Unit

To support floating point processing in IEEE format, the 'ACT8837 may be configured for either single- or double-precision operation. Instruction inputs select one of three modes of operation: independent ALU operations, independent multiplier operations, or simultaneous ALU and multiplier operations.
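As a rough illustration of the mode selection, the sketch below classifies a 10-bit instruction word by its I9 and I6 bits. The bit assignments are taken from the instruction table titles in this section (I9 = 1 for chained operations; I9 = 0 with I6 = 1 for independent multiplier operations; I9 = 0 with I6 = 0 for independent ALU operations). The function name `operating_mode` is hypothetical, and the remaining instruction bits are deliberately not decoded.

```python
def operating_mode(instruction: int) -> str:
    """Classify a 10-bit instruction word (I9-I0) by operating mode.

    Only I9 and I6 are examined; this is an illustrative model,
    not a full decode of the instruction set.
    """
    i9 = (instruction >> 9) & 1  # I9 = 1 selects chained multiplier/ALU operation
    i6 = (instruction >> 6) & 1  # with I9 = 0, I6 picks multiplier (1) or ALU (0)
    if i9:
        return "chained"
    return "multiplier" if i6 else "alu"


if __name__ == "__main__":
    for word in (0b1000000000, 0b0001000000, 0b0000000000):
        print(f"{word:010b} -> {operating_mode(word)}")
```

The instruction tables later in this section define the operation selected within each mode.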

Three levels of internal data registers are available. The device can be used in flowthrough mode (all registers disabled), pipelined mode (all registers enabled), or in other available register configurations. An instruction register, a 64-bit constant register, and a status register are also provided.
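The register configurations can be modeled with a small decode of the PIPES2-PIPES0 controls, based on the pin descriptions in Table 2: each control is active low, with PIPES0 governing the instruction and RA/RB input registers, PIPES1 the ALU and multiplier pipeline registers, and PIPES2 the status, product, and sum registers. The function and key names below are hypothetical and serve only as a sketch.

```python
def enabled_registers(pipes: int) -> dict:
    """Return which register levels are enabled for a PIPES2-PIPES0 value.

    A low control bit enables the corresponding register level;
    a high bit places it in flowthrough mode.
    """
    return {
        "input": not (pipes & 0b001),     # PIPES0: instruction, RA and RB registers
        "pipeline": not (pipes & 0b010),  # PIPES1: ALU and multiplier pipeline registers
        "output": not (pipes & 0b100),    # PIPES2: status, product (P) and sum (S) registers
    }


if __name__ == "__main__":
    for value in (0b111, 0b110, 0b010, 0b000):
        print(f"PIPES = {value:03b}: {enabled_registers(value)}")
```

With PIPES = 111 every level is in flowthrough mode, and with PIPES = 000 the device is fully pipelined, matching the configurations named in the figure titles.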

The FPU can handle three types of data input formats. The ALU accepts data operands in integer format or IEEE floating point format. In the 'ACT8837, integers are converted to normalized floating point numbers with biased exponents prior to further processing. A third type of operand, denormalized numbers, can also be processed after the ALU has converted them to "wrapped" numbers, which are explained in detail in a later section. The 'ACT8837 multiplier operates only on normalized floating-point numbers or wrapped numbers.

## Microprogramming the 'ACT8837

The 'ACT8837 is a fully microprogrammable device. Each FPU operation is specified by a microinstruction or sequence of microinstructions which set up the control inputs of the FPU so that the desired operation is performed.

The microprogram which controls operation of the FPU is stored in the microprogram memory (or control store). Execution of the microprogram is controlled by a microsequencer such as the TI SN74ACT8818 16-bit microsequencer. A discussion of microprogrammed architecture and the operation of the 'ACT8818 is presented in this Data Manual.

## Support Tools

Texas Instruments has developed a functional evaluation model of the 'ACT8837 in software which permits designers to simulate operation of the FPU. To evaluate the functions of an FPU, a designer can create a microprogram with sample data inputs, and the simulator will emulate FPU operation to produce sample data output files, as well as several diagnostic displays to show specific aspects of device operation. Sample microprogram sequences are included in this section.

Texas Instruments has also designed a family of low-cost real-time evaluation modules (EVM) to aid with initial hardware and microcode design. Each EVM is a small self-contained system which provides a convenient means to test and debug simple microcode, allowing software and hardware evaluation of components and their operation.

At present, the 74AS-EVM-8 Bit-Slice Evaluation Module has been completed, and a 16-bit EVM is in an advanced stage of development. EVMs and support tools for devices in the VLSI family are planned for future development.

## Design Support

TI's '8837 64-bit floating point unit is supported by a variety of tools developed to aid in design evaluation and verification. These tools will streamline all stages of the design process, from assessing the operation and performance of the '8837 to evaluating a total system application. The tools include a functional model, behavioral model, and microcode development software and hardware. Section 8 of this manual provides specific information on the design tools supporting TI's SN74ACT8800 Family.

## Systems Expertise

Texas Instruments' VLSI Logic applications group is available to help designers analyze TI's high-performance VLSI products, such as the '8837 64-bit floating point unit. The group works directly with designers to provide ready answers to device-related questions and also prepares a variety of applications documentation.

The group may be reached in Dallas, at (214) 997-3970.

## 'ACT8837 Logic Symbol



## 'ACT8837 Pin Descriptions

Pin descriptions and grid allocations for the 'ACT8837 are given on the following pages.

## 208 PIN . . . GB PACKAGE <br> (TOP VIEW)



Table 1. 'ACT8837 Pin Grid Allocations

| PIN NO. | PIN NAME | PIN NO. | PIN NAME | PIN NO. | PIN NAME | PIN NO. | PIN NAME | PIN NO. | PIN NAME | PIN NO. | PIN NAME |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| A1 | NC | C2 | Y0 | E3 | FAST | J15 | NC | P1 | NC | S1 | NC |
| A2 | NC | C3 | Y3 | E4 | GND | J16 | SRCC | P2 | PIPES0 | S2 | PB0 |
| A3 | Y5 | C4 | Y6 | E14 | GND | J17 | BYTEP | P3 | RESET | S3 | DB0 |
| A4 | Y8 | C5 | Y9 | E15 | AGTB | K1 | SELOP3 | P4 | PB1 | S4 | DB4 |
| A5 | Y11 | C6 | Y12 | E16 | AEQB | K2 | SELOP4 | P5 | DB1 | S5 | DB11 |
| A6 | Y14 | C7 | Y15 | E17 | MSERR | K3 | SELOP5 | P6 | DB5 | S6 | DB12 |
| A7 | Y17 | C8 | Y18 | F1 | I5 | K4 | GND | P7 | DB9 | S7 | DB15 |
| A8 | Y20 | C9 | Y23 | F2 | I3 | K14 | GND | P8 | DB16 | S8 | DB19 |
| A9 | Y21 | C10 | Y26 | F3 | RND0 | K15 | PA1 | P9 | DB21 | S9 | DB23 |
| A10 | Y24 | C11 | Y30 | F4 | GND | K16 | PA2 | P10 | DB28 | S10 | DB26 |
| A11 | Y27 | C12 | PY1 | F14 | GND | K17 | PA3 | P11 | DA0 | S11 | DB30 |
| A12 | Y29 | C13 | UNDER | F15 | PERRA | L1 | SELOP6 | P12 | DA4 | S12 | DA2 |
| A13 | PY0 | C14 | INEX | F16 | $\overline{\mathrm{OEY}}$ | L2 | SELOP7 | P13 | DA8 | S13 | DA6 |
| A14 | PY3 | C15 | DENIN | F17 | $\overline{\mathrm{OES}}$ | L3 | CLK | P14 | DA12 | S14 | DA10 |
| A15 | IVAL | C16 | SRCEX | G1 | I7 | L4 | $V_{CC}$ | P15 | DA19 | S15 | DA14 |
| A16 | NC | C17 | CHEX | G2 | I6 | L14 | GND | P16 | DA22 | S16 | DA15 |
| A17 | NC | D1 | I1 | G3 | I4 | L15 | DA30 | P17 | DA23 | S17 | DA17 |
| B1 | NC | D2 | RND1 | G4 | $V_{CC}$ | L16 | DA31 | R1 | PIPES1 | T1 | NC |
| B2 | Y2 | D3 | Y1 | G14 | $V_{CC}$ | L17 | PA0 | R2 | HALT | T2 | PB3 |
| B3 | Y4 | D4 | GND | G15 | $\overline{\mathrm{OEC}}$ | M1 | ENRB | R3 | PB2 | T3 | DB3 |
| B4 | Y7 | D5 | $V_{CC}$ | G16 | SELMS/$\overline{\mathrm{LS}}$ | M2 | ENRA | R4 | DB2 | T4 | DB7 |
| B5 | Y10 | D6 | GND | G17 | TP1 | M3 | CLKC | R5 | DB6 | T5 | DB8 |
| B6 | Y13 | D7 | GND | H1 | I9 | M4 | GND | R6 | DB10 | T6 | DB13 |
| B7 | Y16 | D8 | $V_{CC}$ | H2 | NC | M14 | $V_{CC}$ | R7 | DB14 | T7 | DB17 |
| B8 | Y19 | D9 | GND | H3 | I8 | M15 | DA27 | R8 | DB18 | T8 | DB20 |
| B9 | Y22 | D10 | GND | H4 | GND | M16 | DA28 | R9 | DB22 | T9 | DB24 |
| B10 | Y25 | D11 | $V_{CC}$ | H14 | GND | M17 | DA29 | R10 | DB27 | T10 | DB25 |
| B11 | Y28 | D12 | GND | H15 | TP0 | N1 | CONFIG0 | R11 | DB31 | T11 | DB29 |
| B12 | Y31 | D13 | GND | H16 | SELST1 | N2 | CONFIG1 | R12 | DA3 | T12 | DA1 |
| B13 | PY2 | D14 | $V_{CC}$ | H17 | SELST0 | N3 | CLKMODE | R13 | DA7 | T13 | DA5 |
| B14 | OVER | D15 | STEX1 | J1 | SELOP2 | N4 | PIPES2 | R14 | DA11 | T14 | DA9 |
| B15 | RNDCO | D16 | STEX0 | J2 | SELOP1 | N14 | DA18 | R15 | DA16 | T15 | DA13 |
| B16 | DENORM | D17 | UNORD | J3 | SELOP0 | N15 | DA24 | R16 | DA20 | T16 | NC |
| B17 | NC | E1 | I2 | J4 | $V_{CC}$ | N16 | DA25 | R17 | DA21 | T17 | NC |
| C1 | PERRB | E2 | I0 | J14 | $V_{CC}$ | N17 | DA26 |  |  |  |  |

Table 2. 'ACT8837 Pin Functional Description

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| AEQB | E16 | I/O | Comparison status or zero detect pin. When high, indicates that A and B operands are equal during a compare operation in the ALU. If not a compare, a high signal indicates a zero result. |
| AGTB | E15 | I/O | Comparison status pin. When high, indicates that A operand is greater than B operand. |
| BYTEP | J17 | I | When high, selects parity generation for each byte of input (four parity bits for each bus). When low, selects parity generation for whole 32-bit input (one parity bit for each bus). |
| CHEX | C17 | I/O | Status pin indicating an exception during a chained function. If I6 is low, indicates the multiplier is the source of the exception. If I6 is high, indicates the ALU is the source of the exception. |
| CLK | L3 | I | Master clock for all registers except C register |
| CLKC | M3 | I | C register clock |
| CLKMODE | N3 | I | Selects whether temporary register loads only on rising clock edge (CLKMODE = L) or on falling edge (CLKMODE = H). |
| CONFIG0 <br> CONFIG1 | N1 <br> N2 | I | Select data sources for RA and RB registers from DA bus, DB bus and temporary register. |
| DA0 | P11 |  |  |
| DA1 | T12 |  |  |
| DA2 | S12 |  |  |
| DA3 | R12 |  |  |
| DA4 | P12 |  |  |
| DA5 | T13 |  |  |
| DA6 | S13 |  |  |
| DA7 | R13 |  |  |
| DA8 | P13 |  |  |
| DA9 | T14 |  |  |
| DA10 | S14 |  |  |
| DA11 | R14 | I | DA 32-bit input data bus. Data can be latched in a 64-bit temporary register or loaded directly into an input register. |
| DA12 | P14 |  |  |
| DA13 | T15 |  |  |
| DA14 | S15 |  |  |
| DA15 | S16 |  |  |
| DA16 | R15 |  |  |
| DA17 | S17 |  |  |
| DA18 | N14 |  |  |
| DA19 | P15 |  |  |
| DA20 | R16 |  |  |
| DA21 | R17 |  |  |
| DA22 | P16 |  |  |
| DA23 | P17 |  |  |

Table 2. 'ACT8837 Pin Functional Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| DA24 | N15 |  |  |
| DA25 | N16 |  |  |
| DA26 | N17 |  |  |
| DA27 | M15 | I | DA 32-bit input data bus. Data can be latched in a 64-bit temporary register or loaded directly into an input register. |
| DA28 | M16 |  |  |
| DA29 | M17 |  |  |
| DA30 | L15 |  |  |
| DA31 | L16 |  |  |
| DB0 | S3 |  |  |
| DB1 | P5 |  |  |
| DB2 | R4 |  |  |
| DB3 | T3 |  |  |
| DB4 | S4 |  |  |
| DB5 | P6 |  |  |
| DB6 | R5 |  |  |
| DB7 | T4 |  |  |
| DB8 | T5 |  |  |
| DB9 | P7 |  |  |
| DB10 | R6 |  |  |
| DB11 | S5 |  |  |
| DB12 | S6 |  |  |
| DB13 | T6 |  |  |
| DB14 | R7 |  |  |
| DB15 | S7 | I | DB 32-bit input data bus. Data can be latched in a 64-bit temporary register or loaded directly into an input register. |
| DB16 | P8 |  |  |
| DB17 | T7 |  |  |
| DB18 | R8 |  |  |
| DB19 | S8 |  |  |
| DB20 | T8 |  |  |
| DB21 | P9 |  |  |
| DB22 | R9 |  |  |
| DB23 | S9 |  |  |
| DB24 | T9 |  |  |
| DB25 | T10 |  |  |
| DB26 | S10 |  |  |
| DB27 | R10 |  |  |
| DB28 | P10 |  |  |
| DB29 | T11 |  |  |
| DB30 | S11 |  |  |
| DB31 | R11 |  |  |
| DENIN | C15 | I/O | Status pin indicating a denormal input to the multiplier. When DENIN goes high, the STEX pins indicate which port had the denormal input. |

Table 2. 'ACT8837 Pin Functional Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| DENORM | B16 | I/O | Status pin indicating a denormal output from the ALU or a wrapped output from the multiplier. In FAST mode, causes the result to go to zero when DENORM is high. |
| ENRA | M2 | I | When high, enables loading of RA register on a rising clock edge if the RA register is not disabled (see PIPES0 below). |
| ENRB | M1 | I | When high, enables loading of RB register on a rising clock edge if the RB register is not disabled (see PIPES0 below). |
| FAST | E3 | I | When low, selects gradual underflow (IEEE mode). When high, selects sudden underflow, forcing all denormalized inputs and outputs to zero. |
| GND | D4 |  |  |
| GND | D6 |  |  |
| GND | D7 |  |  |
| GND | D9 |  |  |
| GND | D10 |  |  |
| GND | D12 |  |  |
| GND | D13 |  |  |
| GND | E4 |  | Ground pins. NOTE: All ground pins should be used and connected. |
| GND | E14 |  |  |
| GND | F4 |  |  |
| GND | F14 |  |  |
| GND | H4 |  |  |
| GND | H14 |  |  |
| GND | K4 |  |  |
| GND | K14 |  |  |
| GND | L14 |  |  |
| GND | M4 |  |  |
| HALT | R2 | I | Stalls operation without altering contents of instruction or data registers. Active low. |
| I0 | E2 |  |  |
| I1 | D1 |  |  |
| I2 | E1 |  |  |
| I3 | F2 |  |  |
| I4 | G3 | I | Instruction inputs |
| I5 | F1 |  |  |
| I6 | G2 |  |  |
| I7 | G1 |  |  |
| I8 | H3 |  |  |
| I9 | H1 |  |  |
| INEX | C14 | I/O | Status pin indicating an inexact output |

Table 2. 'ACT8837 Pin Functional Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| IVAL | A15 | I/O | Status pin indicating that an invalid operation or a nonnumber (NaN) has been input to the multiplier or ALU. |
| MSERR | E17 | O | Master/slave error output pin |
| NC | A1 |  | No internal connection. Pins should be left floating. |
|  | A2 |  |  |
|  | A16 |  |  |
|  | A17 |  |  |
|  | B1 |  |  |
|  | B17 |  |  |
|  | H2 |  |  |
|  | J15 |  |  |
|  | P1 |  |  |
|  | S1 |  |  |
|  | T1 |  |  |
|  | T16 |  |  |
|  | T17 |  |  |
| $\overline{\mathrm{OEC}}$ | G15 | I | Comparison status output enable. Active low. |
| $\overline{\mathrm{OES}}$ | F17 | I | Exception status and other status output enable. Active low. |
| $\overline{\mathrm{OEY}}$ | F16 | I | Y bus output enable. Active low. |
| OVER | B14 | I/O | Status pin indicating that the result is greater than the largest allowable value for the specified format (exponent overflow). |
| PA0 | L17 | I | Parity inputs for DA data |
| PA1 | K15 |  |  |
| PA2 | K16 |  |  |
| PA3 | K17 |  |  |
| PB0 | S2 | I | Parity inputs for DB data |
| PB1 | P4 |  |  |
| PB2 | R3 |  |  |
| PB3 | T2 |  |  |
| PERRA | F15 | O | DA data parity error output. When high, signals a byte or word has failed an even parity check. |
| PERRB | C1 | O | DB data parity error output. When high, signals a byte or word has failed an even parity check. |
| PIPES0 | P2 | I | When low, enables instruction register, RA and RB input registers. When high, puts instruction register, RA and RB registers in flowthrough mode. |
| PIPES1 | R1 | I | When low, enables pipeline registers in ALU and multiplier. When high, puts pipeline registers in flowthrough mode. |

Table 2. 'ACT8837 Pin Functional Description (Continued)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| PIPES2 | N4 | I | When low, enables status register, product (P) and sum (S) registers. When high, puts status register, P and S registers in flowthrough mode. |
| PY0 | A13 | I/O | Y port parity data |
| PY1 | C12 |  |  |
| PY2 | B13 |  |  |
| PY3 | A14 |  |  |
| RESET | P3 | I | Clears internal states and status with no effect on data registers. Active low. |
| RND0 <br> RND1 | F3 <br> D2 | I | Rounding mode control pins. Select four IEEE rounding modes (see Table 18). |
| RNDCO | B15 | I | When high, indicates the mantissa of a wrapped number has been increased in magnitude by rounding. |
| SELMS/$\overline{\mathrm{LS}}$ | G16 | I | When low, selects LSH of 64-bit result to be output on the Y bus. When high, selects MSH of 64-bit result. |
| SELOP0 | J3 | I | Select operand sources for multiplier and ALU <br> (see Tables 6 and 7) |
| SELOP1 | J2 |  |  |
| SELOP2 | J1 |  |  |
| SELOP3 | K1 |  |  |
| SELOP4 | K2 |  |  |
| SELOP5 | K3 |  |  |
| SELOP6 | L1 |  |  |
| SELOP7 | L2 |  |  |
| SELST0 <br> SELST1 | H17 <br> H16 | I | Select status source during chained operation (see Table 16) |
| SRCC | J16 | I | When low, selects ALU as data source for C register. When high, selects multiplier as data source for C register. |
| SRCEX | C16 | I/O | Status pin indicating source of status, either ALU (SRCEX = L) or multiplier (SRCEX = H) |
| STEX0 <br> STEX1 | D16 <br> D15 | I/O | Status pins indicating that a nonnumber (NaN) or denormal number has been input on A port (STEX1) or B port (STEX0). |
| TP0 <br> TP1 | H15 <br> G17 | I | Test pins (see Table 19) |
| UNDER | C13 | I/O | Status pin indicating that a result is inexact and less than minimum allowable value for format (exponent underflow). |
| UNORD | D17 | I/O | Comparison status pin indicating that the two inputs are unordered because at least one of them is a nonnumber (NaN). |

Table 2. 'ACT8837 Pin Functional Description (Concluded)

| PIN |  | I/O | DESCRIPTION |
| :---: | :---: | :---: | :---: |
| NAME | NO. |  |  |
| $V_{CC}$ | D5 |  |  |
| $V_{CC}$ | D8 |  |  |
| $V_{CC}$ | D11 |  |  |
| $V_{CC}$ | D14 |  |  |
| $V_{CC}$ | G4 |  | 5-V power supply |
| $V_{CC}$ | G14 |  |  |
| $V_{CC}$ | J4 |  |  |
| $V_{CC}$ | J14 |  |  |
| $V_{CC}$ | L4 |  |  |
| $V_{CC}$ | M14 |  |  |
| Y0 | C2 |  |  |
| Y1 | D3 |  |  |
| Y2 | B2 |  |  |
| Y3 | C3 |  |  |
| Y4 | B3 |  |  |
| Y5 | A3 |  |  |
| Y6 | C4 |  |  |
| Y7 | B4 |  |  |
| Y8 | A4 |  |  |
| Y9 | C5 |  |  |
| Y10 | B5 |  |  |
| Y11 | A5 |  |  |
| Y12 | C6 |  |  |
| Y13 | B6 |  |  |
| Y14 | A6 |  |  |
| Y15 | C7 | I/O | 32-bit Y output data bus |
| Y16 | B7 |  |  |
| Y17 | A7 |  |  |
| Y18 | C8 |  |  |
| Y19 | B8 |  |  |
| Y20 | A8 |  |  |
| Y21 | A9 |  |  |
| Y22 | B9 |  |  |
| Y23 | C9 |  |  |
| Y24 | A10 |  |  |
| Y25 | B10 |  |  |
| Y26 | C10 |  |  |
| Y27 | A11 |  |  |
| Y28 | B11 |  |  |
| Y29 | A12 |  |  |
| Y30 | C11 |  |  |
| Y31 | B12 |  |  |


## 'ACT8837 Specification Tables

## absolute maximum ratings over operating free-air temperature range (unless otherwise noted) $\dagger$

| RATING | VALUE |
| :--- | :---: |
| Supply voltage, $V_{CC}$ | -0.5 V to 6 V |
| Input clamp current, $I_{IK}$ ($V_I < 0$ or $V_I > V_{CC}$) | $\pm 20$ mA |
| Output clamp current, $I_{OK}$ ($V_O < 0$ or $V_O > V_{CC}$) | $\pm 50$ mA |
| Continuous output current, $I_O$ ($V_O = 0$ to $V_{CC}$) | $\pm 50$ mA |
| Continuous current through $V_{CC}$ or GND pins | $\pm 100$ mA |
| Operating free-air temperature range | $0^{\circ}\mathrm{C}$ to $70^{\circ}\mathrm{C}$ |
| Storage temperature range | $-65^{\circ}\mathrm{C}$ to $150^{\circ}\mathrm{C}$ |

†Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.
## recommended operating conditions

| PARAMETER |  | SN74ACT8837 |  |  | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | MIN | NOM | MAX |  |
| $V_{CC}$ | Supply voltage | 4.75 | 5.0 | 5.25 | V |
| $V_{IH}$ | High-level input voltage | 2 |  | $V_{CC}$ | V |
| $V_{IL}$ | Low-level input voltage | 0 |  | 0.8 | V |
| $I_{OH}$ | High-level output current |  |  | -8 | mA |
| $I_{OL}$ | Low-level output current |  |  | 8 | mA |
| $V_I$ | Input voltage | 0 |  | $V_{CC}$ | V |
| $V_O$ | Output voltage | 0 |  | $V_{CC}$ | V |
| dt/dv | Input transition rise or fall rate | 0 |  | 15 | ns/V |
| $T_A$ | Operating free-air temperature | 0 |  | 70 | ${ }^{\circ} \mathrm{C}$ |

## electrical characteristics over recommended operating free-air temperature range (unless otherwise noted)

| PARAMETER | TEST CONDITIONS | $V_{CC}$ | $T_A = 25^{\circ}\mathrm{C}$ | SN74ACT8837 | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | MIN TYP MAX | MIN TYP MAX |  |
| $V_{OH}$ | $I_{OH} = -20\ \mu\mathrm{A}$ | 4.5 V |  |  | V |
|  |  | 5.5 V |  |  |  |
|  | $I_{OH} = -8$ mA | 4.5 V |  | 3.76 |  |
|  |  | 5.5 V |  | 4.76 |  |
| $V_{OL}$ | $I_{OL} = 20\ \mu\mathrm{A}$ | 4.5 V |  |  | V |
|  |  | 5.5 V |  |  |  |
|  | $I_{OL} = 8$ mA | 4.5 V |  | 0.45 |  |
|  |  | 5.5 V |  | 0.45 |  |
| $I_I$ | $V_I = V_{CC}$ or 0 | 5.5 V |  | $\pm 1$ | $\mu \mathrm{A}$ |
| $I_{CC}$ | $V_I = V_{CC}$ or 0, $I_O$ | 5.5 V |  | 200 | $\mu \mathrm{A}$ |
| $C_i$ | $V_I = V_{CC}$ or 0 | 5 V |  |  | pF |

## switching characteristics (see Note)

| PARAMETER |  | SN74ACT8837-65 |  | UNIT |
| :---: | :---: | :---: | :---: | :---: |
|  |  | MIN | MAX |  |
| $t_{pd1}$ | Propagation delay from DA/DB/I inputs to Y output |  | 125 | ns |
| $t_{pd2}$ | Propagation delay from input register to output buffer |  | 118 | ns |
| $t_{pd3}$ | Propagation delay from pipeline register to output buffer |  | 70 | ns |
| $t_{pd4}$ | Propagation delay from output register to output buffer |  | 30 | ns |
| $t_{pd5}$ | Propagation delay from SELMS/$\overline{\mathrm{LS}}$ to Y output |  | 32 | ns |
| $t_{d1}$ | Propagation delay from input register to output register |  | 95 | ns |
| $t_{d2}$ | Delay time, input register to pipeline register or pipeline register to output register | 65 |  | ns |

Note: Switching data must be used with timing diagrams for different operating modes.

## setup and hold times

| PARAMETER | SN74ACT8837-65 |  | UNIT |
| :---: | :--- | :---: | :---: |
|  | MIN | 18 |  |

## clock requirements

| PARAMETER |  |  | SN74ACT8837-65 |  | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  | MIN | MAX |  |
| $t_w$ | Pulse duration | CLK high | 15 |  | ns |
|  |  | CLK low | 15 |  | ns |
| Clock period |  |  |  |  | ns |

## SN74ACT8837 FLOATING POINT UNIT

The SN74ACT8837 is a high-speed floating point unit implemented in TI's advanced $1-\mu \mathrm{m}$ CMOS technology. The device is fully compatible with IEEE Standard 754-1985 for addition, subtraction and multiplication operations.

The 'ACT8837 input buses can be configured to operate as two 32-bit data buses or a single 64-bit bus, providing a number of system interface options. Registers are provided at the inputs, outputs, and inside the ALU and multiplier to support multilevel pipelining. These registers can be bypassed for nonpipelined operation.

A clock mode control allows the temporary register to be clocked on the rising edge or the falling edge of the clock to support double precision operations (except multiplication) at the same rate as single precision operations. A feedback register with a separate clock is provided for temporary storage of a multiplier result, ALU result or constant.

To ensure data integrity, parity checking is performed on input data, and parity is generated for output data. A master/slave comparator supports fault-tolerant system design. Two test pin control inputs allow all I/Os and outputs to be forced high, low, or placed in a high-impedance state to facilitate system testing.

Floating point division using a Newton-Raphson algorithm can be performed in a sum-of-products operating mode, one of two modes in which the multiplier and ALU operate in parallel. Absolute value conversions, floating point to integer and integer to floating point conversions, and a compare instruction are also available.
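The general idea behind Newton-Raphson division can be sketched as follows. The reciprocal 1/B is refined by the iteration x ← x(2 − Bx), in which each step is a product followed by a subtraction and a second product, and the quotient is then formed as A times the refined reciprocal. This is a generic illustration of the algorithm in host floating point arithmetic, not the microprogram used on the 'ACT8837; the function name, the seed value, and the iteration count are assumptions chosen for the example.

```python
def nr_divide(a: float, b: float, seed: float, iterations: int = 4) -> float:
    """Approximate a / b using Newton-Raphson refinement of 1 / b.

    `seed` is an initial estimate of 1 / b (hypothetical here; a hardware
    design would typically take it from a lookup table).
    """
    x = seed
    for _ in range(iterations):
        # Each step roughly squares the relative error of the estimate.
        x = x * (2.0 - b * x)
    return a * x


if __name__ == "__main__":
    quotient = nr_divide(7.0, 3.0, seed=0.3)
    print(quotient, 7.0 / 3.0)
```

Because the relative error is squared on every pass, a more accurate seed reduces the number of iterations needed to reach a given precision.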

## Data Flow

Data enters the 'ACT8837 through two 32-bit input data buses, DA and DB. The buses can be configured to operate as a single 64-bit data bus for double precision operations (see Table 3). Data can be latched in a 64-bit temporary register or loaded directly into the RA and RB registers for input to the multiplier and ALU.

Four multiplexers select the multiplier and ALU operands from the input register, C register or previous multiplier or ALU result. Results are output on the 32-bit Y bus; a Y output multiplexer selects the most significant or least significant half of the result for output. The 64-bit C register is provided for temporary storage of a result from the ALU or multiplier.
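The Y output multiplexer can be modeled in a few lines. Following the SELMS/LS pin description in Table 2, a high level selects the most significant half (MSH) of the 64-bit result and a low level selects the least significant half (LSH). The function name `y_bus` is hypothetical, and the model ignores output enables and timing.

```python
MASK32 = 0xFFFFFFFF


def y_bus(result: int, selms_ls: int) -> int:
    """Return the 32-bit half of a 64-bit result placed on the Y bus.

    selms_ls = 1 selects the most significant half (MSH);
    selms_ls = 0 selects the least significant half (LSH).
    """
    if selms_ls:
        return (result >> 32) & MASK32
    return result & MASK32


if __name__ == "__main__":
    value = 0x123456789ABCDEF0
    print(hex(y_bus(value, 1)), hex(y_bus(value, 0)))
```

Reading a full double-precision result therefore takes two selections of the multiplexer, one for each half.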

## Input Data Parity Check

When BYTEP is high, internal even parity is generated for each byte of input data at the DA and DB ports and compared to the PA and PB parity inputs. If an odd number of bits is set high in a data byte, the parity bit for that byte is also set high. Parity bits are input on PA for DA data and PB for DB data. PA0 and PB0 are the parity bits for the least significant bytes of DA and DB, respectively. If the parity comparison fails for any byte, a high appears on the parity error output pin (PERRA for DA data and PERRB for DB data).


Figure 1. 'ACT8837 Floating Point Unit

A parity check can also be performed on the entire input data word by setting BYTEP low. In this mode, PA0 is the parity input for DA data and PB0 is the parity input for DB data.
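The byte and word comparisons described above can be sketched in Python. This is a behavioral model only, under stated assumptions: the function names are hypothetical, and the pairing of parity bits 3 to 1 with the upper data bytes is assumed (the text only states that bit 0 covers the least significant byte).

```python
def parity_bit(value, width):
    """Return 1 when an odd number of the low `width` bits of value are high."""
    return bin(value & ((1 << width) - 1)).count("1") & 1


def parity_error(data, parity_in, bytep):
    """Model the parity error output (PERRA or PERRB) for one 32-bit port.

    data      -- 32-bit input word (DA or DB)
    parity_in -- parity inputs as an integer, bit 0 = PA0/PB0
    bytep     -- 1 for byte parity, 0 for a single word parity bit
    """
    if bytep:
        for i in range(4):
            byte = (data >> (8 * i)) & 0xFF
            # Assumption: parity bit i belongs to data byte i.
            if parity_bit(byte, 8) != (parity_in >> i) & 1:
                return 1
        return 0
    # Word mode: only bit 0 of the parity input is used.
    return int(parity_bit(data, 32) != (parity_in & 1))
```

For example, the word `0x01000003` has an odd number of ones only in its most significant byte, so in byte mode the matching parity input is `0b1000`.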

## Temporary Input Register

A temporary input register is provided to enable double precision numbers on a single 32-bit input bus to be loaded in one clock cycle. The contents of the DA bus are loaded into the upper 32 bits of the temporary register; the contents of DB are loaded into the lower 32 bits. A clock mode signal (CLKMODE) determines the clock edge on which the data will be stored in the temporary register. When CLKMODE is low, data is loaded on the rising edge of the clock; when CLKMODE is high, data is loaded on the falling edge.

## RA and RB Input Registers

Two 64-bit registers, RA and RB, are provided to hold input data for the multiplier and ALU. Data is taken from the DA bus, DB bus and the temporary input register, according to configuration mode controls CONFIG1-CONFIG0 (see Tables 3 and 5). The registers are loaded on the rising edge of clock CLK. For single-precision operations, CONFIG1-CONFIG0 should ordinarily be set to 01 (see Table 4).

Table 3. Double-Precision Input Data Configuration Modes

| CONFIG1 | CONFIG0 | LOADING SEQUENCE |  |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | DATA LOADED INTO TEMP REGISTER ON FIRST CLOCK AND RA/RB REGISTERS ON SECOND CLOCK ${ }^{\dagger}$ |  | DATA LOADED INTO RA/RB REGISTERS ON SECOND CLOCK |  |
|  |  | DA | DB | DA | DB |
| 0 | 0 | B operand (MSH) | B operand (LSH) | A operand (MSH) | A operand (LSH) |
| 0 | 1 | A operand (LSH) | B operand (LSH) | A operand (MSH) | B operand (MSH) |
| 1 | 0 | A operand (MSH) | B operand (MSH) | A operand (LSH) | B operand (LSH) |
| 1 | 1 | A operand (MSH) | A operand (LSH) | B operand (MSH) | B operand (LSH) |

[^17]

Table 4. Single-Precision Input Data Configuration Mode

|  |  | DATA LOADED INTO RA/RB REGISTERS ON FIRST CLOCK |  |  |
| :---: | :---: | :---: | :---: | :---: |
| CONFIG1 | CONFIG0 | DA | DB | NOTE |
| 0 | 1 | A operand | B operand | This mode is ordinarily used for single-precision operations. |

Table 5. Double-Precision Input Data Register Sources

|  |  | RA SOURCE |  | RB SOURCE |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
| CONFIG1 | CONFIG0 | MSH | LSH | MSH | LSH |
| 0 | 0 | DA | DB | TEMP REG (MSH) | TEMP REG (LSH) |
| 0 | 1 | DA | TEMP REG (MSH) | DB | TEMP REG (LSH) |
| 1 | 0 | TEMP REG (MSH) | DA | TEMP REG (LSH) | DB |
| 1 | 1 | TEMP REG (MSH) | TEMP REG (LSH) | DA | DB |
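The two-clock loading sequence of Table 3 can be modeled in Python to show how each configuration routes the operand halves into RA and RB. This is a sketch under stated assumptions: the function and dictionary names are hypothetical, and the register sources are derived from the Table 3 sequence together with the rule that DA feeds the upper half and DB the lower half of the temporary register.

```python
MASK32 = 0xFFFFFFFF

# Table 3: (operand, half) driven on DA and DB at the first clock,
# then on DA and DB at the second clock, per CONFIG1-CONFIG0.
LOAD_SEQUENCE = {
    0b00: [("B", "MSH"), ("B", "LSH"), ("A", "MSH"), ("A", "LSH")],
    0b01: [("A", "LSH"), ("B", "LSH"), ("A", "MSH"), ("B", "MSH")],
    0b10: [("A", "MSH"), ("B", "MSH"), ("A", "LSH"), ("B", "LSH")],
    0b11: [("A", "MSH"), ("A", "LSH"), ("B", "MSH"), ("B", "LSH")],
}
# Physical source that carries each of the four slots above.
SLOT_SOURCE = ["TEMP (MSH)", "TEMP (LSH)", "DA", "DB"]


def split(x):
    """Split a 64-bit operand into its 32-bit halves."""
    return {"MSH": (x >> 32) & MASK32, "LSH": x & MASK32}


def load_double(config, a, b):
    """Return (RA, RB, routing) after the two-clock load sequence."""
    halves = {"A": split(a), "B": split(b)}
    seq = LOAD_SEQUENCE[config]
    da1, db1, da2, db2 = (halves[op][half] for op, half in seq)
    temp = (da1 << 32) | db1            # first clock: DA -> upper, DB -> lower
    sources = {"TEMP (MSH)": temp >> 32, "TEMP (LSH)": temp & MASK32,
               "DA": da2, "DB": db2}    # second clock: buses plus temp register
    regs, routing = {"A": {}, "B": {}}, {}
    for slot, (op, half) in enumerate(seq):
        regs[op][half] = sources[SLOT_SOURCE[slot]]
        routing[(op, half)] = SLOT_SOURCE[slot]
    ra = (regs["A"]["MSH"] << 32) | regs["A"]["LSH"]
    rb = (regs["B"]["MSH"] << 32) | regs["B"]["LSH"]
    return ra, rb, routing
```

Calling `load_double(0b00, a, b)[2]` returns the routing for that configuration, for example `("A", "MSH")` mapped to `"DA"`, which can be compared against Table 5.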

## Multiplier/ALU Multiplexers

Four multiplexers select the multiplier and ALU operands from the RA and RB registers, the previous multiplier or ALU result, or the C register. The multiplexers are controlled by input signals SELOP7-SELOP0 as shown in Tables 6 and 7.

Table 6. Multiplier Input Selection

| A1 (MUX1) INPUT |  |  | B1 (MUX2) INPUT |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
| SELOP7 | SELOP6 | OPERAND SOURCE | SELOP5 | SELOP4 | OPERAND SOURCE |
| 0 | 0 | Reserved | 0 | 0 | Reserved |
| 0 | 1 | C register | 0 | 1 | C register |
| 1 | 0 | ALU feedback | 1 | 0 | Multiplier feedback |
| 1 | 1 | RA input register | 1 | 1 | RB input register |

Table 7. ALU Input Selection

| A2 (MUX3) INPUT |  |  | B2 (MUX4) INPUT |  |  |
| :---: | :---: | :---: | :---: | :---: | :---: |
| SELOP3 | SELOP2 | OPERAND SOURCE | SELOP1 | SELOP0 | OPERAND SOURCE |
| 0 | 0 | Reserved | 0 | 0 | Reserved |
| 0 | 1 | C register | 0 | 1 | C register |
| 1 | 0 | Multiplier feedback | 1 | 0 | ALU feedback |
| 1 | 1 | RA input register | 1 | 1 | RB input register |
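As an illustration of Tables 6 and 7, the following Python sketch decodes an 8-bit SELOP value into the four operand sources. The function name and the packing of SELOP7 into bit 7 down to SELOP0 in bit 0 are assumptions made for the example.

```python
# Operand sources per multiplexer, keyed by the 2-bit select code.
MULT_A1 = {0b01: "C register", 0b10: "ALU feedback", 0b11: "RA input register"}
MULT_B1 = {0b01: "C register", 0b10: "Multiplier feedback", 0b11: "RB input register"}
ALU_A2 = {0b01: "C register", 0b10: "Multiplier feedback", 0b11: "RA input register"}
ALU_B2 = {0b01: "C register", 0b10: "ALU feedback", 0b11: "RB input register"}


def decode_selop(selop):
    """Decode SELOP7-SELOP0 (bit 7 = SELOP7) into operand sources."""
    if not 0 <= selop <= 0xFF:
        raise ValueError("SELOP is an 8-bit field")
    fields = {
        "A1": (MULT_A1, (selop >> 6) & 0b11),   # SELOP7-SELOP6
        "B1": (MULT_B1, (selop >> 4) & 0b11),   # SELOP5-SELOP4
        "A2": (ALU_A2, (selop >> 2) & 0b11),    # SELOP3-SELOP2
        "B2": (ALU_B2, selop & 0b11),           # SELOP1-SELOP0
    }
    # Code 00 is listed as reserved for every multiplexer.
    return {name: table.get(code, "Reserved")
            for name, (table, code) in fields.items()}
```

With all eight bits high, every multiplexer selects its external input register, which is the usual setting when both units take fresh operands from RA and RB.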

## Pipelined ALU

The pipelined ALU contains a circuit for addition and/or subtraction of aligned operands, a pipeline register, an exponent adjuster and a normalizer/rounder. An exception circuit is provided to detect denormal inputs; these can be flushed to zero if the FAST input is set high. A denorm exception flag (DENORM) goes high when the ALU output is a denormal.

The ALU may be operated independently or in parallel with the multiplier. Possible ALU functions during independent operation are given in Tables 8 and 9. Parallel ALU/multiplier functions are listed in Table 13.

## Pipelined Multiplier

The pipelined multiplier performs a basic multiply function, $A * B$. The operands can be single-precision or double-precision numbers and can be converted to absolute values before multiplication takes place. Multiplier operations are summarized in Table 10.

An exception circuit is provided to detect denormalized inputs; these are indicated by a high on the DENIN signal.

The multiplier and ALU can be operated simultaneously by setting the I9 instruction input high. Possible operations in this chained mode are listed in Table 13.

## Product, Sum, and C Registers

The results of the ALU and multiplier operations may optionally be latched into two output registers on the rising edge of the system clock (CLK). The P (product) register holds the result of the multiplier operation; the S (sum) register holds the ALU result.

An additional 64-bit register is provided for temporary storage of the result of an ALU or multiplier operation before feedback to the multiplier or ALU. The data source for this C register is selected by SRCC; a high on this pin selects the multiplier result; a low selects the ALU. A separate clock, CLKC, has been provided for this register.

## Parity Generators

Even parity is generated for the Y multiplexer output, either for each byte or for each word of output, depending on the setting of BYTEP. When BYTEP is high, the parity generator computes four parity bits, one for each byte of Y multiplexer output. Parity bits are output on the PY3-PY0 pins; PY0 represents parity for the least significant byte. A single parity bit can also be generated for the entire output data word by setting BYTEP low. In this mode, PY0 is the parity output.

Table 8. Independent ALU Operations, Single Operand (I9 = 0, I6 = 0)

| CHAINED OPERATION I9 | PRECISION RA I8 | PRECISION RB I7 | OUTPUT SOURCE I6 | OPERAND TYPE I5 | ABSOLUTE VALUE A I4 | ALU OPERATION |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  |  | I3-I0 | RESULT |
| 0 = Not chained | 0 = A (SP) <br> 1 = A (DP) | 0 = B (SP) <br> 1 = B (DP) | 0 = ALU result | 1 = Single operand | 0 = A <br> 1 = $\|A\|$ | 0000 | Pass A operand |
|  |  |  |  |  |  | 0001 | Negate A operand |
|  |  |  |  |  |  | 0010 | Integer to floating point conversion ${ }^{\dagger}$ |
|  |  |  |  |  |  | 0011 | Floating point to integer conversion |
|  |  |  |  |  |  | 0100 | Undefined |
|  |  |  |  |  |  | 0101 | Undefined |
|  |  |  |  |  |  | 0110 | Floating point to floating point conversion ${ }^{\ddagger}$ |
|  |  |  |  |  |  | 0111 | Undefined |
|  |  |  |  |  |  | 1000 | Wrap (denormal) input operand |
|  |  |  |  |  |  | 1001 | Undefined |
|  |  |  |  |  |  | 1010 | Undefined |
|  |  |  |  |  |  | 1011 | Undefined |
|  |  |  |  |  |  | 1100 | Unwrap exact number |
|  |  |  |  |  |  | 1101 | Unwrap inexact number |
|  |  |  |  |  |  | 1110 | Unwrap rounded input |
|  |  |  |  |  |  | 1111 | Undefined |

[^18]

Table 9. Independent ALU Operations, Two Operands (I9 = 0, I5 = 0)

| CHAINED OPERATION I9 | PRECISION RA I8 | PRECISION RB I7 | OUTPUT SOURCE I6 | OPERAND TYPE I5 | ABSOLUTE VALUE A I4 | ABSOLUTE VALUE B I3 | ABSOLUTE VALUE Y I2 | ALU OPERATION |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  |  |  |  | I1-I0 | RESULT |
| 0 = Not chained | 0 = A (SP) <br> 1 = A (DP) | 0 = B (SP) <br> 1 = B (DP) | 0 = ALU result | 0 = Two operands | 0 = A <br> 1 = $\|A\|$ | 0 = B <br> 1 = $\|B\|$ | 0 = Y <br> 1 = $\|Y\|$ | 00 | A + B |
|  |  |  |  |  |  |  |  | 01 | A - B |
|  |  |  |  |  |  |  |  | 10 | Compare A, B |
|  |  |  |  |  |  |  |  | 11 | B - A |

Table 10. Independent Multiplier Operations (I9 = 0, I6 = 1)

| CHAINED OPERATION I9 | PRECISION RA I8 | PRECISION RB I7 | OUTPUT SOURCE I6 | I5 | ABSOLUTE VALUE A I4 ${ }^{\dagger}$ | ABSOLUTE VALUE B I3 ${ }^{\dagger}$ | NEGATE RESULT I2 ${ }^{\dagger}$ | WRAP A I1 | WRAP B I0 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0 = Not chained | 0 = A (SP) <br> 1 = A (DP) | 0 = B (SP) <br> 1 = B (DP) | 1 = Multiplier result | 0 | 0 = A <br> 1 = $\|A\|$ | 0 = B <br> 1 = $\|B\|$ | 0 = Y <br> 1 = -Y | 0 = Normal format <br> 1 = A is a wrapped number | 0 = Normal format <br> 1 = B is a wrapped number |

[^19]

Table 11. Independent Multiplier Operations Selected by I4-I2 (I9 = 0, I6 = 1)

| ABSOLUTE VALUE A I4 | ABSOLUTE VALUE B I3 | NEGATE RESULT I2 | OPERATION SELECTED |  |
| :---: | :---: | :---: | :---: | :---: |
|  |  |  | I4-I2 | RESULTS |
| 0 = A | 0 = B | 0 = Y | 000 | $A * B$ |
| 1 = $\|A\|$ | 1 = $\|B\|$ | 1 = -Y | 001 | $-(A * B)$ |
|  |  |  | 010 | $A * \|B\|$ |
|  |  |  | 011 | $-(A * \|B\|)$ |
|  |  |  | 100 | $\|A\| * B$ |
|  |  |  | 101 | $-(\|A\| * B)$ |
|  |  |  | 110 | $\|A\| * \|B\|$ |
|  |  |  | 111 | $-(\|A\| * \|B\|)$ |
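The field layout of Table 10 and the I4-I2 operation selection can be illustrated with a small Python sketch. The function names and the bit packing (I9 in bit 9 down to I0 in bit 0) are assumptions for the example, and ordinary Python floats stand in for the device's arithmetic.

```python
def encode_multiply(dp_a=False, dp_b=False, abs_a=False, abs_b=False,
                    negate=False, wrap_a=False, wrap_b=False):
    """Pack an independent multiply into a 10-bit word (bit 9 = I9 ... bit 0 = I0)."""
    word = 1 << 6                 # I6 = 1 selects the multiplier result; I9 = I5 = 0
    word |= int(dp_a) << 8        # I8: precision of RA
    word |= int(dp_b) << 7        # I7: precision of RB
    word |= int(abs_a) << 4       # I4: absolute value of A
    word |= int(abs_b) << 3       # I3: absolute value of B
    word |= int(negate) << 2      # I2: negate result
    word |= int(wrap_a) << 1      # I1: A is a wrapped number
    word |= int(wrap_b)           # I0: B is a wrapped number
    return word


def multiplier_result(a, b, i4, i3, i2):
    """Apply the I4-I2 selections to a product, as listed in Table 11."""
    if i4:
        a = abs(a)
    if i3:
        b = abs(b)
    y = a * b
    return -y if i2 else y
```

For instance, `encode_multiply(abs_a=True, negate=True)` sets I6, I4 and I2, which corresponds to the operation $-(|A| * B)$.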

Table 12. Operations Selected by I8-I7 (I9 = 0, I6 = 1)

| PRECISION SELECT RA I8 | PRECISION RA INPUT | PRECISION SELECT RB I7 | PRECISION RB INPUT | PRECISION OF RESULT |
| :---: | :---: | :---: | :---: | :---: |
| 0 | Single | 0 | Single | Single |
| 0 | Single converted to double | 1 | Double | Double |
| 1 | Double | 0 | Single converted to double | Double |
| 1 | Double | 1 | Double | Double |

## Master/Slave Comparator

A master/slave comparator is provided to compare data bytes from the $Y$ output multiplexer and the status outputs with data bytes on the external $Y$ and status ports when $\overline{\mathrm{OEY}}, \overline{\mathrm{OES}}$ and $\overline{\mathrm{OEC}}$ are high. If the data bytes are not equal, a high signal is generated on the master/slave error output pin (MSERR).

## Status and Exception Generator/Register

A status and exception generator produces several output signals to indicate invalid operations as well as overflow, underflow, nonnumerical and inexact results, in conformance with IEEE Standard 754-1985. If output registers are enabled (PIPES2 = 0), status and exception results are latched in a status register on the rising edge of the clock. Status results are valid at the same time that associated data results are valid. Status outputs are enabled by two signals, $\overline{\mathrm{OEC}}$ for comparison status and $\overline{\mathrm{OES}}$ for other status and exception outputs. Status outputs are summarized in Tables 14 and 15.

During a compare operation in the ALU, the AEQB output goes high when the A and B operands are equal. When any operation other than a compare is performed, either by the ALU or the multiplier, the AEQB signal is used as a zero detect.

Table 13. Chained Multiplier/ALU Operations (I9 = 1)

| CHAINED OPERATION I9 | PRECISION RA I8 | PRECISION RB I7 | OUTPUT SOURCE I6 | ADD ZERO I5 | MULTIPLY BY ONE I4 | NEGATE ALU RESULT I3 | NEGATE MULTIPLIER RESULT I2 | ALU OPERATIONS |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  |  |  |  |  |  |  | I1-I0 | RESULT |
| 1 = Chained | 0 = A (SP) <br> 1 = A (DP) | 0 = B (SP) <br> 1 = B (DP) | 0 = ALU result <br> 1 = Multiplier result | 0 = Normal operation <br> 1 = Forces B2 input of ALU to zero | 0 = Normal operation <br> 1 = Forces B1 input of multiplier to one | 0 = Normal operation <br> 1 = Negate ALU result | 0 = Normal operation <br> 1 = Negate multiplier result | 00 | A + B |
|  |  |  |  |  |  |  |  | 01 | A - B |
|  |  |  |  |  |  |  |  | 10 | 2 - A |
|  |  |  |  |  |  |  |  | 11 | B - A |
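A behavioral sketch of the chained-mode fields in Table 13 is given below in Python. It is an illustration only: the function name and argument order are hypothetical, Python floats replace the device's formats, and the point at which negation is applied relative to rounding is not modeled. Which registers actually feed A1, B1, A2 and B2 is determined by the SELOP inputs.

```python
def chained(a1, b1, a2, b2, i5=0, i4=0, i3=0, i2=0, alu_op=0b00):
    """Return (multiplier result, ALU result) for one chained operation.

    a1, b1 -- multiplier operands; a2, b2 -- ALU operands
    i5 -- 1 forces the ALU B2 input to zero
    i4 -- 1 forces the multiplier B1 input to one
    i3 -- 1 negates the ALU result
    i2 -- 1 negates the multiplier result
    alu_op -- I1-I0: 00 = A + B, 01 = A - B, 10 = 2 - A, 11 = B - A
    """
    if i4:
        b1 = 1.0
    if i5:
        b2 = 0.0
    product = a1 * b1
    if i2:
        product = -product
    alu = {0b00: a2 + b2, 0b01: a2 - b2, 0b10: 2.0 - a2, 0b11: b2 - a2}[alu_op]
    if i3:
        alu = -alu
    return product, alu
```

Setting `i5=1` and `i4=1` with `alu_op=0b00` passes the A operands through both units unchanged, which is the pass-through described for loading a value by adding zero or multiplying by one.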

Table 14. Comparison Status Outputs

| SIGNAL | RESULT OF COMPARISON (ACTIVE HIGH) |
| :---: | :--- |
| AEQB | The A and B operands are equal. (A high signal on the AEQB output indicates a <br> zero result from the selected source except during a compare operation in the <br> ALU.) |
| AGTB | The A operand is greater than the B operand. (Only during a compare operation <br> in the ALU) |
| UNORD | The two inputs of a comparison operation are unordered, i.e., one or both of <br> the inputs is a NaN. |

Table 15. Status Outputs

| SIGNAL | STATUS RESULT |
| :---: | :--- |
| CHEX | If I6 is low, indicates the multiplier is the source of an exception during a chained function. If I6 is high, indicates the ALU is the source of an exception during a chained function. |
| DENIN | Input to the multiplier is a denorm. When DENIN goes high, the STEX pins indicate which port had a denormal input. |
| DENORM | The multiplier output is a wrapped number or the ALU output is a denorm. In the FAST mode, this condition causes the result to go to zero. |
| INEX | The result of an operation is not exact. |
| IVAL | A NaN has been input to the multiplier or the ALU, or an invalid operation ($0 * \infty$ or $\pm \infty \mp \infty$) has been requested. When IVAL goes high, the STEX pins indicate which port had a NaN. |
| OVER | The result is greater than the largest allowable value for the specified format. |
| RNDCO | The mantissa of a wrapped number has been increased in magnitude by rounding and the unwrap round instruction can be used to unwrap properly the wrapped number (see Table 8). |
| SRCEX | The status was generated by the multiplier. (When SRCEX is low, the status was generated by the ALU.) |
| STEX0 | A NaN or a denorm has been input on the B port. |
| STEX1 | A NaN or a denorm has been input on the A port. |
| UNDER | The result is inexact and less than the minimum allowable value for the specified format. In the FAST mode, this condition causes the result to go to zero. |

In chained mode, status results to be output are selected based on the state of the I6 (source output) pin (if I6 is low, ALU status will be selected; if I6 is high, multiplier status will be selected). If the nonselected output source generates an exception, CHEX is set high. Status of the nonselected output source can be forced using the SELST pins, as shown in Table 16.

Table 16. Status Output Selection (Chain Mode)

| SELST1-SELST0 | STATUS SELECTED |
| :---: | :--- |
| 00 |  |
| 01 | Selects multiplier status |
| 10 | Selects ALU status |
| 11 | Normal operation (selection based on result source specified by I6 input) |
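The status selection rules for chained mode can be summarized in a short Python sketch. The function name and the use of Python sets to hold exception names are assumptions for illustration; the SELST code 00 is rejected because its meaning is not given in Table 16 as reproduced here.

```python
def select_status(i6, selst, alu_exceptions, mult_exceptions):
    """Return (status reported, CHEX) for a chained operation.

    i6    -- output source select: 0 = ALU result, 1 = multiplier result
    selst -- SELST1-SELST0 as a 2-bit integer
    alu_exceptions, mult_exceptions -- sets of exception names per unit
    """
    selected_by_i6 = mult_exceptions if i6 else alu_exceptions
    not_selected = alu_exceptions if i6 else mult_exceptions
    # CHEX flags an exception in the source NOT selected for output by I6.
    chex = int(bool(not_selected))
    if selst == 0b11:
        status = selected_by_i6          # normal operation
    elif selst == 0b01:
        status = mult_exceptions         # force multiplier status
    elif selst == 0b10:
        status = alu_exceptions          # force ALU status
    else:
        raise ValueError("SELST1-SELST0 = 00 is not described in Table 16")
    return status, chex
```

A typical use is to run with `selst=0b11`, and when CHEX comes back high, repeat the status read with the other source forced to find out which exception occurred.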

## Flowthrough Mode

To enable the device to operate in pipelined or flowthrough modes, registers can be bypassed using pipeline control signals PIPES2-PIPES0 (see Table 17).

Table 17. Pipeline Controls (PIPES2-PIPES0)

| PIPES2-PIPES0 | REGISTER OPERATION SELECTED |
| :--- | :--- |
| $\times$ | $\times$ |

## FAST and IEEE Modes

The device can be programmed to operate in FAST mode by asserting the FAST pin. In the FAST mode, all denormalized inputs and outputs are forced to zero.

Placing a zero on the FAST pin causes the chip to operate in IEEE mode. In this mode, the ALU can operate on denormalized inputs and return denormals. If a denorm is input to the multiplier, the DENIN flag will be asserted, and the result will be invalid. If the multiplier result underflows, a wrapped number will be output.
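To make the notion of a wrapped number more concrete, the Python sketch below normalizes a single-precision denormal and stores the resulting negative exponent in 8-bit two's complement, which reproduces the exponent values EA and 00 listed for wrapped numbers in Table 21. This is an interpretation for illustration, not a description of the device's internal circuit; the function names are hypothetical.

```python
import math
import struct


def wrap_single(x):
    """Normalize a single-precision denormal: return (sign, exponent field, fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent != 0 or fraction == 0:
        raise ValueError("not a denormalized single-precision number")
    shift = 24 - fraction.bit_length()      # shifts needed to set the hidden bit
    mantissa = fraction << shift            # bit 23 is now the hidden 1
    wrapped_exponent = (1 - shift) & 0xFF   # two's complement, 0 down to -22
    return sign, wrapped_exponent, mantissa & 0x7FFFFF


def wrapped_value(sign, exponent_field, fraction):
    """Evaluate (-1)^s * 2^(e - 127) * (1 + f / 2^23) with e read as signed."""
    e = exponent_field - 256 if exponent_field >= 0x80 else exponent_field
    magnitude = math.ldexp(1.0 + fraction / 2.0**23, e - 127)
    return -magnitude if sign else magnitude
```

The smallest denormal, `2.0**-149`, wraps to exponent field `0xEA` (that is, -22) with a zero fraction, and `wrapped_value` recovers the original number exactly.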

## Rounding Mode

The 'ACT8837 supports the four IEEE standard rounding modes: round to nearest, round towards zero (truncate), round towards infinity (round up), and round towards minus infinity (round down). The rounding function is selected by control pins RND1 and RND0, as shown in Table 18.

Table 18. Rounding Modes

| RND1-RND0 | ROUNDING MODE SELECTED |
| :---: | :---: |
| 00 | Round towards nearest |
| 01 | Round towards zero (truncate) |
| 10 | Round towards infinity (round up) |
| 11 | Round towards negative infinity (round down) |

## Test Pins

Two pins, TP1-TP0, support system testing. These may be used, for example, to place all outputs in a high-impedance state, isolating the chip from the rest of the system (see Table 19).

Table 19. Test Pin Control Inputs

## Summary of Control Inputs

Control input signals for the 'ACT8837 are summarized in Table 20.

Table 20. Control Inputs

| SIGNAL | HIGH | LOW |
| :---: | :---: | :---: |
| BYTEP | Selects byte parity generation and test | Selects single bit parity generation and test |
| CLK | Clocks all registers except C | No effect |
| CLKC | Clocks C register | No effect |
| CLKMODE | Enables temporary input register load on falling clock edge | Enables temporary input register load on rising clock edge |
| CONFIG1-CONFIG0 | See Table 3 (RA and RB register data source selects) | See Table 3 (RA and RB register data source selects) |
| ENRA | If register is not in flow through, enables clocking of RA register | If register is not in flow through, holds contents of RA register |
| ENRB | If register is not in flow through, enables clocking of RB register | If register is not in flow through, holds contents of RB register |
| FAST | Places device in FAST mode | Places device in IEEE mode |
| $\overline{\text { HALT }}$ | No effect | Stalls device operation but does not affect registers, internal states, or status |
| $\overline{\text { OEC }}$ | Disables compare pins | Enables compare pins |
| $\overline{\text { OES }}$ | Disables status outputs | Enables status outputs |
| $\overline{O E Y}$ | Disables Y bus | Enables Y bus |
| PIPES2-PIPES0 | See Table 17 (pipeline mode control) | See Table 17 (pipeline mode control) |
| RESET | No effect | Clears internal states and status but does not affect data registers |
| RND1-RND0 | See Table 18 (rounding mode control) | See Table 18 (rounding mode control) |
| SELOP7-SELOP0 | See Tables 6 and 7 (multiplier/ALU operand selection) | See Tables 6 and 7 (multiplier/ALU operand selection) |
| SELMS/LS | Selects MSH of 64-bit result for output on the Y bus | Selects LSH of 64-bit result for output on the Y bus (no effect during single precision operation) |
| SELST1-SELST0 | See Table 16 (status output selection) | See Table 16 (status output selection) |
| SRCC | Selects multiplier result for input to C register | Selects ALU result for input to C register |
| TP1-TP0 | See Table 19 (test pin control inputs) | See Table 19 (test pin control inputs) |

## INSTRUCTION SET

Configuration and operation of the 'ACT8837 can be selected to perform single- or double-precision floating-point calculations in operating modes ranging from flowthrough to fully pipelined. Timing and sequences of operations are affected by settings of clock mode, data and status registers, input data configurations, and rounding mode, as well as the instruction inputs controlling the ALU and the multiplier. The ALU and the multiplier of the 'ACT8837 can operate either independently or simultaneously, depending on the setting of instruction inputs I9-I0 and related controls.

Controls for data flow and status results are discussed separately, prior to the discussions of ALU and multiplier operations. Then, in Tables 22 through 25, the instruction inputs to the ALU and the multiplier are summarized according to operating mode, whether independent or chained (ALU and multiplier in simultaneous operation).

## Loading External Data Operands

Patterns of data input to the 'ACT8837 vary depending on the precision of the operands and whether they are being input as A or B operands. Loading of external data operands is controlled by the settings of CLKMODE and CONFIG1-CONFIG0, which determine the clock timing and register destinations for data inputs.

## Configuration Controls (CONFIG1-CONFIG0)

Three input registers are provided to handle input of data operands, either single precision or double precision. The RA, RB, and temporary registers are each 64 bits wide. The temporary register is only used during input of double-precision operands.

When single-precision or integer operands are loaded, the ordinary setting of CONFIG1-CONFIG0 is LH, as shown in Table 4. This setting loads each 32-bit operand in the most significant half (MSH) of its respective register. The operands are loaded into the MSHs and adjusted to double precision because the data paths internal to the device are all double precision. It is also possible to load single-precision operands with CONFIG1-CONFIG0 set to HH, but two clock edges are required to load both the A and B operands on the DA bus.

Double-precision operands are loaded by using the temporary register to store half of the operands prior to inputting the other half of the operands on the DA and DB buses. As shown in Tables 3 and 5, four configuration modes for selecting input sources are available for loading data operands into the RA and RB registers.

## CLKMODE Settings

Timing of double-precision data inputs is determined by the clock mode setting, which allows the temporary register to be loaded on either the rising edge (CLKMODE = L) or the falling edge of the clock (CLKMODE = H). Since the temporary register is not used when single-precision operands are input, clock modes 0 and 1 are functionally equivalent for single-precision operations.

The setting of CLKMODE can be used to speed up the loading of double-precision operands. When the CLKMODE input is set high, data on the DA and DB buses are loaded on the falling edge of the clock into the MSH and LSH, respectively, of the temporary register. On the next rising edge, contents of the DA bus, DB bus, and temporary register are loaded into the RA and RB registers, and execution of the current instruction begins. The setting of CONFIG1-CONFIG0 determines the exact pattern in which operands are loaded, whether as MSH or LSH in RA or RB.

Double-precision operation in clock mode 0 is similar except that the temporary register loads only on a rising edge. For this reason the RA and RB registers do not load until the next rising edge, when all operands are available and execution can begin.

A considerable advantage in speed can be realized by performing double-precision ALU operations with CLKMODE set high. In this clock mode both double-precision operands can be loaded on successive clock edges, one falling and one rising, and the ALU operation can be executed in the time from one rising edge of the clock to the next rising edge. Both halves of a double-precision ALU result must be read out on the $Y$ bus within one clock cycle when the 'ACT8837 is operated in clock mode 1.

## Internal Register Operations

Six data registers in the 'ACT8837 are arranged in three levels along the data paths through the multiplier and the ALU. Each level of registers can be enabled or disabled independently of the other two levels by setting the appropriate PIPES2-PIPES0 inputs.

The RA and RB registers receive data inputs from the temporary register and the DA and DB buses. Data operands are then multiplexed into the multiplier, ALU, or both. To support simultaneous pipelined operations, the data paths through the multiplier and the ALU are both provided with pipeline registers and output registers. The control settings for the pipeline and output registers (PIPES2-PIPES1) are registered with the instruction inputs I9-I0.

A seventh register, the constant (C) register, is available for storing a 64-bit constant or an intermediate result from the multiplier or the ALU. The C register has a separate clock input (CLKC) and input source select (SRCC). The SRCC input is not registered with the instruction inputs. Depending on the operation selected and the settings of PIPES2-PIPES0, an offset of one or more cycles may be necessary to load the desired result into the C register.

Status results are also registered whenever the output registers are enabled. Duration and availability of status results are affected by the same timing constraints that apply to data results on the Y output bus.

## Data Register Controls (PIPES2-PIPES0)

Table 17 shows the settings of the registers controlled by PIPES2-PIPES0. Operating modes range from fully pipelined (PIPES2-PIPES0 = LLL) to flowthrough (PIPES2-PIPES0 = HHH).

In flowthrough mode all three levels of registers are disabled, a circumstance which may affect some double-precision operations. Since double-precision operands require two steps to input, at least half of the data must be clocked into the temporary register before the remaining data is placed on the DA and DB buses.

When all registers (except the $C$ register) are enabled, timing constraints can become critical for many double-precision operations. In clock mode 1, the ALU can perform a double-precision operation and output a result during every clock cycle, and both halves of the result must be read out before the end of the next cycle. Status outputs are valid only for the period during which the Y output data is valid.

Similarly, double-precision multiplication is affected by pipelining, clock mode, and sequence of operations. A double-precision multiply may require two cycles to execute, depending on the settings of PIPES2-PIPES0. The output may be valid for one or two cycles, depending on the precision of the next operation.

Duration of valid outputs at the Y multiplexer depends on settings of PIPES2-PIPES0 and CLKMODE, as well as whether all operations and operands are of the same type. For example, when a double-precision multiply is followed by a single-precision operation, one open clock cycle must intervene between the dissimilar operations.

## C Register Controls (SRCC, CLKC)

The C register loads from the P or the S register output, depending on the setting of SRCC, the load source select. SRCC = H selects the multiplier as input source; SRCC = L selects the ALU. In either case the C register only loads the selected input on a rising edge of the CLKC signal.

The C register does not load directly from an external data bus. One method for loading a constant without wasting a cycle is to input the value as an A operand during an operation which uses only the ALU or multiplier and requires no external data inputs. Since the B operand can be forced to zero in the ALU or to one in the multiplier, the A operand can be passed to the $C$ register either by adding zero or multiplying by one, then selecting the input source with SRCC and causing the CLKC signal to go high. Otherwise, the C register can be loaded through the ALU with the Pass A Operand instruction, which requires a separate cycle.

## Operand Selection (SELOP7-SELOP0)

As shown in Tables 6 and 7, data operands can be selected from five possible sources, including external inputs from the RA and RB registers, feedback from the P and S registers, and a stored value in the C register. Contents of the C register may be selected as either the A or the B operand in the ALU, the multiplier, or both. When an external input is selected, the RA input always becomes the A operand, and the RB input is the B operand.

Feedback from the ALU can be selected as the A operand to the multiplier or as the $B$ operand to the ALU. Similarly, multiplier feedback may be used as the A operand to the ALU or the B operand to the multiplier.

Selection of operands also interacts with the selected operations in the ALU or the multiplier. ALU operations with one operand are performed only on the A operand. Also, depending on the instruction selected, the B operand may optionally be forced to zero in the ALU or to one in the multiplier.

## Rounding Controls (RND1-RND0)

Because floating point operations may involve both inherent and procedural errors, it is important to select appropriate modes for handling rounding errors. To support the IEEE standard for binary floating-point arithmetic, the 'ACT8837 provides four rounding modes selected by RND1-RND0.

Table 18 shows the four selectable rounding modes. The usual default rounding mode is round to nearest (RND1-RND0 = LL). In round-to-nearest mode, the 'ACT8837 supports the IEEE standard by rounding to even (LSB = 0) when two nearest representable values are equally near. Directed rounding toward zero, infinity, or minus infinity is also available.

Rounding mode should be selected to minimize procedural errors which may otherwise accumulate and affect the accuracy of results. Rounding to nearest introduces a procedural error not exceeding half of the least significant bit for each rounding operation. Since rounding to nearest may involve rounding either upward or downward in successive steps, rounding errors tend to cancel each other.

In contrast, directed rounding modes may introduce errors approaching one bit for each rounding operation. Since successive rounding operations in a procedure may all be similarly directed, each introducing up to a one-bit error, rounding errors may accumulate rapidly, especially in single-precision operations.
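The difference between nearest and directed rounding can be illustrated numerically. The sketch below uses Python's `decimal` module with a 4-digit decimal significand rather than the binary formats of the 'ACT8837, so it shows only the qualitative behavior; the term sequence and the helper names are made up for the example.

```python
from decimal import (Context, Decimal, ROUND_CEILING, ROUND_DOWN,
                     ROUND_FLOOR, ROUND_HALF_EVEN)
from fractions import Fraction

# Decimal counterparts of the four rounding modes in Table 18.
MODES = {
    "nearest": ROUND_HALF_EVEN,
    "toward zero": ROUND_DOWN,
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
}


def terms(count):
    """Deterministic, positive, irregular terms between 0 and 1."""
    return [Decimal(k * 7919 % 10007) / Decimal(10007)
            for k in range(1, count + 1)]


def accumulate(values, rounding, digits=4):
    """Sum the values, rounding every partial sum to `digits` digits."""
    ctx = Context(prec=digits, rounding=rounding)
    total = Decimal(0)
    for value in values:
        total = ctx.add(total, value)   # exact add, then one rounding step
    return total


values = terms(1000)
exact = sum(Fraction(v) for v in values)
errors = {name: Fraction(accumulate(values, mode)) - exact
          for name, mode in MODES.items()}
```

Printing `float(errors[name])` for each mode shows the directed modes drifting consistently to one side of the exact sum, while the round-to-nearest total stays between the two directed results.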

## Status Exceptions

Status exceptions can result from one or more error conditions such as overflow, underflow, operands in illegal formats, invalid operations, or rounding. Exceptions may be grouped into two classes: input exceptions resulting from invalid operations or denormal inputs to the multiplier, and output exceptions resulting from illegal formats, rounding errors, or both.

To simplify the discussion of exception handling, it is useful to summarize the data formats for representing IEEE floating-point numbers which can be input to or output from the FPU (see Table 21). Since procedures for handling exceptions vary according to the requirements of specific applications, this discussion focuses on the conditions which cause particular status exceptions to be signalled by the FPU.

Table 21. IEEE Floating-Point Representations

| TYPE OF OPERAND | EXPONENT (e) |  | FRACTION (f) (BINARY) | HIDDEN BIT | VALUE OF NUMBER REPRESENTED |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  | SP (HEX) | DP (HEX) |  |  | SP (DECIMAL) ${ }^{\dagger}$ | DP (DECIMAL) ${ }^{\dagger}$ |
| Normalized Number (max) | FE | 7FE | All 1's | 1 | $(-1)^{s}\left(2^{127}\right)\left(2-2^{-23}\right)$ | $(-1)^{s}\left(2^{1023}\right)\left(2-2^{-52}\right)$ |
| Normalized Number (min) | 01 | 001 | All 0's | 1 | $(-1)^{s}\left(2^{-126}\right)(1)$ | $(-1)^{s}\left(2^{-1022}\right)(1)$ |
| Denormalized Number (max) | 00 | 000 | All 1's | 0 | $(-1)^{s}\left(2^{-126}\right)\left(1-2^{-23}\right)$ | $(-1)^{s}\left(2^{-1022}\right)\left(1-2^{-52}\right)$ |
| Denormalized Number (min) | 00 | 000 | 000... 001 | 0 | $(-1)^{s}\left(2^{-126}\right)\left(2^{-23}\right)$ | $(-1)^{s}\left(2^{-1022}\right)\left(2^{-52}\right)$ |
| Wrapped Number (max) | 00 | 000 | All 1's | 1 | $(-1)^{s}\left(2^{-127}\right)\left(2-2^{-23}\right)$ | $(-1)^{s}\left(2^{-1023}\right)\left(2-2^{-52}\right)$ |
| Wrapped Number (min) | EA | 7CD | All 0's | 1 | $(-1)^{s}\left(2^{-(22+127)}\right)(1)$ | $(-1)^{s}\left(2^{-(51+1023)}\right)(1)$ |
| Zero | 00 | 000 | Zero | 0 | $(-1)^{s}(0.0)$ | $(-1)^{s}(0.0)$ |
| Infinity | FF | 7FF | Zero | 1 | $(-1)^{s}$ (infinity) | $(-1)^{s}$ (infinity) |
| NaN (Not a Number) | FF | 7FF | Nonzero | N/A | None | None |

${ }^{\dagger}$ s = sign bit

IEEE formats for floating-point operands, both single and double precision, consist of three fields: sign, exponent, and fraction, in that order. The leftmost (most significant) bit is the sign bit. The exponent field is eight bits long in single-precision operands and 11 bits long in double-precision operands. The fraction field is 23 bits in single precision and 52 bits in double precision. Further details of IEEE formats and exceptions are provided in the IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-1985.
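The field layout described above can be sketched in a few lines of Python. The function names are hypothetical and chosen only for this illustration. The classification follows the exponent and fraction columns of Table 21; wrapped numbers are not classified because their bit patterns cannot be told apart from other operands without knowing the operand format.

```python
import struct

# (exponent bits, fraction bits) for each precision, as described above.
FORMATS = {"SP": (8, 23), "DP": (11, 52)}

def split_fields(bits, precision="SP"):
    """Return (sign, exponent, fraction) from an IEEE bit pattern."""
    exp_bits, frac_bits = FORMATS[precision]
    fraction = bits & ((1 << frac_bits) - 1)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    sign = (bits >> (exp_bits + frac_bits)) & 1
    return sign, exponent, fraction

def classify(bits, precision="SP"):
    """Classify an operand by its exponent and fraction fields."""
    exp_bits, _ = FORMATS[precision]
    _, exponent, fraction = split_fields(bits, precision)
    if exponent == (1 << exp_bits) - 1:          # exponent all 1's
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0:                            # exponent all 0's
        return "zero" if fraction == 0 else "denormalized"
    return "normalized"

def float_to_bits(x):
    """Single-precision bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

print(split_fields(float_to_bits(-2.0)))   # (1, 128, 0)
print(classify(0x00000001))                # denormalized
```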

Several status exceptions are generated by illegal data or instruction inputs to the FPU. Input exceptions may cause the following signals to be set high: IVAL, DENIN, and STEX1-STEX0. If the IVAL flag is set, either an invalid operation has been requested or a NaN (Not a Number) has been input. When DENIN is set, a denormalized number has been input to the multiplier. STEX1-STEX0 indicate which port (RA, RB, or both) is the source of the exception when either a denormal is input to the multiplier (DENIN = H) or a NaN (IVAL = H) is input to the multiplier or the ALU.

NaN inputs are all treated as IEEE signaling NaNs, causing the IVAL flag to be set. When output from the FPU, the fraction field from a NaN is set high (all 1's), regardless of the original fraction field of the input NaN.

Output exception signals are provided to indicate both the source and type of the exception. DENORM, INEX, OVER, UNDER, and RNDCO indicate the exception type, and CHEX and SRCEX indicate the source of an exception. SRCEX indicates the source of a result as selected by instruction bit I6, and SRCEX is active whenever a result is output, not only when an exception is being signaled. The chained-mode exception signal CHEX indicates that an exception has been generated by the source not selected for output by I6. The exception type signaled by CHEX cannot be read unless status select controls SELST1-SELST0 are used to force status output from the deselected source.

Output exceptions may be due either to a result in an illegal format or to a procedural error. Results too large or too small to be represented in the selected precision are signalled by OVER and UNDER. Any ALU output which has been increased in magnitude by rounding causes INEX to be set high. DENORM is set when the multiplier output is wrapped or the ALU output is denormalized. Wrapped outputs from the multiplier may be inexact or increased in magnitude by rounding, which may cause the INEX and RNDCO status signals to be set high. A denormal output from the ALU (DENORM $=H$ ) may also cause INEX to be set, in which case UNDER is also signalled.

## Handling of Denormalized Numbers (FAST)

The FAST input selects the mode for handling denormalized inputs and outputs. When the FAST input is set low, the ALU accepts denormalized inputs but the multiplier generates an exception when a denormal is input. When FAST is set high, the DENIN status exception is disabled and all denormalized numbers, both inputs and results, are forced to zero.

A denormalized input has the form of a floating-point number with a zero exponent, a nonzero mantissa, and a zero in the leftmost bit of the mantissa (hidden or implicit bit). A denormalized number results from decrementing the biased exponent field to zero before normalization is complete. Since a denormalized number cannot be input to the multiplier, it must first be converted to a wrapped number by the ALU. When the mantissa of the denormal is normalized by shifting it left, the exponent field decrements from all zeros (wraps past zero) to a negative two's complement number (except in the case of 1XXX..., where the exponent is not decremented).
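The wrapping arithmetic can be sketched for single precision as follows. This is an illustration derived from the value formulas in Table 21, not a description of the ALU's internal implementation, and the function names are hypothetical.

```python
from fractions import Fraction

FRACTION_BITS = 23   # single precision
BIAS = 127

def wrap_denormal(fraction):
    """Normalize a single-precision denormal fraction into wrapped form.

    Returns (exponent_field, wrapped_fraction): an 8-bit two's complement
    exponent and the fraction that remains once the hidden bit is 1.
    """
    if not 0 < fraction < (1 << FRACTION_BITS):
        raise ValueError("a denormal has a nonzero 23-bit fraction")
    # Left shift that moves the leading 1 into the hidden-bit position.
    shift = FRACTION_BITS - fraction.bit_length() + 1
    exponent = -(shift - 1)          # stays 0 when the fraction is 1XXX...
    wrapped_fraction = (fraction << shift) & ((1 << FRACTION_BITS) - 1)
    return exponent & 0xFF, wrapped_fraction

def denormal_value(fraction):
    """Magnitude of a denormal: 2^-126 * 0.f (hidden bit 0)."""
    return Fraction(fraction, 1 << FRACTION_BITS) * Fraction(1, 2 ** 126)

def wrapped_value(exponent_field, fraction):
    """Magnitude of a wrapped number: 2^(e - 127) * 1.f, e in two's complement."""
    exponent = exponent_field - 256 if exponent_field >= 0x80 else exponent_field
    return (1 + Fraction(fraction, 1 << FRACTION_BITS)) * Fraction(2) ** (exponent - BIAS)

print(wrap_denormal(1))          # (234, 0), i.e. exponent EA hex, fraction all 0's
print(wrap_denormal(0x400000))   # (0, 0)
```

The minimum denormal wraps to exponent EA with a zero fraction, which agrees with the "Wrapped Number (min)" row of Table 21.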

Exponent underflow is possible during multiplication of small operands even when the operands are not wrapped numbers. Setting FAST $=\mathrm{L}$ selects gradual underflow so that denormal inputs can be wrapped and wrapped results are not automatically discarded. When FAST is set high, denormal inputs and wrapped results are forced to zero immediately.

When the multiplier is in IEEE mode and produces a wrapped number as its result, the result may be passed to the ALU and unwrapped. If the wrapped number can be unwrapped to an exact denormal, it can be output without causing the underflow status flag (UNDER) to be set. UNDER goes high when a result is an inexact denormal, and a zero is output from the FPU if the wrapped result is too small to represent as a denormal (smaller than the minimum denorm). Table 22 describes the handling of wrapped multiplier results and the status flags that are set when wrapped numbers are output from the multiplier.

Table 22. Handling Wrapped Multiplier Outputs

| TYPE OF RESULT | DENORM (STATUS FLAG) | INEX (STATUS FLAG) | RNDCO (STATUS FLAG) | UNDER (STATUS FLAG) | NOTES |
| :--- | :---: | :---: | :---: | :---: | :--- |
| Wrapped, exact | 1 | 0 | 0 | 0 | Unwrap with 'Wrapped exact' ALU instruction |
| Wrapped, inexact <br> Wrapped, increased in magnitude by rounding | 1 | 1 | 0 | 1 | Unwrap with 'Wrapped inexact' ALU instruction |

When operating in chained mode, the multiplier may output a wrapped result to the ALU during the same clock cycle that the multiplier status is output. In such a case the ALU cannot unwrap the operand prior to using it, for example, when accumulating the results of previous multiplications. To avoid this situation, the FPU can be operated in FAST mode to simplify exception handling during chained operations. Otherwise, wrapped outputs from the multiplier may adversely affect the accuracy of the chained operation, because a wrapped number may appear to be a large normalized number instead of a very small denormalized number.
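The size of the possible misinterpretation can be shown with a short sketch. It reads the same 8-bit exponent field two ways, using the single-precision bias of 127 and the wrapped-number formulas of Table 21. The helper names are hypothetical.

```python
from fractions import Fraction

BIAS = 127  # single-precision exponent bias

def as_normalized(exponent_field):
    """Magnitude of 1.0 * 2^(e - bias), reading the field as an ordinary biased exponent."""
    return Fraction(2) ** (exponent_field - BIAS)

def as_wrapped(exponent_field):
    """Magnitude when the same field is read as a two's complement wrapped exponent."""
    signed = exponent_field - 256 if exponent_field >= 0x80 else exponent_field
    return Fraction(2) ** (signed - BIAS)

field = 0xEA  # exponent of the minimum wrapped number in Table 21
print(as_normalized(field) == 2 ** 107)            # True
print(as_wrapped(field) == Fraction(1, 2 ** 149))  # True
```

A wrapped operand that should represent about $2^{-149}$ would therefore enter an accumulation as about $2^{107}$ if it were used without being unwrapped.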

Because of the latency associated with interpreting the FPU status outputs and determining how to process the wrapped output, it is necessary that a wrapped operand be stored external to the FPU (for example, in an external register file) and reloaded to the A port of the ALU for unwrapping and further processing.

## Data Output Controls (SELMS/$\overline{\mathrm{LS}}$, $\overline{\mathrm{OEY}}$)

Selection and duration of results from the Y output multiplexer may be affected by several factors, including the operation selected, precision of the operands, registers enabled, and the next operation to be performed. The data output controls are not registered with the data and instruction inputs. When the device is microprogrammed, the effects of pipelining and sequencing of operations should be taken into account.

Two particular conditions need to be considered. Depending on which registers are enabled, an offset of one or more cycles must be allowed before a valid result is available at the Y output multiplexer. Also, certain sequences of operations may require both halves of a double-precision result to be read out within a single clock cycle. This is done by toggling the SELMS/ $\overline{L S}$ signal in the middle of the clock period.

When a single-precision result is output, the SELMS/ $\overline{L S}$ signal has no effect. The SELMS/ $\overline{L S}$ signal is set low only to read out the LSH of a double-precision result. Whenever this signal is selecting a valid result for output on the $Y$ bus, the $\overline{O E Y}$ enable must be pulled low at the beginning of that clock cycle.
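The selection between the two halves of a result can be sketched as below. The function is a hypothetical illustration of the multiplexing described above, not device code. It assumes the output is already enabled and models only the choice of word.

```python
def y_bus_word(result, selms_ls, double_precision=True):
    """Return the 32-bit word selected for the Y bus.

    result   -- 64-bit pattern (double precision) or 32-bit pattern (single)
    selms_ls -- 1 selects the most significant half, 0 the least significant half
    """
    if not double_precision:
        return result & 0xFFFFFFFF           # SELMS/LS has no effect
    if selms_ls:
        return (result >> 32) & 0xFFFFFFFF   # MSH
    return result & 0xFFFFFFFF               # LSH

one_dp = 0x3FF0000000000000                  # 1.0 in double precision
print(hex(y_bus_word(one_dp, 1)), hex(y_bus_word(one_dp, 0)))  # 0x3ff00000 0x0
```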

## Status Output Controls (SELST1-SELST0, $\overline{\mathrm{OES}}, \overline{\mathrm{OEC}}$ )

Ordinarily, SELST1-SELST0 are set high so that status selection defaults to the output source selected by instruction input I6. The ALU is selected as the output source when I6 is low, and the multiplier when I6 is high.

When the device operates in chained mode, it may be necessary to read the status results not associated with the output source. As shown in Table 16, SELST1-SELST0 can be used to read the status of either the ALU or the multiplier regardless of the I6 setting.

Status results are registered only when the output ( P and S ) registers are enabled (PIPES2 $=\mathrm{L}$ ). Otherwise, the status register is transparent. In either case, status outputs can be read by pulling the output enables low ( $\overline{\mathrm{OES}}, \overline{\mathrm{OEC}}$, or both).

## Stalling the Device (HALT)

Operation of the 'ACT8837 can be stalled nondestructively by means of the $\overline{\mathrm{HALT}}$ signal. Pulling the $\overline{\mathrm{HALT}}$ input low causes the device to stall on the next low level of the clock. Register contents are unaltered when the device is stalled, and normal operation resumes at the next low clock period after the $\overline{\mathrm{HALT}}$ signal is set high. Using $\overline{\mathrm{HALT}}$ in microprograms can save power, especially at high clock frequencies and with pipelined stages.

For some operations, such as a double-precision multiply with CLKMODE = 1, setting the $\overline{\mathrm{HALT}}$ input low may interrupt loading of the RA, RB, and instruction registers, as well as stalling operation. In clock mode 1, the temporary register loads on the falling edge of the clock, but the $\overline{\mathrm{HALT}}$ signal going low would prevent the RA, RB, and instruction registers from loading on the next rising clock edge. It is therefore necessary to have the instruction and data inputs on the pins when the $\overline{\mathrm{HALT}}$ signal is set high again and normal operation resumes.

## Instruction Inputs (I9-I0)

Three modes of operation can be selected with inputs I9-I0, including independent ALU operation, independent multiplier operation, or simultaneous (chained) operation of ALU and multiplier. Each operating mode is treated separately in the following sections.

## Independent ALU Operations

The ALU executes single- and double-precision operations which can be divided according to the number of operands involved, one or two. The ALU accepts integer, normalized, and denormalized numbers as operands. Table 23 shows independent ALU operations with one operand, along with the inputs I9-I0 which select each operation. Conversions from one format to another are handled in this mode, with the exception of adjustments to precision during two-operand ALU operations. Wrapping and unwrapping of operands is also done in this mode.

Table 24 presents independent ALU operations with two operands. When the operands are different in precision, one single and the other double, the settings of the precision-selects I8-I7 will identify the single-precision operand so that it can automatically be reformatted to double precision before the selected operation is executed, and the result of the operation will be double precision.

## Independent Multiplier Operations

In this mode the multiplier operates on the RA and RB inputs which can be either single precision, double precision, or mixed. Operands may be normalized or wrapped numbers, as indicated by the settings for instruction inputs I1-I0. As shown in Table 25, the multiplier can be set to operate on the absolute value of either or both operands, and the result of any operation can be negated when it is output from the multiplier. Converting a single-precision denormal number to double precision does not normalize or wrap the denormal, so it is still an invalid input to the multiplier.

Table 23. Independent ALU Operations with One Operand

| ALU OPERATION ON A OPERAND | INSTRUCTION INPUTS I9-I0 | NOTES |
| :--- | :--- | :--- |

Table 24. Independent ALU Operations with Two Operands

| ALU OPERATIONS AND OPERANDS | INSTRUCTION INPUTS I9-I0 | NOTES |
| :---: | :---: | :--- |
| Add A + B | 0x x000 0x00 | x = Don't Care <br> I8 selects precision of A operand: 0 = A (SP), 1 = A (DP) <br> I7 selects precision of B operand: 0 = B (SP), 1 = B (DP) <br> I2 selects either Y or its absolute value: 0 = Y, 1 = \|Y\| |
| Add \|A\| + B | 0x x001 0x00 |  |
| Add A + \|B\| | 0x x000 1x00 |  |
| Add \|A\| + \|B\| | 0x x001 1x00 |  |
| Subtract A - B | 0x x000 0x01 |  |
| Subtract \|A\| - B | 0x x001 0x01 |  |
| Subtract A - \|B\| | 0x x000 1x01 |  |
| Subtract \|A\| - \|B\| | 0x x001 1x01 |  |
| Compare A, B | 0x x000 0x10 |  |
| Compare \|A\|, B | 0x x001 0x10 |  |
| Compare A, \|B\| | 0x x000 1x10 |  |
| Compare \|A\|, \|B\| | 0x x001 1x10 |  |
| Subtract B - A | 0x x000 0x11 |  |
| Subtract B - \|A\| | 0x x001 0x11 |  |
| Subtract \|B\| - A | 0x x000 1x11 |  |
| Subtract \|B\| - \|A\| | 0x x001 1x11 |  |
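The bit assignments in Table 24 can be collected into a small encoder. The sketch below is a hypothetical helper, not part of any TI tool. It sets don't-care bits to 0 and reads the roles of I4 and I3 (absolute value of A and of B) off the table rows.

```python
ALU_OPS = {"add": 0b00, "sub_a_b": 0b01, "compare": 0b10, "sub_b_a": 0b11}

def alu_two_operand(op, abs_a=False, abs_b=False,
                    a_dp=False, b_dp=False, abs_y=False):
    """Build I9-I0 for an independent two-operand ALU instruction (Table 24)."""
    word = ALU_OPS[op]     # I1-I0: operation
    word |= abs_y << 2     # I2: output Y (0) or |Y| (1)
    word |= abs_b << 3     # I3: use |B|
    word |= abs_a << 4     # I4: use |A|
    word |= b_dp << 7      # I7: B operand precision (1 = DP)
    word |= a_dp << 8      # I8: A operand precision (1 = DP)
    return word            # I9, I6, and I5 remain 0

# Compare |A|, |B| with single-precision operands
print(format(alu_two_operand("compare", abs_a=True, abs_b=True), "010b"))  # 0000011010
```

The printed value matches the I9-I0 field of the single-precision "Compare |A|, |B|" sample microinstruction given later in this section.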

Table 25. Independent Multiplier Operations

| MULTIPLIER OPERATION AND OPERANDS | INSTRUCTION INPUTS I9-I0 | NOTES |
| :---: | :---: | :--- |
| Multiply A * B | 0x x100 00xx | x = Don't Care <br> I8 selects A operand precision (0 = SP, 1 = DP) <br> I7 selects B operand precision (0 = SP, 1 = DP) <br> I1 selects A operand format (0 = Normal, 1 = Wrapped) <br> I0 selects B operand format (0 = Normal, 1 = Wrapped) |
| Multiply -(A * B) | 0x x100 01xx |  |
| Multiply A * \|B\| | 0x x100 10xx |  |
| Multiply -(A * \|B\|) | 0x x100 11xx |  |
| Multiply \|A\| * B | 0x x101 00xx |  |
| Multiply -(\|A\| * B) | 0x x101 01xx |  |
| Multiply \|A\| * \|B\| | 0x x101 10xx |  |
| Multiply -(\|A\| * \|B\|) | 0x x101 11xx |  |
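A comparable encoder can be sketched for the independent multiplier instructions of Table 25. The function name and keyword arguments are hypothetical, and the roles of I4, I3, and I2 (absolute value of A, absolute value of B, negated result) are read off the table rows.

```python
def multiplier_instruction(abs_a=False, abs_b=False, negate=False,
                           a_dp=False, b_dp=False,
                           a_wrapped=False, b_wrapped=False):
    """Build I9-I0 for an independent multiplier instruction (Table 25)."""
    word = 1 << 6            # I6 = 1, with I9 = 0 and I5 = 0
    word |= b_wrapped << 0   # I0: B operand format (1 = wrapped)
    word |= a_wrapped << 1   # I1: A operand format (1 = wrapped)
    word |= negate << 2      # I2: negate the product
    word |= abs_b << 3       # I3: use |B|
    word |= abs_a << 4       # I4: use |A|
    word |= b_dp << 7        # I7: B operand precision (1 = DP)
    word |= a_dp << 8        # I8: A operand precision (1 = DP)
    return word

# Double-precision -(A * B)
print(format(multiplier_instruction(negate=True, a_dp=True, b_dp=True), "010b"))  # 0111000100
```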

## Chained Multiplier/ALU Operations

In chained mode, the 'ACT8837 performs simultaneous operations in the multiplier and the ALU. Operations include addition, subtraction, and multiplication, except multiplication of wrapped operands. Several optional operations also increase the flexibility of the device.

The B operand to the ALU can be set to zero so that the ALU passes the A operand unaltered. The B operand to the multiplier can be forced to the value 1 so that the A operand to the multiplier is passed unaltered (see Table 26).

Table 26. Chained Multiplier/ALU Operations

| CHAINED MULTIPLIER OPERATION | CHAINED ALU OPERATION | OUTPUT SOURCE | INSTRUCTION INPUTS I9-I0 | NOTES |
| :---: | :---: | :---: | :---: | :--- |
| A * B | A + B | ALU | 1x x000 xx00 | x = Don't Care <br> I8 selects precision of RA inputs: 0 = RA (SP), 1 = RA (DP) <br> I7 selects precision of RB inputs: 0 = RB (SP), 1 = RB (DP) <br> I3 negates ALU result: 0 = Normal, 1 = Negated <br> I2 negates multiplier result: 0 = Normal, 1 = Negated |
| A * B | A + B | Multiplier | 1x x100 xx00 |  |
| A * B | A - B | ALU | 1x x000 xx01 |  |
| A * B | A - B | Multiplier | 1x x100 xx01 |  |
| A * B | 2 - A | ALU | 1x x000 xx10 |  |
| A * B | 2 - A | Multiplier | 1x x100 xx10 |  |
| A * B | B - A | ALU | 1x x000 xx11 |  |
| A * B | B - A | Multiplier | 1x x100 xx11 |  |
| A * B | A + 0 | ALU | 1x x010 xx00 |  |
| A * B | A + 0 | Multiplier | 1x x110 xx00 |  |
| A * B | 0 - A | ALU | 1x x010 xx11 |  |
| A * B | 0 - A | Multiplier | 1x x110 xx11 |  |
| A * 1 | A + B | ALU | 1x x001 xx00 |  |
| A * 1 | A + B | Multiplier | 1x x101 xx00 |  |
| A * 1 | A - B | ALU | 1x x001 xx01 |  |
| A * 1 | A - B | Multiplier | 1x x101 xx01 |  |
| A * 1 | 2 - A | ALU | 1x x001 xx10 |  |
| A * 1 | 2 - A | Multiplier | 1x x101 xx10 |  |
| A * 1 | B - A | ALU | 1x x001 xx11 |  |
| A * 1 | B - A | Multiplier | 1x x101 xx11 |  |
| A * 1 | A + 0 | ALU | 1x x011 xx00 |  |
| A * 1 | A + 0 | Multiplier | 1x x111 xx00 |  |
| A * 1 | 0 - A | ALU | 1x x011 xx11 |  |
| A * 1 | 0 - A | Multiplier | 1x x111 xx11 |  |
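The chained-mode encodings of Table 26 follow a regular pattern, which the sketch below makes explicit. The helper is hypothetical. The roles assigned to I6 (output source), I5 (ALU B operand forced to zero), and I4 (multiplier B operand forced to 1) are inferred from the table rows rather than stated in the text, and don't-care bits are set to 0 unless requested.

```python
CHAINED_ALU_OPS = {"A+B": 0b00, "A-B": 0b01, "2-A": 0b10, "B-A": 0b11}

def chained_instruction(alu_op, output="ALU", alu_b_zero=False, mult_b_one=False,
                        negate_alu=False, negate_mult=False,
                        ra_dp=False, rb_dp=False):
    """Build I9-I0 for a chained multiplier/ALU instruction (Table 26)."""
    if output not in ("ALU", "multiplier"):
        raise ValueError("output must be 'ALU' or 'multiplier'")
    if alu_b_zero and alu_op not in ("A+B", "B-A"):
        raise ValueError("Table 26 lists only A + 0 and 0 - A with a zero B operand")
    word = 1 << 9                            # I9 = 1 in every chained encoding
    word |= ra_dp << 8                       # I8: RA precision (1 = DP)
    word |= rb_dp << 7                       # I7: RB precision (1 = DP)
    word |= (output == "multiplier") << 6    # I6: output source
    word |= alu_b_zero << 5                  # I5: ALU B operand forced to zero
    word |= mult_b_one << 4                  # I4: multiplier B operand forced to 1
    word |= negate_alu << 3                  # I3: negate ALU result
    word |= negate_mult << 2                 # I2: negate multiplier result
    word |= CHAINED_ALU_OPS[alu_op]          # I1-I0: ALU operation
    return word

# A * 1 in the multiplier, 0 - A in the ALU, multiplier selected for output
code = chained_instruction("B-A", output="multiplier", alu_b_zero=True, mult_b_one=True)
print(format(code, "010b"))  # 1001110011
```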

## MICROPROGRAMMING THE 'ACT8837

Because the 'ACT8837 is microprogrammable, it can be configured to operate on either single- or double-precision data operands, and the operations of the registers, ALU, and multiplier can be programmed to support a variety of applications. The following examples present not only control settings but the timings of the specific operations required to execute the sample instructions.

Timing of the sample operations varies with the precision of the data operands and the settings of CLKMODE and PIPES. Microinstructions and timing waveforms are given for all combinations of data precision, clock mode, and register settings. Following the presentation of ALU and multiplier operations is a brief sum-of-products operation using instructions for chained operating mode.

## Single-Precision Operations

Two single-precision operands can be loaded on the 32-bit input buses without use of the temporary register, so CLKMODE has no effect on single-precision operation. Both the ALU and the multiplier execute all single-precision instructions in one clock cycle, assuming that the device is not operating in flowthrough mode (all registers disabled). Settings of the register controls PIPES2-PIPES0 determine minimum cycle time and the rate of data throughput, as evident from the examples below.

## Single-Precision ALU Operations

Precision of each data operand is indicated by the setting of instruction input I8 for single-operand ALU instructions, or the settings of I8-I7 for two-operand instructions. When the ALU receives mixed-precision operands (one operand in single precision and the other in double precision), the single-precision data input is converted to double and the operation is executed in double precision.

If both operands are single precision, a single-precision result is output by the ALU. Operations on mixed-precision data inputs produce double-precision results.

It is unnecessary to use the 'convert float-to-float' instruction to convert the single-precision operand prior to performing the desired operation on the mixed-precision operands. Setting I8 and I7 properly achieves the same effect without wasting an instruction cycle.

## Single-Precision Multiplier Operations

Operand precision is selected by I8 and I7, as for ALU operations. The multiplier can multiply the A and B operands, either operand with the absolute value of the other, or the absolute values of both operands. The result can also be negated when it is output. If both operands are single precision, a single-precision result is output. Operations on mixed-precision data inputs produce double-precision results.

## Sample Single-Precision Microinstructions

The following four single-precision microinstruction coding examples show the four register settings, ranging from flowthrough to fully pipelined. Timing diagrams accompany the sample microinstructions.

In the first example PIPES2-PIPES0 are all set high so the internal registers are all disabled. This microinstruction sets up a wrapped result from the multiplier to be unwrapped by the ALU as an exact denormalized number. In flowthrough mode the 'unwrap exact' operation is performed without a clock as soon as the instruction is input. Single-precision timing in flowthrough mode is shown in Figure 2.

CLKMODE $=0 \quad$ PIPES $=111$ Operation: Unwrap A operand exact


0000101100001111 xxxx 11 xx 00001101000x111111


Figure 2. Single-Precision Operation, All Registers Disabled (PIPES = 111, CLKMODE = 0)

The second example shows a microinstruction causing the ALU to compare absolute values of A and B. Only the input registers are enabled (PIPES2-PIPES0 = 110) so the result is output in one clock cycle.

CLKMODE $=0 \quad$ PIPES $=110 \quad$ Operation: Compare $|\mathrm{A}|,|\mathrm{B}|$


```
0000011010 0 01 110 xxxx 1111 00 0 1 1 0 1 0 0 0 x 11 1 1 11
```



Figure 3. Single-Precision Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 0)

Input and output registers are enabled in the third example, which shows the subtraction B - A. Two clock cycles are required to load the operands, execute the subtraction, and output the result (see Figure 4).

CLKMODE $=0 \quad$ PIPES $=010 \quad$ Operation: Subtract $B-A$

0000000011001010 xxxx 111100000001000x111111


Figure 4. Single-Precision Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 0)

The fourth example shows a multiplication A * B with all registers enabled. Three clock cycles are required to generate and output the product. Once the internal registers are all loaded with data or results, a result is available from the output register on every rising edge of the clock. The floating point unit produces its highest throughput when operated fully pipelined with single-precision operands.

CLKMODE $=0 \quad$ PIPES $=000 \quad$ Operation: Multiply $\mathrm{A} * \mathrm{~B}$


00010000000010001111 xxxx 00010111000x111111


Figure 5. Single-Precision Operation, All Registers Enabled (PIPES = 000, CLKMODE = 0)
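The relationship between the PIPES setting and the number of clock cycles in the four single-precision examples can be summarized in a short sketch. The mapping of individual PIPES bits to register levels is partly an inference: the text states that PIPES2 low enables the output (P and S) registers, while the roles of PIPES0 (input registers) and PIPES1 (pipeline registers) are read off the four example settings. The function names are hypothetical.

```python
def enabled_registers(pipes):
    """Decode the active-low PIPES2-PIPES0 value into enabled register levels."""
    return {
        "input": (pipes & 0b001) == 0,     # PIPES0 low: RA and RB registers
        "pipeline": (pipes & 0b010) == 0,  # PIPES1 low: internal pipeline registers
        "output": (pipes & 0b100) == 0,    # PIPES2 low: P and S registers
    }

def single_precision_clocks(pipes):
    """Clock cycles from operand input to result output in the four examples."""
    return sum(enabled_registers(pipes).values())

for setting in (0b111, 0b110, 0b010, 0b000):
    print(format(setting, "03b"), single_precision_clocks(setting))
```

Each enabled register level adds one clock cycle, giving zero cycles in flowthrough mode (PIPES = 111) and three cycles when fully pipelined (PIPES = 000).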

## Double-Precision Operations

Double-precision operations may be executed separately in the ALU or the multiplier, or simultaneously in both. Rates of execution and data throughput are affected by the settings of the register controls (PIPES2-PIPES0) and the clock mode (CLKMODE).

The temporary register can be loaded on either the rising edge (CLKMODE $=\mathrm{L}$ ) or the falling edge of the clock (CLKMODE $=\mathrm{H}$ ). Double-precision operands are always loaded by using the 64-bit temporary register to store half of the operands prior to inputting the other half of the operands on the DA and DB buses.

Input configuration is selected by CONFIG1-CONFIG0, allowing several options for the sequence in which data operands are set up in the temporary register and the RA and RB registers. Operands are then sent to either the ALU or multiplier, or both, depending on the settings for SELOP7-SELOP0.

The ALU executes all double-precision operations in a single clock cycle. The multiplier requires two clock cycles to execute a double-precision operation. When the device operates in chained mode (simultaneous ALU and multiplier operations), the chained double-precision operation is executed in two clock cycles. The settings of PIPES2-PIPES0 determine whether the result is output without a clock (flowthrough) or after up to five clocks for a double-precision multiplication (all registers enabled and CLKMODE = L).

## Double-Precision ALU Operations

Eight examples are provided to illustrate microinstructions and timing for double-precision ALU operations. The settings of CLKMODE and PIPES2-PIPES0 determine how the temporary register loads and which registers are enabled. Four examples are provided in each clock mode.

## Double-Precision ALU Operations with CLKMODE = 0

The first example shows that, even in flowthrough mode, a clock signal is needed to load the temporary register with half the data operands (see Figure 6). The selected operation is executed without a clock after the remaining half of the data operands are input on the RA and RB buses:

CLKMODE = 0   PIPES = 111   Operation: Add A + |B|




Figure 6. Double-Precision ALU Operation, All Registers Disabled (PIPES = 111, CLKMODE = 0)

In the second example the input register is enabled (PIPES2-PIPES0 = 110). Operands A and B for the instruction, |B| - |A|, are loaded using CONFIG = 00 so that B is loaded first into the temporary register with MSH through the DA port and LSH through the DB port. On the second clock rising edge, the A operand is loaded in the same order directly to RA register while B is loaded from the temporary register to the RB register (see Figure 7).

CLKMODE = 0   PIPES = 110   Operation: |B| - |A|



Figure 7. Double-Precision ALU Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 0)

Both the input and output registers are enabled (PIPES2-PIPES0 = 010) in the third example. The instruction sets up the ALU to wrap a denormalized number on the DA input bus. The wrapped output can be fed back from the S register to the multiplier input multiplexer by a later microinstruction. Timing for this operation is shown in Figure 8.

CLKMODE $=0 \quad$ PIPES $=010$ Operation: Wrap Denormal Input



Figure 8. Double-Precision ALU Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 0)

In the fourth example with CLKMODE = L, all three levels of internal registers are enabled. The instruction converts a double-precision integer operand to a double-precision floating-point operand. Figure 9 shows the timing for this operating mode.

CLKMODE = 0   PIPES = 000   Operation: Convert Integer to Floating Point

0110100010011000 xxxx 1100000110x000x111111

Figure 9. Double-Precision ALU Operation, All Registers Enabled (PIPES = 000, CLKMODE = 0)

## Double-Precision ALU Operations with CLKMODE = 1

The next four examples are similar to the first four except that CLKMODE $=\mathrm{H}$ so that the temporary register loads on the falling edge of the clock. When the ALU is operating independently, setting CLKMODE high enables loading of both double-precision operands on successive falling and rising clock edges.

In this clock mode a double-precision ALU operation requires one clock cycle to load data inputs and execute, and both halves of the 64-bit result must be read out on the 32-bit Y bus within one clock cycle. The settings of PIPES2-PIPES0 determine the number of clock cycles which elapse between data input and result output.

In the first example all registers are disabled (PIPES2-PIPES0 = 111), and the addition is performed in flowthrough mode. As shown in Figure 10, a falling clock edge is needed to load half of the operands into the temporary register prior to loading the RA and RB registers on the next rising clock.

CLKMODE = 1   PIPES = 111   Operation: Add A + |B|




Figure 10. Double-Precision ALU Operation, All Registers Disabled (PIPES $=111$, CLKMODE $=1$ )

The second example executes subtraction of absolute values for both operands. Only the RA and RB registers are enabled (PIPES2-PIPES0 = 110). Timing is shown in Figure 11.

CLKMODE $=1 \quad$ PIPES $=110 \quad$ Operation: Subtract $|B|-|A|$


0110011011111110 xxxx 1111000110x000x xx 1111


Figure 11. Double-Precision ALU Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 1)

The third example shows a single denormalized operand being wrapped so that it can be input to the multiplier. Both input and output registers are enabled (PIPES2-PIPES0 = 010). Timing is shown in Figure 12.

CLKMODE $=1 \quad$ PIPES $=010$ Operation: Wrap Denormal Input



Figure 12. Double-Precision ALU Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 1)

The fourth example shows a conversion from integer to floating point format. All three levels of data registers are enabled (PIPES2-PIPES0 = 000) so that the FPU is fully pipelined in this mode (see Figure 13).

CLKMODE $=1 \quad$ PIPES $=000 \quad$ Operation: Convert Integer to Floating Point

```
01 10100010 1 11 000 xxxx 1100 00 0 1 1 x x 0 0 0 x xx 1 1 11
```


Figure 13. Double-Precision ALU Operation, All Registers Enabled (PIPES = 000, CLKMODE = 1)

## Double-Precision Multiplier Operations

Independent multiplier operations may also be performed in either clock mode and with various registers enabled. As before, examples for the two clock modes are treated separately. A double-precision multiply operation requires two clock cycles to execute (except in flowthrough mode) and from one to three other clock cycles to load the temporary register and to output the results, depending on the setting of PIPES2-PIPES0.

Even in flowthrough mode (PIPES2-PIPES0 = 111) two clock edges are required, the first to load half of the operands in the temporary register and the second to load the intermediate product in the multiplier pipeline register. Depending on the setting of CLKMODE, loading the temporary register may be done on either a rising or a falling edge.

## Double-Precision Multiplication with CLKMODE = 0

In this first example, the A operand is multiplied by the absolute value of the B operand. Timing for the operation is shown in Figure 14:

CLKMODE $=0 \quad$ PIPES $=111 \quad$ Operation: Multiply A * $|\mathrm{B}|$



Figure 14. Double-Precision Multiplier Operation, All Registers Disabled (PIPES = 111, CLKMODE = 0)

The second example assumes that the RA and RB input registers are enabled. With CLKMODE $=0$ one clock cycle is required to input both the double-precision operands. The multiplier is set up to calculate the negative product of $|A|$ and $B$ operands:

CLKMODE $=0 \quad$ PIPES $=110 \quad$ Operation: Multiply $-(|A| * B)$



Figure 15. Double-Precision Multiplier Operation, Input Registers Enabled (PIPES = 110, CLKMODE = 0)

Enabling both input and output registers in the third example adds an additional delay of one clock cycle, as can be seen from Figure 16. The sample instruction sets up calculation of the product of $|A|$ and $|B|$ :

CLKMODE $=0 \quad$ PIPES $=010 \quad$ Operation: Multiply $|A| *|B|$


Figure 16. Double-Precision Multiplier Operation, Input and Output Registers Enabled (PIPES = 010, CLKMODE = 0)

With all registers enabled, the fourth example shows a microinstruction to calculate the negated product of operands $A$ and $B$ :

CLKMODE = 0   PIPES = 000   Operation: Multiply -(A * B)



01110001000010001111 xxxx 000111 x x 0000x xx 1111

Figure 17. Double-Precision Multiplier Operation, All Registers Enabled (PIPES = 000, CLKMODE = 0)

## Double-Precision Multiplication with CLKMODE $=1$

Setting the CLKMODE control high causes the temporary register to load on the falling edge of the clock. This permits loading both double-precision operands within the same clock cycle. The time available to output the result is also affected by the settings of CLKMODE and PIPES2-PIPES0, as shown in the individual timing waveforms.

The first multiplication example with CLKMODE set high shows a multiplication in flowthrough mode (PIPES2-PIPES0 = 111). Figure 18 shows the timing for this operating mode:

CLKMODE $=1 \quad$ PIPES $=111 \quad$ Operation: Multiply $A *|B|$


01110010001111111111 xxxx 00
$0 \times x \times x 000 x \times x$
1111


Figure 18. Double-Precision Multiplier Operation, All Registers Disabled (PIPES $=111$, CLKMODE $=1$ )

In the second example, the input registers are enabled and the instruction is otherwise similar to the corresponding example for CLKMODE $=0$. Timing is shown in Figure 19.

CLKMODE $=1 \quad$ PIPES $=110 \quad$ Operation: Multiply $-(|A| * B)$


01110101001111101111 xxxx $000011 \times x 000 \times x \times 1111$


With both input and output registers pipelined, the third example calculates the product of $|A|$ and $|B|$. Enabling the output register introduces a one-cycle delay in outputting the result (see Figure 20):

CLKMODE $=1 \quad$ PIPES $=010 \quad$ Operation: Multiply $|A| *|B|$


Figure 20. Double-Precision Multiplier Operation, Input and Output Registers Enabled (PIPES $=010$, CLKMODE $=1$)

The fourth example shows the instruction and timing (Figure 21) to generate the negated product of the $A$ and $B$ operands. This operating mode with CLKMODE set high and all registers enabled permits use of the shortest clock period and produces the most data throughput, assuming that this is the primary operating mode in which the device is to function.

Additional considerations affecting timing and throughput are discussed in the section on mixed operations and operands.

CLKMODE $=1 \quad$ PIPES $=000 \quad$ Operation: Multiply $-(A * B)$

01110001001110001111 xxxx 00 $011 \times 000 \mathrm{x}$ ..... 1111




Figure 21. Double-Precision Multiplier Operation, All Registers Enabled (PIPES $=000$, CLKMODE $=1$)

## Chained Multiplier/ALU Operations

Simultaneous multiplier and ALU functions can be selected in chained mode to support calculation of sums of products or products of sums. Operations selectable in chained mode (see Table 25) overlap partially with those selectable in independent multiplier or ALU operating mode. Format conversions, absolute values, and wrapping or unwrapping of denormal numbers are not available in chained mode.

To calculate sums of products, the FPU can operate on external data inputs in the multiplier while the ALU operates on feedback from the previous calculation. The operand selects SELOPS7-SELOPS0 can be set to select multiplier inputs from the RA and RB registers and ALU inputs from the P and S registers.

This mode of chained multiplier and ALU operation is used repeatedly in the division and square root calculations presented later. The sample microinstruction sequence shown in Tables 27 and 28 performs the operations for multiplying sets of data operands and accumulating the results, the basic operations involved in computing a sum of products.

Table 27 represents the operations, clock cycles, and register contents for a single-precision sum of four products. Registers used include the RA and RB input registers and the product (P) and sum (S) registers.

Table 27. Single-Precision Sum of Products (PIPES2-PIPES0 $=010$)

| CLOCK <br> CYCLE | MULTIPLIER/ALU OPERATIONS | PSEUDOCODE |
| :---: | :---: | :---: |
| 1 | $\begin{aligned} & \text { Load A, B } \\ & A * B \end{aligned}$ | $A \rightarrow R A, B \rightarrow R B$ |
| 2 | Pass $P(A B)$ to $S$ | $C \rightarrow R A, D \rightarrow R B$ |
|  | Load C, D $C * D$ | $A * B \rightarrow P(A B)$ |
| 3 | $S(A B)+P(C D)$ | $\mathrm{P}(\mathrm{AB})+0 \rightarrow \mathrm{~S}(\mathrm{AB})$ |
|  | Load E, F | $E \rightarrow R A, F \rightarrow R B$ |
|  | $\mathrm{E} * \mathrm{~F}$ | $C * D \rightarrow P(C D)$ |
| 4 | $S(A B+C D)+P(E F)$ | $S(A B)+P(C D) \rightarrow S(A B+C D)$ |
|  | Load G, H | $\mathrm{G} \rightarrow \mathrm{RA}, \mathrm{H} \rightarrow \mathrm{RB}$ |
|  | $G * H$ | $E * F \rightarrow P(E F)$ |
| 5 | $S(A B+C D+E F)+P(G H)$ | $S(A B+C D)+P(E F) \rightarrow S(A B+C D+E F)$ |
|  |  | $G * H \rightarrow P(G H)$ |
| 6 | New Instruction | $S(A B+C D+E F)+P(G H) \rightarrow S(A B+C D+E F+G H)$ |

A microcode sequence to generate this sum of products is shown in Table 28. Only three instructions in chained mode are required, since the multiplier begins the calculation independently and the ALU completes it independently.
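The register transfers listed in Table 27 can be traced with a small software model. The Python sketch below is illustrative only: the function name `sum_of_products` and the use of `None` for an empty register are assumptions made here, and the model ignores rounding, status outputs, and instruction encoding. It updates RA/RB, P, and S together on each simulated rising clock edge, using the register contents that were present before the edge.

```python
def sum_of_products(pairs):
    """Trace RA, RB, P and S for a chained sum of products (PIPES = 010 model).

    pairs is a sequence of (a, b) operand tuples. Returns (final_sum, trace),
    where trace holds one (cycle, ra, rb, p, s) tuple per clock cycle.
    """
    queue = list(pairs)
    ra = rb = p = s = None          # None marks a register that holds no operand yet
    trace = []
    cycle = 0
    while queue or ra is not None or p is not None:
        cycle += 1
        # Values presented to the registers, computed from pre-edge contents.
        next_s = s if p is None else p + (0.0 if s is None else s)
        next_p = None if ra is None else ra * rb
        next_ra, next_rb = queue.pop(0) if queue else (None, None)
        # All registers load on the same rising edge.
        ra, rb, p, s = next_ra, next_rb, next_p, next_s
        trace.append((cycle, ra, rb, p, s))
    return s, trace


total, trace = sum_of_products([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0)])
for row in trace:
    print(row)
```

With four operand pairs the model places the complete sum in S on the sixth cycle, which agrees with the last row of Table 27.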

Table 28. Sample Microinstructions for Single-Precision Sum of Products


## Fully Pipelined Double-Precision Operations

Performing fully pipelined double-precision operations requires a detailed understanding of timing constraints imposed by the multiplier. In particular, sum of products and product of sums operations can be executed very quickly, mostly in chained mode, assuming that timing relationships between the ALU and the multiplier are coded properly.

Pseudocode tables for these sequences are provided (Table 29 and Table 30), showing how data and instructions are input in relation to the system clock. The overall patterns of calculations for an extended sum of products and an extended product of sums are presented. These examples assume FPU operation in CLKMODE 0, with the CONFIG setting HL to load operands by MSH and LSH, all registers enabled (PIPES2-PIPES0 $=$ LLL), and the C register clock tied to the system clock.

In the sum of products timing table, the two initial products are generated in independent multiplier mode. Several timing relationships should be noted in the table. The first chained instruction loads and begins to execute following the sixth rising edge of the clock, after the first product P1 has already been held in the P register for one clock. For this reason, P1 is loaded into the C register so that P1 will be stable for two clocks.

On the seventh clock, the ALU pipeline register loads with an unwanted sum, P1 + P1. However, because the ALU timing is constrained by the multiplier, the S register will not load until the rising edge of CLK9, when the ALU pipe contains the desired sum, P1 + P2. The remaining sequence of chained operations then executes in the desired manner.

Table 29. Pseudocode for Fully Pipelined Double-Precision Sum of Products (CLKM $=0$, CONFIG $=10$, PIPES $=000$, CLKC $\leftrightarrow$ SYSCLK)

| CLK | DA <br> BUS | DB <br> BUS | TEMP <br> REG | INS <br> BUS | INS <br> REG | RA <br> REG | RB <br> REG | MUL <br> PIPE | P <br> REG | C <br> REG | ALU <br> PIPE | S <br> REG | Y |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | A1 MSH | B1 MSH | A1, B1 MSH | A1 * B1 |  |  |  |  |  |  |  |  |  |
| 2 | A1 LSH | B1 LSH | A1, B1 MSH | A1 * B1 | A1 * B1 | A1 | B1 |  |  |  |  |  |  |
| 3 | A2 MSH | B2 MSH | A2, B2 MSH | A2 * B2 | A1 * B1 | A1 | B1 | A1 * B1 |  |  |  |  |  |
| 4 | A2 LSH | B2 LSH | A2, B2 MSH | A2 * B2 | A2 * B2 | A2 | B2 | A1 * B1 |  |  |  |  |  |
| 5 | A3 MSH | B3 MSH | A3, B3 MSH | PR + CR <br> A3 * B3 | A2 * B2 | A2 | B2 | A2 * B2 | P1 |  |  |  |  |
| 6 | A3 LSH | B3 LSH | A3, B3 MSH | PR + CR <br> A3 * B3 | PR + CR, <br> A3 * B3 | A3 | B3 | A2 * B2 | P1 | P1 |  |  |  |
| 7 | A4 MSH | B4 MSH | A4, B4 MSH | PR + SR <br> A4 * B4 | PR + SR, <br> A3 * B3 | A3 | B3 | A3 * B3 | P2 | P1 | P1 + P1 |  |  |
| 8 | A4 LSH | B4 LSH | A4, B4 MSH | PR + SR <br> A4 * B4 | PR + SR, <br> A4 * B4 | A4 | B4 | A3 * B3 | P2 | P1 | P1 + P1 |  |  |
| 9 | A5 MSH | B5 MSH | A5, B5 MSH | PR + SR <br> A5 * B5 | PR + SR, <br> A4 * B4 | A4 | B4 | A4 * B4 | P3 | P2 | S1 + P2 | S1 |  |
| 10 | A5 LSH | B5 LSH | A5, B5 MSH | PR + SR <br> A5 * B5 | PR + SR, <br> A5 * B5 | A5 | B5 | A4 * B4 | P3 | P3 | S1 + P3 | S1 |  |
| 11 | A6 MSH | B6 MSH | A6, B6 MSH | PR + SR <br> A6 * B6 | PR + SR, <br> A5 * B5 | A5 | B5 | A5 * B5 | P4 | P3 | XXXXX | S2 |  |
| 12 |  |  |  |  |  |  |  |  |  |  |  |  |  |

Table 30. Pseudocode for Fully Pipelined Double-Precision Product of Sums (CLKM $=0$, CONFIG $=10$, PIPES $=000$, CLKC $\leftrightarrow$ SYSCLK)

| CLK | DA <br> BUS | DB <br> BUS | TEMP <br> REG | INS <br> BUS | INS <br> REG | RA <br> REG | RB <br> REG | MUL <br> PIPE | P <br> REG | C <br> REG | ALU <br> PIPE | S <br> REG | Y <br> BUS |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 1 | A1 (M) | B1 (M) | A1, B1 (M) | A1 + B1 |  |  |  |  |  |  |  |  |  |
| 2 | A1 (L) | B1 (L) | A1, B1 (M) | A1 + B1 | A1 + B1 | A1 | B1 |  |  |  |  |  |  |
| 3 | A2 (M) | B2 (M) | A2, B2 (M) | A2 + B2 | A1 + B1 | A1 | B1 |  |  |  | A1 + B1 |  |  |
| 4 | A2 (L) | B2 (L) | A2, B2 (M) | A2 + B2 | A2 + B2 | A2 | B2 |  |  |  | A1 + B1 | S1 |  |
| 5 | A3 (M) | B3 (M) | A3, B3 (M) | CR * SR <br> A3 + B3 | A2 + B2 | A2 | B2 |  |  | S1 | A2 + B2 | S1 |  |
| 6 | A3 (L) | B3 (L) | A3, B3 (M) | CR * SR <br> A3 + B3 | CR * SR <br> A3 + B3 | A3 | B3 |  |  | S1 | A2 + B2 | S2 |  |
| 7 | XXX | XXX | XXX | SP Add | CR * SR <br> A3 + B3 | A3 | B3 | S1 * S2 |  | S1 | A3 + B3 | S2 |  |
| 8 | A4 (M) | B4 (M) | A4, B4 (M) | PR * SR <br> A4 + B4 | CR * SR <br> A3 + B3 | ENRA = L <br> A3 | ENRB = L <br> B3 | S1 * S2 |  | S1 | A3 + B3 | XXX |  |
| 9 | A4 (L) | B4 (L) | A4, B4 (M) | PR * SR <br> A4 + B4 | PR * SR <br> A4 + B4 | A4 | B4 | XXX | P1 | S1 | XXX | S3 |  |
| 10 | XXX | XXX | XXX | SP Add | PR * SR <br> A4 + B4 | A4 | B4 | P1 * S3 | P1 | S1 | A4 + B4 | S3 |  |
| 11 | A5 (M) | B5 (M) | A5, B5 (M) | PR * SR <br> A5 + B5 | PR * SR <br> A4 + B4 | ENRA = L <br> A4 | ENRB = L <br> B4 | P1 * S3 | XXX | S1 | A4 + B4 | XXX |  |
| 12 | A5 (L) | B5 (L) | A5, B5 (M) | PR * SR <br> A5 + B5 | PR * SR <br> A5 + B5 | A5 | B5 | XXX | P2 | S1 | XXX | S4 |  |

NOTE: On CLK 7 and CLK10, put 000000000 (Single-Precision Add) on the instruction bus.

In the product of sums timing table, the two initial sums are generated in independent ALU mode. The remaining operations are shown as alternating chained operations followed by single-precision adds. The SP adds are necessary to provide an extra cycle during which the multiplier outputs the current intermediate product. The current sum and the latest intermediate product are then fed back to the multiplier inputs for the next chained operations. In this manner, a double-precision product of sums is generated in three system clocks, as opposed to two clocks for a double-precision sum of products.

## Mixed Operations and Operands

Using mixed-precision data operands or performing sequences of mixed operations may require adjustments in timing, operand precision, and control settings. To simplify microcoding sequences involving mixed operations, mixed-precision operands, or both, it is useful to understand several specific requirements for mixed-mode or mixed-precision processing.

Calculations involving mixed-precision operands must be performed as double-precision operations (see Table 12). The instruction settings (I8-I7) should be set to indicate the precision of each operand from the RA and RB input registers. (Feedback operands from internal registers are also double-precision.) Mixed-precision operations should not be performed in chained mode.

Timing for operations with mixed-precision operands is the same as for a corresponding double-precision operation. In a mixed-precision operation, the single-precision operand must be loaded into the upper half of its input register.

Most format conversions also involve double-precision timing. Conversions between single- and double-precision floating point format are treated as mixed-precision operations. During integer to floating point conversions, the integer input should be loaded into the upper half of the RA register.
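As an illustration of the operand placement described above, the following Python sketch builds a 64-bit register image with a single-precision value or a 32-bit integer in its upper half. It assumes that "upper half" means bits 63-32 of the register and that the unused lower half is zero; both function names are hypothetical and are not taken from the device documentation.

```python
import struct


def single_to_upper_half(value: float) -> int:
    """Return a 64-bit word with the IEEE single-precision bits of value in bits 63-32."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32 << 32          # assumption: lower 32 bits are left as zero


def integer_to_upper_half(value: int) -> int:
    """Return a 64-bit word with a 32-bit two's-complement integer in bits 63-32."""
    return (value & 0xFFFFFFFF) << 32


print(f"{single_to_upper_half(1.0):016X}")
print(f"{integer_to_upper_half(-1):016X}")
```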

In applications where mixed-precision operations are not required, it is possible to tie the I8-I7 instruction inputs together so that both controls always select the same precision.

Sequences of mixed operations may require changes in multiple control settings to deal with changes in timing of input, execution, and output of results. Figure 22 shows a simplified timing waveform for a series of mixed operations:



A,B,C,D - double-precision multiply; E,F - single-precision operation; G,H,I,J - double-precision add; K,L - single-precision operation. A double-precision number is not required to be held on the outputs for two cycles unless it is followed by a like double-precision function. If a double-precision multiply is followed by a single-precision operation, there must be one open clock cycle.

Figure 22. Mixed Operations and Operands (PIPES2-PIPES0 $=110$, CLKMODE $=0$)

In this sequence, the fifth cycle is left open because a single-precision multiply follows a double-precision multiply. If the SP multiply were input during the period following the fourth rising clock edge, the result of the preceding operation would be overwritten, since an SP multiply executes in one clock cycle. To avoid such a condition, the FPU will not load during the required open cycle.

Because the sequence of mixed operations places constraints on output timing, only one cycle is available to output the double-precision (C * D) result. By contrast, the SP multiply ( $\mathrm{E} * \mathrm{~F}$ ) is available for two cycles because the operation which follows it does not output a result in the period following the seventh rising clock edge. In general, the precision and timing of each operation affects the timing of adjacent operations.
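The open-cycle rule described for Figure 22 can be sketched as a simple input scheduler. This Python example is a rough model under stated assumptions, not a description of the device's control logic: it assumes CLKMODE $=0$, that each double-precision operation occupies two input cycles (MSH, then LSH, as in Table 29), and that each single-precision operation occupies one. The function name `schedule_inputs` and the operation labels are hypothetical.

```python
def schedule_inputs(ops):
    """Assign input clock cycles to a list of operations (CLKMODE = 0 model).

    ops holds the strings "DP_MUL", "DP_ADD", or "SP". Returns a list of
    (first_cycle, label) tuples, with "OPEN" marking a required idle cycle.
    """
    schedule = []
    cycle = 1
    previous = None
    for op in ops:
        if op not in ("DP_MUL", "DP_ADD", "SP"):
            raise ValueError(f"unknown operation: {op}")
        # A single-precision operation directly after a DP multiply needs one open cycle.
        if previous == "DP_MUL" and op == "SP":
            schedule.append((cycle, "OPEN"))
            cycle += 1
        schedule.append((cycle, op))
        # Assumption: DP operands occupy two input cycles, SP operands one.
        cycle += 2 if op.startswith("DP") else 1
        previous = op
    return schedule


# The Figure 22 sequence: two DP multiplies, an SP operation, two DP adds, an SP operation.
print(schedule_inputs(["DP_MUL", "DP_MUL", "SP", "DP_ADD", "DP_ADD", "SP"]))
```

Under these assumptions the model leaves the fifth cycle open, as in the sequence described above, and inserts no open cycle when a single-precision operation follows a double-precision add.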

Control settings for CLKMODE and registers must also be considered in relation to precision and speed of execution. In Figure 23, a similar sequence of mixed operations is set up for execution in fully pipelined mode:



A,B,C,D - double-precision multiply; E,F - single-precision operation; G,H - double-precision add; I,J,K,L,M,N - single-precision operation; O,P,Q,R - double-precision multiply. In clock mode 1, a double-precision result is two cycles long only when a double-precision multiply is followed by a double-precision multiply.

Figure 23. Mixed Operations and Operands (PIPES2-PIPES0 $=000$, CLKMODE $=1$)

Although the data operands can be loaded in one clock cycle with CLKMODE set high, enabling two additional internal registers delays the ($A * B$) result one cycle beyond the previous example. Again, an open cycle is required after the ($C * D$) operation because the next operation is single precision. The result of the ($C * D$) multiply is available for one cycle instead of two, also because the following operation is single precision. With this setting of CLKMODE and PIPES2-PIPES0, a double-precision result is only available for two clock cycles when one DP multiply follows another DP multiply.

## Matrix Operations

The 'ACT8837 floating point unit can also be used to perform matrix manipulations involved in graphics processing or digital signal processing. The FPU multiplies and adds data elements, executing sequences of microprogrammed calculations to form new matrices.

## Representation of Variables

In state representations of control systems, an $n$-th order linear differential equation with constant coefficients can be represented as a sequence of $n$ first-order linear differential equations expressed in terms of state variables:

$$
\frac{d\,\mathrm{x1}}{dt}=\mathrm{x2}, \quad \ldots, \quad \frac{d\,\mathrm{x}(n-1)}{dt}=\mathrm{xn}
$$

For example, in vector-matrix form the equations of an nth-order system can be represented as follows:

$$
\begin{aligned}
& \frac{d}{dt}\begin{bmatrix}
\mathrm{x1} \\
\mathrm{x2} \\
\vdots \\
\mathrm{xn}
\end{bmatrix}=\begin{bmatrix}
\mathrm{a11} & \mathrm{a12} & \ldots & \mathrm{a1n} \\
\vdots & \vdots & & \vdots \\
\mathrm{an1} & \mathrm{an2} & \ldots & \mathrm{ann}
\end{bmatrix}\begin{bmatrix}
\mathrm{x1} \\
\mathrm{x2} \\
\vdots \\
\mathrm{xn}
\end{bmatrix}+\begin{bmatrix}
\mathrm{b11} & \mathrm{b12} & \ldots & \mathrm{b1n} \\
\vdots & \vdots & & \vdots \\
\mathrm{bn1} & \mathrm{bn2} & \ldots & \mathrm{bnn}
\end{bmatrix}\begin{bmatrix}
\mathrm{u1} \\
\mathrm{u2} \\
\vdots \\
\mathrm{un}
\end{bmatrix} \\
& \text{or, } \dot{\mathrm{x}}=\mathrm{ax}+\mathrm{bu}
\end{aligned}
$$

Expanding the matrix equation for one state variable, $d\mathrm{x1}/dt$, results in the following expression:

$$
\dot{\mathrm{x}}1=(\mathrm{a11} * \mathrm{x1}+\ldots+\mathrm{a1n} * \mathrm{xn})+(\mathrm{b11} * \mathrm{u1}+\ldots+\mathrm{b1n} * \mathrm{un})
$$

where $\dot{\mathrm{x}}1=d\mathrm{x1}/dt$.

Sequences of multiplications and additions are required when such state space transformations are performed, and the 'ACT8837 has been designed to support such sum-of-products operations. An $n \times n$ matrix A multiplied by an $n \times n$ matrix $X$ yields an $n \times n$ matrix $C$ whose elements cij are given by this equation:

$$
\mathrm{cij}=\sum_{k=1}^{n} \mathrm{aik} * \mathrm{xkj} \quad \text{for } i=1, \ldots, n, \quad j=1, \ldots, n \tag{1}
$$

For the cij elements to be calculated by the 'ACT8837, the corresponding elements aik and xkj must be stored outside the 'ACT8837 and fed to the 'ACT8837 in the proper order required to effect a matrix multiplication such as the state space system representation just discussed.
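Equation (1) maps directly onto nested loops in which each inner iteration corresponds to one multiply and one accumulate. The Python sketch below is only a host-side illustration of the operand ordering; the function name `matmul_sum_of_products` is an assumption made here, and the lists stand in for the external storage that feeds the 'ACT8837.

```python
def matmul_sum_of_products(a, x):
    """Compute C = A * X for square matrices using equation (1).

    Each c[i][j] is accumulated as a sum of n products a[i][k] * x[k][j].
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0                        # plays the role of the running sum
            for k in range(n):
                acc += a[i][k] * x[k][j]     # one multiply, one accumulate
            c[i][j] = acc
    return c


print(matmul_sum_of_products([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]))
# prints [[19.0, 22.0], [43.0, 50.0]]
```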

## Sample Matrix Transformation

The matrix manipulations commonly performed in graphics systems can be regarded as geometrical transformations of graphic objects. A matrix operation on another matrix representing a graphic object may result in scaling, rotating, transforming, distorting, or generating a perspective view of the image. By performing a matrix operation on the position vectors which define the vertices of an image surface, the shape and position of the surface can be manipulated.

The generalized $4 \times 4$ matrix for transforming a three-dimensional object with homogeneous coordinates is shown below:

$$
T=\left[\begin{array}{ccc|c}
a & b & c & d \\
e & f & g & h \\
i & j & k & l \\
\hline
m & n & o & p
\end{array}\right]
$$

The matrix T can be partitioned into four component matrices, each of which produces a specific effect on the resultant image:


The $3 \times 3$ matrix produces linear transformation in the form of scaling, shearing, and rotation. The $1 \times 3$ row matrix produces translation, while the $3 \times 1$ column matrix produces perspective transformation with multiple vanishing points. The final single element $1 \times 1$ produces overall scaling. Overall operation of the transformation matrix T on the position vectors of a graphic object produces a combination of shearing, rotation, reflection, translation, perspective, and overall scaling.
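The four component matrices can be picked out of T by row and column position. The following Python sketch assumes T is stored as a list of four rows in the layout shown for the generalized matrix; the function name `partition_transform` is hypothetical.

```python
def partition_transform(t):
    """Split a 4 x 4 homogeneous transformation matrix into its four blocks."""
    linear = [row[:3] for row in t[:3]]          # 3 x 3: scaling, shearing, rotation
    perspective = [[row[3]] for row in t[:3]]    # 3 x 1 column: perspective
    translation = [t[3][:3]]                     # 1 x 3 row: translation
    overall_scale = t[3][3]                      # 1 x 1: overall scaling
    return linear, translation, perspective, overall_scale


T = [
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h"],
    ["i", "j", "k", "l"],
    ["m", "n", "o", "p"],
]
print(partition_transform(T))
```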

The rotation of an object about an arbitrary axis in a three-dimensional space can be carried out by first translating the object such that the desired axis of rotation passes through the origin of the coordinate system, then rotating the object about the axis through the origin, and finally translating the rotated object such that the axis of rotation resumes its initial position. If the axis of rotation passes through the point $P=[a\ b\ c\ 1]$, then the transformation matrix is representable in this form:

where R may be expressed as:

$$
R=\begin{bmatrix}
\mathrm{n1}^{2}+(1-\mathrm{n1}^{2}) \cos \phi & \mathrm{n1}\,\mathrm{n2}(1-\cos \phi)+\mathrm{n3} \sin \phi & \mathrm{n1}\,\mathrm{n3}(1-\cos \phi)-\mathrm{n2} \sin \phi & 0 \\
\mathrm{n1}\,\mathrm{n2}(1-\cos \phi)-\mathrm{n3} \sin \phi & \mathrm{n2}^{2}+(1-\mathrm{n2}^{2}) \cos \phi & \mathrm{n2}\,\mathrm{n3}(1-\cos \phi)+\mathrm{n1} \sin \phi & 0 \\
\mathrm{n1}\,\mathrm{n3}(1-\cos \phi)+\mathrm{n2} \sin \phi & \mathrm{n2}\,\mathrm{n3}(1-\cos \phi)-\mathrm{n1} \sin \phi & \mathrm{n3}^{2}+(1-\mathrm{n3}^{2}) \cos \phi & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

and

$$
\begin{aligned}
\mathrm{n1} &= \mathrm{q1}/\left(\mathrm{q1}^{2}+\mathrm{q2}^{2}+\mathrm{q3}^{2}\right)^{1/2} = \text{direction cosine for x-axis of rotation} \\
\mathrm{n2} &= \mathrm{q2}/\left(\mathrm{q1}^{2}+\mathrm{q2}^{2}+\mathrm{q3}^{2}\right)^{1/2} = \text{direction cosine for y-axis of rotation} \\
\mathrm{n3} &= \mathrm{q3}/\left(\mathrm{q1}^{2}+\mathrm{q2}^{2}+\mathrm{q3}^{2}\right)^{1/2} = \text{direction cosine for z-axis of rotation} \\
\bar{n} &= (\mathrm{n1}\ \mathrm{n2}\ \mathrm{n3}) = \text{unit vector for } \overline{\mathrm{Q}} \\
\overline{\mathrm{Q}} &= \text{vector defining axis of rotation} = [\mathrm{q1}\ \mathrm{q2}\ \mathrm{q3}] \\
\phi &= \text{the rotation angle about } \overline{\mathrm{Q}}
\end{aligned}
$$

A general rotation using equation (2) is effected by determining the $[x\ y\ z]$ coordinates of a point $A$ to be rotated on the object, the direction cosines of the axis of rotation [n1, n2, n3], and the angle $\phi$ of rotation about the axis, all of which are needed to define matrix [R]. Suppose, for example, that a tetrahedron ABCD, represented by the coordinate matrix below, is to be rotated about an axis of rotation RX which passes through a point $P=[5\ -6\ 3]$ and whose direction cosines are given by unit vector [n1 = 0.866, n2 = 0.5, n3 = 0.707]. The angle of rotation $\phi$ is 90 degrees (see Figure 24). The rotation matrix [R] becomes

$$
\begin{aligned}
& \begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}
\begin{matrix}
\leftarrow \mathrm{A} \\
\leftarrow \mathrm{B} \\
\leftarrow \mathrm{C} \\
\leftarrow \mathrm{D}
\end{matrix} \\
& R=\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\end{aligned}
$$
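The numerical rotation matrix can be checked against the general expression for R. The Python sketch below is a verification aid, not part of the device documentation: the function name `rotation_matrix` is an assumption, the diagonal terms are written in the standard form $\mathrm{n}i^{2}+(1-\mathrm{n}i^{2})\cos\phi$, and the direction cosines are used exactly as quoted in the text, without renormalization.

```python
import math


def rotation_matrix(n1, n2, n3, phi_degrees):
    """Build the 4 x 4 rotation matrix R for direction cosines (n1, n2, n3)."""
    c = math.cos(math.radians(phi_degrees))
    s = math.sin(math.radians(phi_degrees))
    v = 1.0 - c
    return [
        [n1 * n1 + (1 - n1 * n1) * c, n1 * n2 * v + n3 * s, n1 * n3 * v - n2 * s, 0.0],
        [n1 * n2 * v - n3 * s, n2 * n2 + (1 - n2 * n2) * c, n2 * n3 * v + n1 * s, 0.0],
        [n1 * n3 * v + n2 * s, n2 * n3 * v - n1 * s, n3 * n3 + (1 - n3 * n3) * c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Values quoted in the example: n1 = 0.866, n2 = 0.5, n3 = 0.707, rotation of 90 degrees.
for row in rotation_matrix(0.866, 0.5, 0.707, 90.0):
    print([round(value, 3) for value in row])
```

With these inputs the entries agree with the matrix [R] given above to within rounding of the third decimal place.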


(1) THIS ARROW DEPICTS THE FIRST TRANSLATION
(2) THIS ARROW DEPICTS THE $90^{\circ}$ ROTATION
(3) THIS ARROW DEPICTS THE BACK TRANSLATION

Figure 24. Sequence of Matrix Operations

The point transformation equation (2) can be expanded to include all the vertices of the tetrahedron as follows:

$$
\begin{bmatrix}
xa & ya & za & h1 \\
xb & yb & zb & h2 \\
xc & yc & zc & h3 \\
xd & yd & zd & h4
\end{bmatrix}=\begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-5 & 6 & -3 & 1
\end{bmatrix}}_{\text{translation to origin}}
\underbrace{\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\text{rotation about origin}}
\underbrace{\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
5 & -6 & 3 & 1
\end{bmatrix}}_{\text{translation back to initial position}} \tag{3}
$$

The 'ACT8837 floating-point unit can perform matrix manipulation involving multiplications and additions such as those represented by equation (1). The matrix equation (3) can be solved by using the 'ACT8837 to compute, as a first step, the product matrix of the coordinate matrix and the first translation matrix of the right-hand side of equation (3) in that order. The second step involves postmultiplying the rotation matrix by the product matrix. The third step implements the back-translation by premultiplying the matrix result from the second step by the second translation matrix of equation (3). Details of the procedure to produce a three-dimensional rotation about an arbitrary axis are explained in the following steps:

## Step 1

Translate the tetrahedron so that the axis of rotation passes through the origin. This process can be accomplished by multiplying the coordinate matrix by the translation matrix as follows:


The 'ACT8837 could compute the translated coordinates AT, BT, CT, DT as indicated above. However, an alternative method resulting in a more compact solution is presented below.

## Step 2

Rotate the tetrahedron about the axis of rotation which passes through the origin after the translation of Step 1. To implement the rotation of the tetrahedron, postmultiply the rotation matrix $[R]$ by the translated coordinate matrix from Step 1. The resultant matrix represents the rotated coordinates of the tetrahedron about the origin as follows:


## Step 3

Translate the rotated tetrahedron back to the original coordinate space. This is done by premultiplying the resultant matrix of Step 2 by the translation matrix. The following calculation produces the final coordinate matrix of the transformed object:

$$
\begin{bmatrix}
-3.072 & -2.670 & 3.324 & 1 \\
-5.208 & -3.047 & 3.932 & 1 \\
-4.732 & -1.657 & 5.264 & 1 \\
-4.458 & -1.907 & 4.044 & 1
\end{bmatrix}
$$



A more compact solution to these transformation matrices is a product matrix that combines the two translation matrices and the rotation matrix in the order shown in equation (3). Equation (3) will then take the following form:

$$
\begin{bmatrix}
xa & ya & za & h1 \\
xb & yb & zb & h2 \\
xc & yc & zc & h3 \\
xd & yd & zd & h4
\end{bmatrix}=\begin{bmatrix}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{bmatrix}\begin{bmatrix}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
-3.730 & -8.661 & 8.260 & 1
\end{bmatrix}
$$

The newly transformed coordinates resulting from the postmultiplication of the transformation matrix by the coordinate matrix of the tetrahedron can be computed using equation (1) which was cited previously:

$$
\mathrm{cij}=\sum_{k=1}^{n} \mathrm{aik} * \mathrm{xkj} \quad \text{for } i=1, \ldots, n, \quad j=1, \ldots, n \tag{1}
$$

For example, the coordinates may be computed as follows:

$$
\begin{aligned}
\mathrm{xa}=\mathrm{c11} & =\mathrm{a11} * \mathrm{x11}+\mathrm{a12} * \mathrm{x21}+\mathrm{a13} * \mathrm{x31}+\mathrm{a14} * \mathrm{x41} \\
& =2 * 0.750+(-3) *(-0.274)+3 * 1.112+1 *(-3.73) \\
& =1.5+0.822+3.336-3.73 \\
& =1.928 \\
\mathrm{ya}=\mathrm{c12} & =\mathrm{a11} * \mathrm{x12}+\mathrm{a12} * \mathrm{x22}+\mathrm{a13} * \mathrm{x32}+\mathrm{a14} * \mathrm{x42} \\
& =2 * 1.140+(-3) * 0.250+3 *(-0.513)+1 *(-8.661) \\
& =2.28-0.75-1.539-8.661 \\
& =-8.67 \\
\mathrm{za}=\mathrm{c13} & =\mathrm{a11} * \mathrm{x13}+\mathrm{a12} * \mathrm{x23}+\mathrm{a13} * \mathrm{x33}+\mathrm{a14} * \mathrm{x43} \\
& =2 * 0.112+(-3) * 1.220+3 * 0.500+1 * 8.260 \\
& =0.224-3.66+1.5+8.260 \\
& =6.324 \\
\mathrm{h1}=\mathrm{c14} & =\mathrm{a11} * \mathrm{x14}+\mathrm{a12} * \mathrm{x24}+\mathrm{a13} * \mathrm{x34}+\mathrm{a14} * \mathrm{x44} \\
& =2 * 0+(-3) * 0+3 * 0+1 * 1 \\
& =0+0+0+1 \\
& =1
\end{aligned}
$$
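The worked values for vertex A can be reproduced by forming the combined transformation matrix in software. The Python sketch below is a host-side check only; the helper `matmul` and the variable names are assumptions made here, and the matrices are the translation, rotation, and back-translation matrices of equation (3), multiplied in that order.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


T1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-5, 6, -3, 1]]    # translation to origin
R = [
    [0.750, 1.140, 0.112, 0],
    [-0.274, 0.250, 1.220, 0],
    [1.112, -0.513, 0.500, 0],
    [0, 0, 0, 1],
]                                                                  # rotation about origin
T2 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, -6, 3, 1]]     # translation back

M = matmul(matmul(T1, R), T2)          # combined transformation matrix
A_prime = matmul([[2, -3, 3, 1]], M)[0]

print([round(value, 3) for value in M[3]])     # last row of the combined matrix
print([round(value, 3) for value in A_prime])  # transformed vertex A
```

The last row of `M` reproduces the bottom row of the combined matrix above, and `A_prime` reproduces xa, ya, za, and h1.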

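The other rotated vertices are computed in a similar manner. The values listed below equal the corresponding rows of the coordinate matrix given under Step 3, that is, the rotated coordinates to which the back-translation $[5\ -6\ 3]$ has not yet been added:

$$
\begin{aligned}
& B^{\prime}=\left[\begin{array}{llll}
-5.208 & -3.047 & 3.932 & 1
\end{array}\right] \\
& C^{\prime}=\left[\begin{array}{llll}
-4.732 & -1.657 & 5.264 & 1
\end{array}\right] \\
& D^{\prime}=\left[\begin{array}{llll}
-4.458 & -1.907 & 4.044 & 1
\end{array}\right]
\end{aligned}
$$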

## Microinstructions for Sample Matrix Manipulation

The 'ACT8837 FPU can compute the coordinates for graphic objects over a broad dynamic range. Also, the homogeneous scalar factors h1, h2, h3, and h4 may be made unity due to the availability of large dynamic range. In the example presented below, some of the calculations pertaining to vertex $A^{\prime}$ are shown, but the same approach can be applied to any number of points and any vector space.

The calculations below show the sequence of operations for generating two coordinates, xa and ya, of the vertex $A^{\prime}$ after rotation. The same sequence could be continued to generate the remaining two coordinates for $A^{\prime}$ (za and h1). The other vertices of the tetrahedron, $\mathrm{B}^{\prime}, \mathrm{C}^{\prime}$, and $\mathrm{D}^{\prime}$, can be calculated in a similar way.

A microcode sequence to generate this matrix multiplication is shown in Table 31. Table 32 presents a pseudocode description of the operations, clock cycles, and register contents for a single-precision matrix multiplication using the sum-of-products sequence presented in an earlier section. Registers used include the RA and RB input registers and the product $(\mathrm{P})$ and sum ( S ) registers.

Table 31. Microinstructions for Sample Matrix Multiplication
 four more cycles the second coordinate ya is output. Each subsequent coordinate can be calculated in four cycles so the 4-tuple for vertex $A^{\prime}$ requires a total of 18 cycles to complete.

Calculations for vertices $\mathrm{B}^{\prime}$, $\mathrm{C}^{\prime}$, and $\mathrm{D}^{\prime}$ can be executed in 48 cycles, 16 cycles for each vertex. Processing time improves when the transformation matrix is reduced, i.e., when the last column has the form shown below:


Table 32. Single-Precision Matrix Multiplication (PIPES2-PIPES0 $=010$)

| CLOCK CYCLE | MULTIPLIER/ALU OPERATIONS | PSEUDOCODE |
| :---: | :---: | :---: |
| 1 | Load a11, x11 <br> SP Multiply | $a11 \rightarrow RA,\ x11 \rightarrow RB$ <br> $p1 = a11 * x11$ |
| 2 | Load a12, x21 <br> SP Multiply <br> Pass P to S | $a12 \rightarrow RA,\ x21 \rightarrow RB$ <br> $p2 = a12 * x21$ <br> $p1 \rightarrow P(p1)$ |
| 3 | Load a13, x31 <br> SP Multiply <br> Add P to S | $a13 \rightarrow RA,\ x31 \rightarrow RB$ <br> $p3 = a13 * x31,\ p2 \rightarrow P(p2)$ <br> $P(p1) + 0 \rightarrow S(p1)$ |
| 4 | Load a14, x41 <br> SP Multiply <br> Add P to S | $a14 \rightarrow RA,\ x41 \rightarrow RB$ <br> $p4 = a14 * x41,\ p3 \rightarrow P(p3)$ <br> $P(p2) + S(p1) \rightarrow S(p1+p2)$ |
| 5 | Load a11, x12 <br> SP Multiply <br> Add P to S | $a11 \rightarrow RA,\ x12 \rightarrow RB$ <br> $p5 = a11 * x12,\ p4 \rightarrow P(p4)$ <br> $P(p3) + S(p1+p2) \rightarrow S(p1+p2+p3)$ |
| 6 | Load a12, x22 <br> SP Multiply <br> Pass P to S <br> Output S | $a12 \rightarrow RA,\ x22 \rightarrow RB$ <br> $p6 = a12 * x22,\ p5 \rightarrow P(p5)$ <br> $P(p4) + S(p1+p2+p3) \rightarrow S(p1+p2+p3+p4)$ |
| 7 | Load a13, x32 <br> SP Multiply <br> Add P to S | $a13 \rightarrow RA,\ x32 \rightarrow RB$ <br> $p7 = a13 * x32,\ p6 \rightarrow P(p6)$ <br> $P(p5) + 0 \rightarrow S(p5)$ |
| 8 | Load a14, x42 <br> SP Multiply <br> Add P to S | $a14 \rightarrow RA,\ x42 \rightarrow RB$ <br> $p8 = a14 * x42,\ p7 \rightarrow P(p7)$ <br> $P(p6) + S(p5) \rightarrow S(p5+p6)$ |
| 9 | Next operands <br> Next instruction <br> Add P to S | $A \rightarrow RA,\ B \rightarrow RB$ <br> $pi = A * B,\ p8 \rightarrow P(p8)$ <br> $P(p7) + S(p5+p6) \rightarrow S(p5+p6+p7)$ |
| 10 | Next operands <br> Next instruction <br> Output S | $C \rightarrow RA,\ D \rightarrow RB$ <br> $pj = C * D,\ pi \rightarrow P(pi)$ <br> $P(p8) + S(p5+p6+p7) \rightarrow S(p5+p6+p7+p8)$ |

The h-scalars h1, h2, h3, and h4 are equal to 1. The number of clock cycles to generate each 4-tuple can then be decreased from 16 to 13 cycles. The total number of clock cycles to calculate all four vertices is reduced from 66 to 54 clocks. Figure 25 summarizes the overall matrix transformation.


Figure 25. Resultant Matrix Transformation
This microprogram can also be written to calculate sums of products with all pipeline registers enabled so that the FPU can operate in its fastest mode. Because of timing relationships, the $C$ register is used in some steps to hold the intermediate sum of products. Latency due to pipelining and chained data manipulation is 11 cycles for calculation of the first coordinate, and four cycles each for the other three coordinates.

After calculation of the first vertex, 16 cycles are required to calculate the four coordinates of each subsequent vertex. Table 33 presents the sequence of calculations for the first two coordinates, xa and ya.

Table 33. Fully Pipelined Sum of Products (PIPES2-PIPES0 = 000) (Bus or Register Contents Following Each Rising Clock Edge)

| CLOCK CYCLE | I BUS | DA BUS | DB BUS | I REG | RA REG | RB REG | MUL PIPE | ALU PIPE | P REG | S REG | C REG | Y BUS |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 0 | Mul | x11 | a11 |  |  |  |  |  |  |  |  |  |
| 1 | Mul | x21 | a12 | Mul | x11 | a11 |  |  |  |  |  |  |
| 2 | Chn | x31 | a13 | Mul | x21 | a12 | p1 |  |  |  |  |  |
| 3 | Mul | x41 | a14 | Chn | x31 | a13 | p2 |  | p1 |  |  |  |
| 4 | Chn | x12 | a11 | Mul | x41 | a14 | p3 | s1 | p2 |  |  |  |
| 5 | Chn | x22 | a12 | Chn | x12 | a11 | p4 | † | p3 | s1 | p2 |  |
| 6 | Chn | x32 | a13 | Chn | x22 | a12 | p5 | s2 | p4 | † | p2 |  |
| 7 | Chn | x42 | a14 | Chn | x32 | a13 | p6 | s3 | p5 | s2 | p2 |  |
| 8 | Chn | x13 | a11 | Chn | x42 | a14 | p7 | s4 | p6 | s3 | s2 |  |
| 9 | Chn | x23 | a12 | Chn | x13 | a11 | p8 | xa | p7 | s4 | p6 |  |
| 10 | Chn | x33 | a13 | Chn | x23 | a12 | p9 | s5 | p8 | xa | p6 | xa |
| 11 | Chn | x43 | a14 | Chn | x33 | a13 | p10 | s6 | p9 | s5 | p6 |  |
| 12 | Chn | x14 | a11 | Chn | x43 | a14 | p11 | s7 | p10 | s6 | s5 |  |
| 13 | Chn | x24 | a12 | Chn | x14 | a11 | p12 | ya | p11 | s7 | p10 |  |
| 14 | Chn | x34 | a13 | Chn | x24 | a12 | p13 | s8 | p12 | ya | p10 | ya |
| 15 | Chn | x44 | a14 | Chn | x34 | a13 | p14 | s9 | p13 | s8 | p10 |  |

† Contents of this register are not valid during this cycle.

Products in Table 33 are numbered according to the clock cycle in which the operands and instruction were loaded into the RA, RB, and I register, and execution of the instruction began. Sums indicated in Table 33 are listed below:

$$
\begin{array}{lll}
s 1=p 1+0 & s 5=p 5+p 7 & s 9=p 10+p 12 \\
s 2=p 1+p 3 & s 6=p 6+p 8 & x a=p 1+p 2+p 3+p 4 \\
s 3=p 2+p 4 & s 7=p 9+0 & y a=p 5+p 6+p 7+p 8 \\
s 4=p 5+0 & s 8=p 9+p 11 &
\end{array}
$$
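The pairing of products in the list above can be checked with a short sketch. The Python below is an illustration only and is not part of the 'ACT8837 microcode: the function name `chained_row` and the sample operands are assumptions, and ordinary Python floats stand in for the device registers. It forms the four products for one coordinate and adds them in the order the list gives, on the assumption that the final coordinate is the sum of s2 and s3.

```python
def chained_row(a_row, x_col):
    """Sum of products for one output coordinate, in the pairing order of Table 33."""
    p1, p2, p3, p4 = (a * x for a, x in zip(a_row, x_col))
    s1 = p1 + 0.0        # s1 = p1 + 0
    s2 = s1 + p3         # s2 = p1 + p3
    s3 = p2 + p4         # s3 = p2 + p4 (Table 33 shows p2 waiting in the C register)
    return s2 + s3       # xa = p1 + p2 + p3 + p4


# Hypothetical operands: one matrix row against one vertex 4-tuple.
xa = chained_row([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
print(xa)
```

Splitting the sum into two interleaved partial sums is what lets a new multiply enter the pipeline every cycle while earlier products are still being accumulated.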

## SAMPLE MICROPROGRAMS FOR BINARY DIVISION AND SQUARE ROOT

The SN74ACT8837 Floating Point Unit supports binary division and square root calculations using the Newton-Raphson algorithm. The 'ACT8837 performs these calculations by executing sequences of floating-point operations according to the control settings contained in specific microprogrammed routines. This implementation of the Newton-Raphson algorithm requires that a seed ROM provide values for the first approximations of the reciprocals of the divisors.

This application note presents several microprograms for floating-point division and square root using the Newton-Raphson algorithm. Each sample program is analyzed briefly to show details of the floating-point procedures being performed.

## Binary Division Using the Newton-Raphson Algorithm

Binary division can be performed as an iterative procedure using the Newton-Raphson algorithm. For a dividend $A$, divisor $B$, and quotient $Q$, this procedure calculates a value for $1 / B$ which is then used to evaluate the expression $Q=A * 1 / B$. The calculation can be performed with either single- or double-precision operands, and examples of each precision are shown.

The basic algorithm calculates the value of a quotient Q by approximating the reciprocal of the divisor $B$ to adequate precision and then multiplying the dividend $A$ by the approximation of the reciprocal:

$$
\begin{aligned}
Q = A / B = A * X_n, \text{ where } X_n & = \text{the value of } X \text{ after the } n\text{th iteration} \\
n & = \text{the number of iterations to achieve the desired precision}
\end{aligned}
$$

Intermediate values of $X$ are calculated using the following expression:

$$
X_{i+1} = X_i * (2 - B * X_i), \text{ where } X_0 \text{ approximates } 1/B \text{ in the range } 0 < X_0 < 2/B
$$
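A short derivation, added here as an aid and not taken from the original text, suggests why the seed must lie in that range. Define the relative error $e_i = 1 - B * X_i$ (the symbol $e_i$ is introduced only for this sketch). Then:

$$
\begin{aligned}
X_i &= (1 - e_i) / B \\
X_{i+1} &= X_i * (2 - B * X_i) = \frac{(1 - e_i)(1 + e_i)}{B} = \frac{1 - e_i^{2}}{B} \\
e_{i+1} &= 1 - B * X_{i+1} = e_i^{2}
\end{aligned}
$$

The error is squared on every pass, so the number of correct bits roughly doubles per iteration. The sequence converges only when $|e_0| < 1$, that is, $0 < B * X_0 < 2$, which for a positive divisor is the range $0 < X_0 < 2/B$ given above.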

To illustrate a program using the Newton-Raphson algorithm, the sequence of calculations is presented in detail. For double-precision operations, three iterations are needed to achieve adequate precision in the value of $1 / B$. A value for the seed X0 (approximately equal to $1 / B$) is assumed to be given, and the following operations are performed to evaluate $Q$ from double-precision inputs:

$$
\begin{aligned}
X_1 &= X_0 (2 - B * X_0) \\
X_2 &= X_1 (2 - B * X_1) = X_0 (2 - B * X_0) * [2 - B * X_0 (2 - B * X_0)] \\
X_3 &= X_2 (2 - B * X_2) \\
    &= X_0 (2 - B * X_0) * [2 - B * X_0 (2 - B * X_0)] * \{2 - B * X_0 (2 - B * X_0) * [2 - B * X_0 (2 - B * X_0)]\} \\
Q &= A * 1 / B = A * X_3 \\
A / B &= A * X_0 (2 - B * X_0) * [2 - B * X_0 (2 - B * X_0)] * \{2 - B * X_0 (2 - B * X_0) * [2 - B * X_0 (2 - B * X_0)]\}
\end{aligned}
$$

Table 34 presents decimal and hexadecimal values for $A$, $B$, and X0, which are used in the sample calculation. The computed value of the quotient $Q$ is also included, showing the representations of the results of this sample division.

Table 34. Sample Data Values and Representations

| TERM | DECIMAL VALUE | DECIMAL MANTISSA * 2^EXPONENT | IEEE HEXADECIMAL |
| :---: | :---: | :--- | :---: |
| A | 22 | $1.375 * 2^{4}$ | 40360000 00000000 |
| B | 7 | $1.75 * 2^{2}$ | 401C0000 00000000 |
| X0 | $1 / 7$ | $1.140625 * 2^{-3}$ | 3FC24000 00000000 |
| Q | $22 / 7$ | $1.5714285714285713 * 2^{1}$ | 40092492 49249249 |
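The hexadecimal column of Table 34 can be cross-checked against the decimal columns. The sketch below is an illustration only; the helper name `hex_to_double` is an assumption, and it uses Python's standard `struct` module to interpret each 64-bit pattern as a big-endian IEEE double.

```python
import struct


def hex_to_double(hex_words):
    """Decode an IEEE 754 double written as hex text (spaces are ignored)."""
    return struct.unpack(">d", bytes.fromhex(hex_words.replace(" ", "")))[0]


# Hexadecimal representations copied from Table 34.
table_34 = {
    "A": "40360000 00000000",
    "B": "401C0000 00000000",
    "X0": "3FC24000 00000000",
    "Q": "40092492 49249249",
}
decoded = {term: hex_to_double(h) for term, h in table_34.items()}

for term, value in decoded.items():
    print(term, value)
```

The seed decodes to 0.142578125, which is about 0.2% below 1/7; this is the accuracy expected of a seed with only a few significant mantissa bits.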

In Table 35, the sequence and timing of this procedure is shown exactly as performed by the 'ACT8837. This example shows the steps in a double-precision division requiring three iterations to achieve the desired accuracy. In this table each operation is sequenced according to the clock cycles during which the instruction inputs for that operation are presented at the pins of the 'ACT8837. Operations are accompanied by a pseudocode summary of the operations performed by the 'ACT8837 and the clock cycle when an operand is available or a result is valid.

Each line of pseudocode indicates the operands being used, the operations being performed, the registers involved, and the clock cycles when the results appear. Each register is represented by its usual abbreviation (RA, RB, P, S, or C) followed by the number of the clock cycle when an operand will be valid or available at the register. For example, "P.4" refers to the contents of the Product Register after the fourth clock cycle.

Table 35. Binary Division Using the Newton-Raphson Algorithm

| CLOCK CYCLES | OPERATIONS | PSEUDOCODE |
| :---: | :---: | :--- |
| 1,2 | B * X0 | B → RA.2, X0 → RB.2 <br> RA.2 * RB.2 → P.4 |
| 3,4 | 2 - B * X0 | 2 - P.4 → S.6 |
| 5,6 | X1 = X0(2 - B * X0) | RB.2 * S.6 → P.8 |
| 7,8 | B * X1 | RA.2 * P.8 → P.10 |
| 9,10 | 2 - B * X1 | P.8 → C.9, 2 - P.10 → S.12 |
| 11,12 | X2 = X1(2 - B * X1) | C.9 * S.12 → P.14 |
| 13,14 | B * X2 | RA.2 * P.14 → P.16 |
| 15,16 | 2 - B * X2 | P.14 → C.15, 2 - P.16 → S.18 |
| 17,18 | X3 = X2(2 - B * X2) | A → RA.18, C.15 * S.18 → P.20 |
| 19,20 | A * X3 | RA.18 * P.20 → P.22 |
| 21,22 | Output MSH | P.22.MSH → Y |

The sequence of operations can be microcoded for execution exactly as listed in the table above. Sample microprograms (with data and parity fields provided) are given below. To make the programs easier to follow, comment lines have been included to indicate clock timing, calculation performed by the instructions being loaded, and operations being represented, in the same pseudocode as in the preceding table. The fields in the microinstruction sequences presented below are arranged in the following order:

d h h hhh h h h hh h h h h h h h h h h h h h h hhhhhhh h hhhhhh h h h

All fields in the sample microcode sequences (except for line numbers) are represented as hexadecimal numbers. Line numbers are the only decimal numbers in the samples.

## Single-Precision Newton-Raphson Binary Division

Use of the Newton-Raphson algorithm is similar for both single- and double-precision operands. However, for implementations which handle both single- and double-precision division, it may be preferable to use a double-precision seed ROM, converting the double-precision seeds to single precision when necessary.

The following sample program involves conversion of a double-precision seed X0 for use in single-precision division. Since $B$ is given as a single-precision number, it must be converted to double precision in order to address a double-precision seed ROM. Then the seed X0, which is double precision, must be converted to single precision for the actual calculation.

Two iterations are used in the single-precision example. Thus, the formula $Q=A * 1 / B$ may be rewritten with $n=2$:

$$
Q = A * 1 / B = A * X_2
$$

where $X_2 = X_1 * (2 - B * X_1)$ and $X_1 = X_0 * (2 - B * X_0)$, so that

$$
A * 1 / B = A * X_0 * (2 - B * X_0) * [2 - B * X_0 * (2 - B * X_0)]
$$
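The two-iteration expression can be tried in software with every intermediate result rounded to single precision. The sketch below is an illustration only: the names `to_single` and `divide_single` are assumptions, and rounding through Python's `struct` module only approximates the rounding behavior of the 'ACT8837.

```python
import struct


def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def divide_single(a, b, x0_double, iterations=2):
    """Single-precision A / B from a double-precision seed, as A * Xn."""
    x = to_single(x0_double)             # seed converted from double to single
    a, b = to_single(a), to_single(b)
    for _ in range(iterations):
        bx = to_single(b * x)            # B * Xi
        x = to_single(x * to_single(2.0 - bx))   # Xi * (2 - B * Xi)
    return to_single(a * x)


# Same operands as the worked example: 22 / 7 with the seed from Table 34.
q_single = divide_single(22.0, 7.0, 0.142578125)
print(q_single)
```

With this seed the error after two iterations is already far below single-precision resolution, which is consistent with the text's choice of $n=2$ for single precision and $n=3$ for double precision.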

Table 36 presents a single-precision division using a double-precision seed ROM. This example divides 22/7.

Table 36. Single-Precision Newton-Raphson Binary Division


Table 36. Single-Precision Newton-Raphson Binary Division (Concluded)
;
;Lines 13-14 Calculation: B * X1
; Operation: RA.8 * P.14 -> P.16
13 0 0 040 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
14 1 0 040 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 15-16 Calculation: 2 - (B * X1)
; Operations: P.14 -> C.15, 2 - P.16 -> S.18
15 0 0 202 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
16 1 0 202 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 17-18 Calculation: X2 = X1(2 - B * X1)
; Operations: A -> RA.18, C.15 * S.18 -> P.20
17 0 0 040 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 41B00000 00000000 0 0
18 1 0 040 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 41B00000 00000000 0 0
;
;Lines 19-20 Calculation: A * X2
; Operations: RA.18 * P.20 -> P.22
19 0 0 040 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
20 1 0 040 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 21-22 Operation: P.22 -> Y
;
21 0 0 020 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
22 1 0 020 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0

## Double-Precision Newton-Raphson Binary Division

If the value of $B$ is given as a double-precision number and X0 is looked up in a double-precision seed ROM, no conversions are required prior to performing a double-precision division using the Newton-Raphson algorithm. Three iterations are used in the double-precision example ($n=3$). The following formula represents the sequence of calculations to be performed:

$$
\begin{aligned}
A / B = {} & A * X_0 * (2 - B * X_0) * [2 - B * X_0 * (2 - B * X_0)] \\
& * (2 - B * X_0 * (2 - B * X_0) * [2 - B * X_0 * (2 - B * X_0)])
\end{aligned}
$$

Table 37 shows a double-precision division using a double-precision seed ROM. The example divides 22/7.

Table 37. Double-Precision Newton-Raphson Binary Division
;
;Lines 1-4 Calculation: B * X0
; Operations: B -> RA.4, X0 -> RB.4, RA.4 * RB.4 -> P.8
;
01 0 0 1C0 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 3FC24000 00000000 0 0
02 1 0 1C0 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 3FC24000 00000000 0 0
03 0 0 1C0 0 0 2 FF 0 0 1 1 1 1 0 0 0 0 3 1 1 3 401C0000 00000000 0 0
04 1 0 1C0 0 0 2 FF 0 0 1 1 1 1 0 0 0 0 3 1 1 3 401C0000 00000000 0 0
;
;Lines 5-8 Calculation: 2 - (B * X0)
; Operation: 2 - P.8 -> S.12
;
05 0 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
06 1 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
07 0 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
08 1 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 9-12 Calculation: X1 = X0(2 - B * X0)
; Operation: RB.4 * S.12 -> P.16
;
09 0 0 1C0 0 0 2 BF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
10 1 0 1C0 0 0 2 BF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
11 0 0 1C0 0 0 2 BF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
12 1 0 1C0 0 0 2 BF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0

Table 37. Double-Precision Newton-Raphson Binary Division (Continued)
;
;Lines 13-16 Calculation: B * X1
; Operations: RA.4 * P.16 -> P.20
13 0 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
14 1 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
15 0 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
16 1 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 17-20 Calculation: 2 - (B * X1)
; Operations: P.16 -> C.18, 2 - P.20 -> S.24
17 0 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
18 1 1 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
19 0 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
20 1 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 21-24 Calculation: X2 = X1(2 - B * X1)
21 0 0 1C0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
22 1 0 1C0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
23 0 0 1C0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
24 1 0 1C0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 25-28 Calculation: B * X2
;
25 0 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
26 1 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
27 0 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
28 1 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 29-32 Calculation: 2 - (B * X2)
; Operations: P.28 -> C.30, 2 - P.32 -> S.36
29 0 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
30 1 1 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
31 0 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
32 1 0 382 0 0 2 FB 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0

Table 37. Double-Precision Newton-Raphson Binary Division (Concluded)

```
;
;Lines 33-36 Calculation: X3 = X2(2 - B * X2)
; Operations: A -> RA.36, C.30 * S.36 -> P.40
33 0 0 1C0 0 3 2 9F 0 0 1 1 1 1 0 0 0 0 3 1 1 3 40360000 00000000 0 0
34 1 0 1C0 0 3 2 9F 0 0 1 1 1 1 0 0 0 0 3 1 1 3 40360000 00000000 0 0
35 0 0 1C0 0 3 2 9F 0 0 1 1 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
36 1 0 1C0 0 3 2 9F 0 0 1 1 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 37-40 Calculation: A * X3
; Operations: RA.36 * P.40 -> P.44
37 0 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
38 1 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
39 0 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
40 1 0 1C0 0 0 2 EF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 41-44 Operation: P.44.MSH -> Y
41 0 0 120 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
42 1 0 120 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
43 0 0 120 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
44 1 0 120 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Line 45 Operation: P.44.LSH -> Y
```

## Binary Square Root Using the Newton-Raphson Algorithm

Square roots may be calculated iteratively using the Newton-Raphson algorithm. The procedure is similar to Newton-Raphson division and involves evaluating the following expression:

$$
A = B * X_n
$$

where $X_n$ is the value of $X$ after the $n$th iteration, given

$$
\begin{aligned}
X_{i+1} & = 0.5 * X_i * [3 - B * (X_i^{2})] \\
X_0 & = \text{a guess at } 1 / \operatorname{sqrt}(B) \text{ where } 0 < X_0 < \operatorname{sqrt}(3 / B) \\
\text{and } n & = \text{number of iterations to achieve the desired precision}
\end{aligned}
$$

## Single-Precision Square Root Using a Double-Precision Seed ROM

When the value of $B$ is given in single precision, it must be converted to a double-precision number before it can be used to address a double-precision seed ROM. Since the seed X0 is stored as a double-precision number, it must first be converted to single precision before it is used in the calculation.

Two iterations ($n=2$) are used in a single-precision calculation, so the following expression for $\operatorname{sqrt}(B)$ is to be evaluated:

$$
\begin{aligned}
& A = B * X_2 \\
& \text{where } X_2 = 0.5 * X_1 * [3 - B * (X_1^{2})] \\
& \text{and } X_1 = 0.5 * X_0 * [3 - B * (X_0^{2})] \\
& A = B * 0.5 * 0.5 * X_0 * [3 - B * (X_0^{2})] \\
& \quad * [3 - B * (0.5 * X_0 * [3 - B * (X_0^{2})])^{2}]
\end{aligned}
$$

Table 38. Single-Precision Binary Square Root

```
;
;Lines 1-2 Calculation: B s.p. -> d.p.
; Operations: B -> RA.1, (s.p. to d.p.)(RA.1) -> S.2
01 0 0 026 1 1 3 FF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40000000 00000000 0 0
02 1 0 026 1 1 3 FF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40000000 00000000 0 0
;
;Lines 3-4 Calculation: Load X0
; Operation: X0 -> RA.4
03 0 0 126 1 0 2 FF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE6A000 00000000 0 0
04 1 0 126 1 0 2 FF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE6A000 00000000 0 0
;
;Lines 5-6 Calculation: X0 d.p. -> s.p.
; Operations: (d.p. to s.p.)(RA.4) -> S.6
05 0 0 126 1 0 2 FF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE6A000 00000000 0 0
06 1 0 126 1 0 2 FF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE6A000 00000000 0 0
;
;Lines 7-8 Calculation: Load B, B * X0
; Operations: S.6 -> C.7, B -> RB.8, RB.8 * C.7 -> P.10
07 0 1 040 1 0 2 7F 0 0 0 1 0 1 0 0 0 0 3 1 1 3 40000000 00000000 0 0
08 1 0 040 1 0 2 7F 0 0 0 1 0 1 0 0 0 0 3 1 1 3 40000000 00000000 0 0
;
;Lines 9-10 Calculation: B * X0^2
; Operations: P.10 * C.7 -> P.12, 3 -> RA.10 -> S.12
09 0 0 260 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40400000 00000000 0 0
10 1 0 260 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40400000 00000000 0 0
;
;Lines 11-12 Calculation: 3 - (B * X0^2)
; Operation: S.12 - P.12 -> S.14
11 0 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
12 1 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0

Table 38. Single-Precision Binary Square Root (Continued)
;
;Lines 13-14 Calculation: X0 * (3 - (B * X0^2))
; Operations: C.7 * S.14 -> P.16, 1/2 -> RA.14 -> S.16
13 0 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3F000000 00000000 0 0
14 1 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3F000000 00000000 0 0
;
;Lines 15-16 Calculation: 1/2 * X0 * (3 - (B * X0^2)) -> X1
; Operations: S.16 * P.16 -> P.18, 0 -> RA.16,
;             RA.16 + RB.8 -> S.18
15 0 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
16 1 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 17-18 Calculation: B * X1
; Operations: S.18 * P.18 -> P.20
17 0 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
18 1 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 19-20 Calculation: B * X1^2
; Operations: P.18 -> C.19, P.20 * C.19 -> P.22,
;             3 -> RA.20 -> S.22
19 0 1 260 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40400000 00000000 0 0
20 1 0 260 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40400000 00000000 0 0
;
;Lines 21-22 Calculation: 3 - (B * X1^2)
; Operations: S.22 - P.22 -> S.24
21 0 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
22 1 0 003 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 23-24 Calculation: X1 * (3 - (B * X1^2))
; Operations: C.19 * S.24 -> P.26, 1/2 -> RA.24 -> S.26
23 0 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3F000000 00000000 0 0
24 1 0 260 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3F000000 00000000 0 0

Table 38. Single-Precision Binary Square Root (Concluded)
;
25 0 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
26 1 0 240 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 27-28 Calculation: B * X2 -> A
; Operations: S.28 * P.28 -> P.30
27 0 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
28 1 0 040 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 29-30 Calculation: NOP
; Operation: Y -> Output
29 0 1 00A 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
30 1 0 00A 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
\[
\begin{aligned}
A = {} & B * 0.5 * 0.5 * 0.5 * X_0 * [3 - B * (X_0^{2})] \\
& * [3 - B * (0.5 * X_0 * [3 - B * (X_0^{2})])^{2}] \\
& * [3 - B * (0.5 * 0.5 * X_0 * [3 - B * (X_0^{2})] \\
& \quad * [3 - B * (0.5 * X_0 * [3 - B * (X_0^{2})])^{2}])^{2}]
\end{aligned}
\]

Table 39. Double-Precision Binary Square Root
;
;Lines 1-4 Calculations: Load B, Load X0, B * X0
; Operations: B -> RB.4, X0 -> RA.4, RA.4 * RB.4 -> P.8,
;             RA.4 -> S.8 -> C.10
01 0 0 3E0 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 40000000 00000000 0 0
02 1 0 3E0 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 40000000 00000000 0 0
03 0 0 3E0 0 0 2 FF 0 0 1 1 1 1 0 0 0 0 3 1 1 3 3FE6A000 00000000 0 0
04 1 0 3E0 0 0 2 FF 0 0 1 1 1 1 0 0 0 0 3 1 1 3 3FE6A000 00000000 0 0
;
;Lines 5-8 Calculations: B * X0^2
; Operations: P.8 * S.8 -> P.12, 3 -> RA.8 -> S.12
05 0 0 3E0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
06 1 0 3E0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
07 0 0 3E0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40080000 00000000 0 0
08 1 0 3E0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40080000 00000000 0 0
;
;Lines 9-12 Calculations: 3 - (B * X0^2)
; Operations: S.12 - P.12 -> S.16
09 0 0 183 0 0 2 FA 0 0 0 0 0 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
10 1 1 183 0 0 2 FA 0 0 0 0 0 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
11 0 0 183 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
12 1 0 183 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 13-16 Calculations: X0 * (3 - (B * X0^2))
; Operations: C.10 * S.16 -> P.20, 1/2 -> RA.16 -> S.20
13 0 0 3E0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
14 1 0 3E0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
15 0 0 3E0 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE00000 00000000 0 0
16 1 0 3E0 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE00000 00000000 0 0

Table 39. Double-Precision Binary Square Root (Continued)
```

;
;Lines 17-20 Calculations: 1/2 * X0 * (3 - (B * X0^2)) -> X1
; Operations: S.20 * P.20 -> P.24 -> C.25, 0 -> RA.20,
;             RA.20 + RB.4 -> S.24
18 1 0 3C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
19 0 0 3C0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
20 1 0 3C0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 21-24 Calculations: B * X1
; Operations: S.24 * P.24 -> P.28
21 0 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
22 1 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
23 0 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
24 1 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 25-28 Calculations: B * X1^2
; Operations: P.28 * C.25 -> P.32, 3 -> RA.28 -> S.32
25 0 1 3E0 0 0 2 6F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
26 1 0 3E0 0 0 2 6F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
27 0 0 3E0 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40080000 00000000 0 0
28 1 0 3E0 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40080000 00000000 0 0
;
;Lines 29-32 Calculations: 3 - (B * X1^2)
; Operations: S.32 - P.32 -> S.36
29 0 0 183 0 0 2 FA 0 0 0 0 0 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
30 1 0 183 0 0 2 FA 0 0 0 0 0 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
31 0 0 183 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
32 1 0 183 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0

```

Table 39. Double-Precision Binary Square Root (Continued)
```

;

```
;Lines 33-36 Calculations: X1 * (3 - (B * X1^2))
; Operations: C.25 * S.36 -> P.40, 1/2 -> RA.36 -> S.40
33 0 0 3E0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
34 1 0 3E0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
35 0 0 3E0 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE00000 00000000 0 0
36 1 0 3E0 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE00000 00000000 0 0
;
;Lines 37-40 Calculations: 1/2 * X1 * (3 - (B * X1^2)) -> X2
; Operations: S.40 * P.40 -> P.44 -> C.45, 0 -> RA.40,
;             RA.40 + RB.4 -> S.44
37 0 0 3C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
38 1 0 3C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
39 0 0 3C0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
40 1 0 3C0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 41-44 Calculations: B * X2
; Operations: S.44 * P.44 -> P.48
41 0 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
42 1 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
43 0 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
44 1 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 45-48 Calculations: B * X2^2
; Operations: P.48 * C.45 -> P.52, 3 -> RA.48 -> S.52
45 0 1 3E0 0 0 2 6F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
46 1 0 3E0 0 0 2 6F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
47 0 0 3E0 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40080000 00000000 0 0
48 1 0 3E0 0 0 2 6F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 40080000 00000000 0 0

Table 39. Double-Precision Binary Square Root (Continued)
;
;Lines 49-52 Calculations: 3 - (B * X2^2)
; Operations: S.52 - P.52 -> S.56
49 0 0 183 0 0 2 FA 0 0 0 0 0 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
50 1 0 183 0 0 2 FA 0 0 0 0 0 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
51 0 0 183 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
52 1 0 183 0 0 2 FA 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 53-56 Calculations: X2 * (3 - (B * X2^2))
; Operations: C.45 * S.56 -> P.60, 1/2 -> RA.56 -> S.60
53 0 0 3E0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
54 1 0 3E0 0 0 2 9F 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
55 0 0 3E0 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE00000 00000000 0 0
56 1 0 3E0 0 0 2 9F 0 0 1 0 1 1 0 0 0 0 3 1 1 3 3FE00000 00000000 0 0
;
;Lines 57-60 Calculations: 1/2 * X2 * (3 - (B * X2^2)) -> X3
; Operations: S.60 * P.60 -> P.64, 0 -> RA.60,
;             RA.60 + RB.4 -> S.64
57 0 0 3C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
58 1 0 3C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
59 0 0 3C0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
60 1 0 3C0 0 0 2 AF 0 0 1 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Lines 61-64 Calculations: B * X3 -> A
; Operations: S.64 * P.64 -> P.68 -> Y.MSH
;
61 0 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
62 1 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
63 0 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
64 1 0 1C0 0 0 2 AF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0

Table 39. Double-Precision Binary Square Root (Concluded)
```

;
;Lines 65-68 Calculation: NOP
; Operation: Y.MSH -> Output
65 0 1 18A 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
66 1 0 18A 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
67 0 0 18A 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
68 1 0 18A 0 0 2 FF 0 0 0 0 1 1 0 0 0 0 3 1 1 3 00000000 00000000 0 0
;
;Line 69 Calculation: NOP
; Operation: Y.LSH -> Output
;
69 0 0 18A 0 0 2 FF 0 0 0 0 1 0 0 0 0 0 3 1 1 3 00000000 00000000 0 0
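The square-root recurrence that Tables 38 and 39 implement can be summarized in a few lines of software. The sketch below is an illustration only: the function name `sqrt_newton_raphson` is an assumption, Python floats stand in for the device registers, and the operands B = 2 and X0 = 0.70703125 are read from the data fields 40000000 00000000 and 3FE6A000 00000000 that appear in the Table 39 listing.

```python
import math


def sqrt_newton_raphson(b, x0, iterations):
    """Return B * Xn, where Xn approximates 1/sqrt(B) after n iterations."""
    x = x0
    for _ in range(iterations):
        # X(i+1) = 0.5 * Xi * [3 - B * Xi^2]; B * Xi is formed first, then
        # multiplied by Xi again, as in the microcode.
        x = 0.5 * x * (3.0 - b * x * x)
    return b * x


# Three iterations, as in the double-precision example.
root = sqrt_newton_raphson(2.0, 0.70703125, iterations=3)
error = abs(root - math.sqrt(2.0))
print(root, error)
```

As with division, the error shrinks roughly quadratically on each pass, which is why two iterations are used for single precision and three for double precision.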

```

\section*{GLOSSARY}

Biased exponent - The true exponent of a floating point number plus a constant called the exponent field's excess. In IEEE data format, the excess or bias is 127 for single-precision numbers and 1023 for double-precision numbers.

Denormalized number (denorm) - A number with an exponent equal to zero and a nonzero fraction field, with the implicit leading (leftmost) bit of the fraction field being 0 .

NaN (not a number) - Data that has no mathematical value. The 'ACT8837/'ACT8847 produces a NaN whenever an invalid operation such as \(0 * \infty\) is executed. The output format for a NaN is an exponent field of all ones, a fraction field of all ones, and a zero sign bit. Any number with an exponent of all ones and a nonzero fraction is treated as a NaN on input.

Normalized number - A number in which the exponent field is between 1 and 254 (single precision) or 1 and 2046 (double precision). The implicit leading bit is 1.

Wrapped number - A number created by normalizing a denormalized number's fraction field and subtracting from the exponent the number of shift positions required to do so. The exponent is encoded as a two's complement negative number.
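The glossary categories can be checked against the bit fields of a double-precision value. The following Python sketch is an illustration only: the helper names `double_fields` and `classify_double` are hypothetical, and the zero and infinity cases are taken from the IEEE 754 standard rather than from the glossary above.

```python
import struct

EXPONENT_BIAS = 1023        # excess for double-precision numbers
EXPONENT_ALL_ONES = 0x7FF   # 11-bit exponent field with every bit set


def double_fields(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, 52-bit fraction) of a double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)


def classify_double(x: float) -> str:
    """Name the glossary category that the value falls into."""
    _, exponent, fraction = double_fields(x)
    if exponent == EXPONENT_ALL_ONES:
        # all-ones exponent: NaN if the fraction is nonzero
        return "NaN" if fraction else "infinity"
    if exponent == 0:
        # zero exponent: denormalized if the fraction is nonzero
        return "denormalized" if fraction else "zero"
    return "normalized"     # biased exponent between 1 and 2046
```

For example, `double_fields(1.0)` gives a biased exponent of 1023, which is a true exponent of 0 plus the excess.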

\section*{Implementing a Double-Precision Seed ROM}

The seed ROM assumed in the previous microcode examples is a double-precision seed ROM containing both division and square root seeds. Six chips are necessary to build this seed ROM: five \(4 \times 4096\) registered PROMs and one latch (ordinarily implemented in a PAL). Figure 26 shows a sample implementation for a double-precision seed ROM.

Three of the PROMs are for generating the exponent part of the seed. All 11 exponent lines are necessary to accurately determine the exponent of the seed. There are 12 address lines in a \(4 \times 4096\) PROM, so the last address line can be used for a microcode bit that tells whether a divide or square root seed is being read. Since there are only 11 bits in the exponent and three PROMs are used, there are 12 output bits but one bit is not used. The equations giving the contents of the PROMs are given in a later section.

The other two PROMs generate the mantissa part of the seed. One address line of the PROMs is used for the microcode bit telling whether a divide or square root seed is to be used. For a square root seed, the least significant bit of the exponent is needed in generating the mantissa seed. Therefore, another address line of the PROMs is used by the least significant exponent bit. This leaves 10 address lines to be used to look up the mantissa seed. Since there are eight output bits from the two PROMs, an eight-bit seed is generated.

The sign bit of \(B\) needs to be preserved for use when the seed is read. In the case of binary division, this requirement is obvious. In the square root calculation, the sign bit of \(B\) should always be zero. This condition should be tested by the microprogram.

Since every real square root has two answers, normally the positive answer is assumed. However, since the sign of \(B\) is meaningless to Newton-Raphson unless it is positive, the example microprograms assume that a negative \(B\) simply means that the negative of the square root of \(B\) is the desired answer instead of the positive root. This is accomplished by using the absolute value of \(B\) in all computations except for looking up the seed. If the seed is negative, then the answer generated will be the negative root.

\section*{PROM Contents}

Because one address line of the PROMs selects divide or square root, the PROMs can be considered to be divided functionally into two halves: the divide half and the square root half. Each functional half is discussed separately in the sections below.

\section*{Divide PROMs}

The exponent part of the seed is defined in the following manner. Assuming that \(B=m *\left(2^{e}\right)\) and \(X_{0}=m^{\prime} *\left(2^{e^{\prime}}\right)\), \(e^{\prime}\) is computed as \(e^{\prime}=-e\). Using the definition of an IEEE number, the value of \(m\) can be represented as a number within the following interval: \(1 \leq m<2\).


Figure 26. IEEE Double-Precision Seed ROM for Newton-Raphson Division and Square Root

This range of values of \(m\) can be subdivided into two cases:
\[
\mathrm{m}=1, \text { or } 1<\mathrm{m}<2
\]

Since \(m^{\prime}\) is computed as \(m^{\prime}=1 / m\), the range of \(m^{\prime}\) will be
\[
m^{\prime}=1, \text { or } 1 / 2<m^{\prime}<1
\]

To be represented as a normalized IEEE number, \(\mathrm{m}^{\prime \prime}\) would be
\[
\begin{equation*}
m^{\prime \prime}=m^{\prime} *\left(2^{1}\right)=2 / m \tag{1}
\end{equation*}
\]

This would make the range of \(\mathrm{m}^{\prime \prime}\)
\[
m^{\prime \prime}=2 \text {, or } 1<m^{\prime \prime}<2
\]

This is still not quite in the range of a valid IEEE number; however, \(\mathrm{m}^{\prime \prime}=2\) only when \(m=1\). Therefore, \(m^{\prime \prime}\) can be forced to be just less than 2 in this case.

Since \(X_{0}=m^{\prime} *\left(2^{e^{\prime}}\right)\), to use \(m^{\prime \prime}\) in the PROMs, we must have an \(e^{\prime \prime}\) in the exponent such that \(X_{0}=m^{\prime \prime} *\left(2^{e^{\prime \prime}}\right)\). This is true for \(e^{\prime \prime}=e^{\prime}-1\). Since \(X_{0}=m^{\prime \prime} *\left(2^{e^{\prime \prime}}\right)\), the following substitution can be made:
\[
\begin{aligned}
X_{0} & =\left(m^{\prime} *\left(2^{1}\right)\right) *\left(2^{\left(e^{\prime}-1\right)}\right) \\
& =m^{\prime} *\left(2^{1}\right) *\left(2^{e^{\prime}}\right) *\left(2^{(-1)}\right) \\
& =m^{\prime} *\left(2^{e^{\prime}}\right) *\left(2^{(1-1)}\right) \\
& =m^{\prime} *\left(2^{e^{\prime}}\right) *\left(2^{0}\right) \\
& =m^{\prime} *\left(2^{e^{\prime}}\right)
\end{aligned}
\]

Therefore, if \(e^{\prime \prime}\) is used in the exponent PROMs and \(m^{\prime \prime}\) is used in the mantissa PROMs, a normalized IEEE seed can be generated. The only exception to the formula is that for \(m=1\),
\[
m^{\prime \prime}=2 / m-\text { delta }
\]
where delta \(=2^{(-8)}\).

So \(m^{\prime \prime}=2 / m\), and \(e^{\prime \prime}=(-e)-1\).

Since IEEE exponents are represented in excess 1023 notation, a formula for \(X^{\prime \prime}\) must be determined, given that \(X\) is the IEEE exponent. As an IEEE exponent, \(X=e+1023 \rightarrow e=X-1023\) and \(X^{\prime \prime}=e^{\prime \prime}+1023\). So, for \(X^{\prime \prime}\) in terms of \(X\),
\[
\begin{aligned}
X^{\prime \prime} & =e^{\prime \prime}+1023 \\
& =(-\mathrm{e})-1+1023 \\
& =(-(X-1023))+1022 \\
& =1023-X+1022 \\
& =2045-X
\end{aligned}
\]

So given the 11 bits of \(X\) as address of the seed exponent, the value stored at address \(X\) is
\[
\begin{equation*}
X^{\prime \prime}=2045-X \tag{2}
\end{equation*}
\]
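Formula (2) can be turned directly into the contents of the divide half of the exponent PROMs. The Python sketch below is a minimal illustration, not the manual's own tooling: the function names are hypothetical, and the table is restricted to input exponents for which formula (2) yields a normalized result. The source does not say what is stored at the remaining addresses.

```python
BIAS = 1023  # excess for double-precision exponents


def divide_seed_exponent(x: int) -> int:
    """Biased seed exponent X'' for the biased exponent X of B (formula 2)."""
    return 2045 - x


def build_divide_exponent_table() -> dict[int, int]:
    """Map each input exponent X to the stored seed exponent X''.

    Only X = 1..2044 is covered: for X = 2045 or 2046 the formula gives
    0 or -1, which is not a normalized exponent, and the source does not
    specify those entries.
    """
    return {x: divide_seed_exponent(x) for x in range(1, 2045)}
```

As a check, \(B=8\) has \(X=1026\), so the stored value is \(X^{\prime \prime}=1019\), that is \(e^{\prime \prime}=-4\). With a mantissa seed just below 2, the seed is just below \(2 * 2^{-4}=1/8\).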

Given that the mantissa seed ROM uses 10 bits of the mantissa to determine the seed, each seed Xm will be used for a range of mantissas, Bm to ( \(\mathrm{Bm}+2 *\) delta). From formula (1), the ends of this range map to the seed as follows:
\[
\begin{array}{ll}
2 / \mathrm{Bm} & \rightarrow \mathrm{Xm} \\
2 /(\mathrm{Bm}+2 * \text { delta }) & \rightarrow \mathrm{Xm} \\
\text { where delta }=2^{(-11)} &
\end{array}
\]

This value of delta is used because the actual Xm should be generated by the mantissa in the center of the given range:
\[
\mathrm{Xm}=2 /(\mathrm{Bm}+\text { delta })
\]

This results in a more accurate seed on average. Therefore, the formula used to generate the mantissa part of the seed is
\[
\begin{equation*}
\mathrm{Xm}=2 /\left(\mathrm{Bm}+2^{(-11)}\right) \tag{3}
\end{equation*}
\]
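Formula (3) can be evaluated for each of the 1024 mantissa intervals to fill the divide half of the mantissa PROMs. The following Python sketch makes several assumptions that the text does not state: the eight output bits are taken to be the fraction of Xm with an implicit leading 1, the value is rounded to the nearest step, and the result is clamped so that Xm stays just below 2 (matching the \(m=1\) exception). The names are hypothetical.

```python
DELTA = 2.0 ** -11   # half the width of one 10-bit mantissa interval
SEED_BITS = 8        # output bits of the two mantissa PROMs


def divide_mantissa_seed(index: int) -> int:
    """8-bit seed fraction for the mantissa interval selected by index.

    index is taken as the 10 most significant fraction bits of B, so the
    interval starts at Bm = 1 + index / 1024.
    """
    bm = 1.0 + index / 1024.0
    xm = 2.0 / (bm + DELTA)                              # formula (3)
    fraction = int((xm - 1.0) * (1 << SEED_BITS) + 0.5)  # assumed rounding
    return min(fraction, (1 << SEED_BITS) - 1)           # keep Xm below 2


DIVIDE_MANTISSA_TABLE = [divide_mantissa_seed(i) for i in range(1024)]
```

With these assumptions the first entry is the largest representable fraction, so the seed for \(m=1\) is \(2-2^{-8}\), and the entries decrease as the mantissa grows.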

\section*{Square Root PROMs}

The seed for the square root, \(X_{0}\), is actually the reciprocal of the square root of the data, \(B\):
\[
X_{0}=1 /\left(B^{(1 / 2)}\right)
\]

Given \(B=m *\left(2^{e}\right)\) and \(X_{0}=m^{\prime} *\left(2^{e^{\prime}}\right)\), the expression for \(X_{0}\) can be evaluated by substitution and reduction:
\[
\begin{aligned}
X_{0} & =1 /\left(\left(m *\left(2^{e}\right)\right)^{(1 / 2)}\right) \\
& =1 /\left(m^{(1 / 2)} *\left(2^{(e / 2)}\right)\right) \\
& =m^{(-1 / 2)} *\left(2^{(-e / 2)}\right)
\end{aligned}
\]

Then \(m^{\prime}\) and \(e^{\prime}\) may be written as \(m^{\prime}=m^{(-1 / 2)}\) and \(e^{\prime}=-e / 2\).

Next, it is necessary to verify that the above \(m^{\prime}\) and \(e^{\prime}\) form a valid normalized IEEE number. When \(e\) is an odd number, \(e^{\prime}\) is not an integer and, therefore, it is not a valid IEEE exponent. If the above expression is separated into two cases, \(e^{\prime}\) can be represented in terms of a valid IEEE exponent, \(e^{\prime \prime}\):
\[
\begin{array}{ll}
e^{\prime}=-e / 2 & \text { for e even } \\
e^{\prime}=e^{\prime \prime}+1 / 2 & \text { for } e \text { odd }
\end{array}
\]

Rewriting \(e^{\prime \prime}\) in terms of e produces this expression:
\[
e^{\prime \prime}=e^{\prime}-1 / 2=(-e / 2)-1 / 2 \quad \text { for } e \text { odd }
\]

Then a valid IEEE exponent, \(\mathrm{e}^{\prime \prime}\), can be written for all e as
\[
\begin{array}{ll}
e^{\prime \prime}=-e / 2 & \text { for e even } \\
e^{\prime \prime}=(-e / 2)-1 / 2 & \text { for e odd }
\end{array}
\]

This is equivalent to \(e^{\prime \prime}=\operatorname{int}(-e / 2)\) for all \(e\), where int rounds toward negative infinity. However, the \(1 / 2\) affects the mantissa:
\[
\begin{array}{ll}
X_{0}=m^{\prime} *\left(2^{e^{\prime}}\right) & \\
X_{0}=m^{\prime} *\left(2^{\left(e^{\prime \prime}+1 / 2\right)}\right) & \text { for odd } e \\
X_{0}=m^{\prime} *\left(2^{1 / 2}\right) *\left(2^{e^{\prime \prime}}\right) & \text { for odd } e
\end{array}
\]

Since \(X_{0}=m^{\prime \prime} *\left(2^{e^{\prime \prime}}\right)\), \(m^{\prime \prime}\) can be rewritten as
\[
\begin{array}{ll}
m^{\prime \prime}=m^{\prime} & \text { for even } e \\
m^{\prime \prime}=m^{\prime} *\left(2^{1 / 2}\right) & \text { for odd } e
\end{array}
\]

In terms of \(m\),
\[
\begin{array}{ll}
m^{\prime \prime}=m^{-1 / 2} & \text { for even } e \\
m^{\prime \prime}=\left(m^{-1 / 2}\right) *\left(2^{1 / 2}\right) & \text { for odd } e
\end{array}
\]

Simplifying \(m^{\prime \prime}\) for odd \(e\),
\[
\begin{array}{ll}
m^{\prime \prime}=\left(1 / m^{1 / 2}\right) *\left(2^{1 / 2}\right) & \text { for odd } e \\
m^{\prime \prime}=(2 / m)^{1 / 2} & \text { for odd } e
\end{array}
\]

Just as the divide exponent needed to be converted to excess 1023 notation, so the same must be done for the square root:
\[
\begin{aligned}
& X^{\prime \prime}=e^{\prime \prime}+1023 \\
& X=e+1023 \\
& X^{\prime \prime}=\operatorname{int}(-e / 2)+1023 \\
& X^{\prime \prime}=\operatorname{int}((1023-X) / 2)+1023
\end{aligned}
\]

The IEEE bits for the exponent seed, \(\mathrm{X}^{\prime \prime}\), can be expressed in terms of the IEEE bits for the exponent of \(B, X\) :
\[
X^{\prime \prime}=\operatorname{int}((1023-X) / 2)+1023
\]

Because the formula for \(m^{\prime \prime}\) depends on the least significant bit of \(e\), that bit must be used as an address line to the mantissa PROMs.

Since \(X=e+1023\), an odd value of \(e\) will result in an even value of \(X\), and an even value of \(e\) will result in an odd value of \(X\). Therefore,
\[
\begin{array}{ll}
m^{\prime \prime}=m^{-1 / 2} & \text { for odd } X \\
m^{\prime \prime}=(2 / m)^{1 / 2} & \text { for even } X
\end{array}
\]
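The square root exponent and mantissa formulas can be checked numerically by confirming that \(m^{\prime \prime} *\left(2^{e^{\prime \prime}}\right)\) reproduces \(1 / B^{1 / 2}\). The Python sketch below is an unquantized illustration with hypothetical function names. It takes int() as rounding toward negative infinity, and for even \(X\) it uses \((2 / m)^{1 / 2}\), which is the product \(\left(1 / m^{1 / 2}\right) *\left(2^{1 / 2}\right)\) from the simplification step.

```python
import math

BIAS = 1023  # excess for double-precision exponents


def sqrt_seed(x: int, m: float) -> tuple[int, float]:
    """Return (X'', m'') for B = m * 2**(x - BIAS), with 1 <= m < 2."""
    x_seed = (BIAS - x) // 2 + BIAS     # floor division plays the role of int()
    if x % 2:                           # odd X means even e
        m_seed = m ** -0.5
    else:                               # even X means odd e
        m_seed = math.sqrt(2.0 / m)
    return x_seed, m_seed


def seed_value(x_seed: int, m_seed: float) -> float:
    """Value represented by a biased exponent and a mantissa."""
    return m_seed * 2.0 ** (x_seed - BIAS)
```

For \(B=3\) (\(m=1.5\), \(X=1024\)) this gives \(X^{\prime \prime}=1022\) and a seed value equal to \(1 / 3^{1 / 2}\). A real PROM would hold only an eight-bit approximation of \(m^{\prime \prime}\).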


\section*{SN74ACT8841 Digital Crossbar Switch}

The SN74ACT8841 is a single-chip digital crossbar switch that cost-effectively eliminates bottlenecks to speed data through complex bus architectures.

The 'ACT8841 has 16 four-bit bidirectional ports which can be connected in any conceivable combination. Flow-through data transfer time is 14 ns.

The 'ACT8841 is ideal for multiprocessor applications, where memory bottlenecks tend to occur. For example, four 32-bit buses can be easily connected by two 'ACT8841 devices. System architectures based on the 16-port 'ACT8841 can include up to 16 switching nodes (i.e., processors, memories, or bus interfaces). Larger processor arrays can be built with multistage interconnect schemes.

\title{
SN74ACT8841 DIGITAL CROSSBAR SWITCH
}
- High-Speed Programmable Switch for Parallel Processing Applications
- Dynamically Reconfigurable for Fault-Tolerant Routing
- 64 Bidirectional Data I/Os in 16 Nibble (Four-Bit) Groups
- Data I/O Selection Programmable by Nibble
- Eight Banks of Control Flip-Flops for Storing Configuration Programs
- Two Selectable Hard-Wired Switching Configurations
- Selectable Stored-Data or Real-Time Inputs
- 156-Pin Grid-Array Package
- CMOS \(1-\mu \mathrm{m}\) EPIC\({ }^{\text{TM}}\) Process
- Single 5-V Power Supply

\section*{description}

The SN74ACT8841 is a flexible, high-speed digital crossbar switch. It is easily microprogrammable to support user-definable interconnection patterns. This crossbar switch is especially suited to multiprocessor interconnects that are dynamically reconfigurable or even reprogrammable after each system clock. The 'ACT8841 is built in Texas Instruments advanced \(1-\mu \mathrm{m}\) EPIC\({ }^{\text{TM}}\) CMOS process to enhance performance and reduce power consumption. The switch requires only a \(5-\mathrm{V}\) power supply.

Because the 'ACT8841 is a 16-port device, system architectures based on the 'ACT8841 can include up to 16 switching nodes, which may be processors, data memories, or bus interfaces. Larger processor arrays can be built with multistage interconnection schemes. Most applications will use the crossbar switch as a broadband bus interface controller, for example, between closely coupled processors which must exchange data with very low propagation delays.

The 'ACT8841 has ten selectable control sources, including eight banks of programmable control flip-flops and two hard-wired control circuits. The device can switch from 1 to 16 nibbles ( 4 to 64 bits) of data in a single cycle.

The 64 I/O pins of the 'ACT8841 are arranged in 16 switchable nibbles (see Figure 1). A single input nibble can be broadcast to any combination of 15 output nibbles, or even to 16 nibbles (including itself) if operating off registered data. Multiple input nibbles can be switched to multiple outputs, depending on the programmed configurations available in the control flip-flops.
The digital crossbar switch is intended primarily for multiprocessor interconnection and parallel processing applications. The device can be used to select and transfer data from multiple sources to multiple destinations. Since it can be dynamically reprogrammed, it is suitable for use in reconfigurable networks for fault-tolerant routing.

EPIC is a trademark of Texas Instruments Incorporated

\section*{description (continued)}

The 'ACT8841 and the bipolar SN74AS8840 share the same architecture. Microcode for the 'AS8840 can be run on the 'ACT8841 if the additional control inputs to the 'ACT8841 are properly terminated. However, because the 'ACT8841 is a CMOS device with six additional control inputs, the 'AS8840 and the 'ACT8841 are not socket-compatible and cannot be used interchangeably. A summary of the differences between the SN74AS8840 and the SN74ACT8841 is provided in the 'AS8840 and 'ACT8841 FUNCTIONAL COMPARISON at the end of the data sheet.

The SN74ACT8841 is characterized for operation from \(0^{\circ} \mathrm{C}\) to \(70^{\circ} \mathrm{C}\).
Table 1. 'ACT8841 Pin Grid Allocation
\begin{tabular}{|ll|ll|ll|ll|}
\hline & PIN & & & PIN & & PIN & NAME
\end{tabular}


Table 2. 'ACT8841 Pin Functional Description
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline CNTR0 & J15 & \multirow{16}{*}{I/O} & \multirow{16}{*}{Control I/O. Inputs four control words to the control flip-flops on each CRCLK cycle. As outputs, the same addresses can be used to read the flip-flop settings.} \\
\hline CNTR1 & J14 & & \\
\hline CNTR2 & J13 & & \\
\hline CNTR3 & H15 & & \\
\hline CNTR4 & A9 & & \\
\hline CNTR5 & B9 & & \\
\hline CNTR6 & C9 & & \\
\hline CNTR7 & A8 & & \\
\hline CNTR8 & G1 & & \\
\hline CNTR9 & G2 & & \\
\hline CNTR10 & G3 & & \\
\hline CNTR11 & H1 & & \\
\hline CNTR12 & P7 & & \\
\hline CNTR13 & N7 & & \\
\hline CNTR14 & R7 & & \\
\hline CNTR15 & P8 & & \\
\hline CRADR0 & B7 & \multirow{2}{*}{I} & \multirow{2}{*}{Control register address. Selects 16 bits of control flip-flops as a source/destination for outputs/inputs on CNTR0-CNTR15 (see Table 7).} \\
\hline CRADR1 & A7 & & \\
\hline CRCLK & C8 & I & Control register clock. Clocks CNTR0-CNTR15 into the control flip-flops on low-to-high transition. \\
\hline CREAD0 & N8 & \multirow{3}{*}{I} & \multirow[t]{3}{*}{Selects one of eight banks of control flip-flops to read out on CNTR0-CNTR15 in 16-bit words addressed by CRADR1-CRADR0.} \\
\hline CREAD1 & R8 & & \\
\hline CREAD2 & R9 & & \\
\hline CRSEL0 & G15 & \multirow{4}{*}{I} & \multirow{4}{*}{Selects one of ten control configurations.} \\
\hline CRSEL1 & G14 & & \\
\hline CRSEL2 & G13 & & \\
\hline CRSEL3 & F15 & & \\
\hline CRSRCE & B8 & I & Load source select. When low, selects CNTR inputs; when high, selects DATA inputs. \\
\hline
\end{tabular}


Table 2. 'ACT8841 Pin Functional Description (continued)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline CRWRITE0 & J2 & & \\
\hline CRWRITE1 & J3 & I & Destination select. Selects one of eight control banks (see Table 6). \\
\hline CRWRITE2 & K1 & & \\
\hline D0 & N10 & & \\
\hline D1 & R11 & & \\
\hline D2 & P11 & & \\
\hline D3 & N11 & & \\
\hline D4 & P12 & & \\
\hline D5 & R13 & & \\
\hline D6 & N12 & & \\
\hline D7 & P13 & & \\
\hline D8 & N14 & & \\
\hline D9 & N15 & & \\
\hline D10 & M14 & & \\
\hline D11 & M15 & & \\
\hline D12 & L14 & & \\
\hline D13 & L15 & & \\
\hline D14 & K14 & & \\
\hline D15 & K13 & & \\
\hline D16 & F13 & I/O & I/O data bits 0 through 31 (data bits 0 through 31 are the least significant half). \\
\hline D17 & E15 & & \\
\hline D18 & E14 & & \\
\hline D19 & D15 & & \\
\hline D20 & D14 & & \\
\hline D21 & C15 & & \\
\hline D22 & D13 & & \\
\hline D23 & C14 & & \\
\hline D24 & B13 & & \\
\hline D25 & A13 & & \\
\hline D26 & B12 & & \\
\hline D27 & A12 & & \\
\hline D28 & B11 & & \\
\hline D29 & A11 & & \\
\hline D30 & B10 & & \\
\hline D31 & C10 & & \\
\hline D32 & C6 & \multirow{4}{*}{I/O} & \multirow{4}{*}{I/O data bits 32 through 35 (data bits 32 through 63 are the most significant half).} \\
\hline D33 & A5 & & \\
\hline D34 & B5 & & \\
\hline D35 & A4 & & \\
\hline
\end{tabular}

Table 2. 'ACT8841 Pin Functional Description (continued)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline D36 & B4 & \multirow{28}{*}{I/O} & \multirow{28}{*}{I/O data bits 36 through 63 (data bits 32 through 63 are the most significant half).} \\
\hline D37 & A3 & & \\
\hline D38 & C4 & & \\
\hline D39 & B3 & & \\
\hline D40 & C2 & & \\
\hline D41 & C1 & & \\
\hline D42 & D2 & & \\
\hline D43 & D1 & & \\
\hline D44 & E2 & & \\
\hline D45 & E1 & & \\
\hline D46 & F2 & & \\
\hline D47 & F3 & & \\
\hline D48 & K3 & & \\
\hline D49 & L1 & & \\
\hline D50 & L2 & & \\
\hline D51 & M1 & & \\
\hline D52 & M2 & & \\
\hline D53 & N1 & & \\
\hline D54 & M3 & & \\
\hline D55 & N2 & & \\
\hline D56 & P3 & & \\
\hline D57 & R3 & & \\
\hline D58 & P4 & & \\
\hline D59 & R4 & & \\
\hline D60 & P5 & & \\
\hline D61 & R5 & & \\
\hline D62 & P6 & & \\
\hline D63 & N6 & & \\
\hline GND & A1 & & \multirow{14}{*}{Ground (all pins must be used).} \\
\hline GND & A2 & & \\
\hline GND & A14 & & \\
\hline GND & A15 & & \\
\hline GND & B1 & & \\
\hline GND & B2 & & \\
\hline GND & B14 & & \\
\hline GND & B15 & & \\
\hline GND & C3 & & \\
\hline GND & C13 & & \\
\hline GND & D7 & & \\
\hline GND & D9 & & \\
\hline GND & G4 & & \\
\hline GND & G12 & & \\
\hline
\end{tabular}

Table 2. 'ACT8841 Pin Functional Description (continued)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline GND & J4 & & \\
\hline GND & J12 & & \\
\hline GND & M7 & & \\
\hline GND & M10 & & \\
\hline GND & N3 & & \\
\hline GND & N13 & & \\
\hline GND & P1 & & \\
\hline GND & P2 & & Ground (all pins must be used). \\
\hline GND & P14 & & \\
\hline GND & P15 & & \\
\hline GND & R1 & & \\
\hline GND & R2 & & \\
\hline GND & R14 & & \\
\hline GND & R15 & & \\
\hline LSCLK & H13 & I & Clocks the least significant half of data inputs into the input registers on a low-to-high transition. \\
\hline MSCLK & H3 & I & Clocks the most significant half of data inputs into the input registers on a low-to-high transition. \\
\hline \(\overline{\mathrm{OEC}}\) & J1 & I & Output enable for control flip-flops, active low \\
\hline \(\overline{\mathrm{OED0}}\) & P10 & \multirow{16}{*}{I} & \multirow{16}{*}{Output enables for data nibbles, active low} \\
\hline \(\overline{\mathrm{OED1}}\) & R12 & & \\
\hline \(\overline{\mathrm{OED2}}\) & L13 & & \\
\hline \(\overline{\mathrm{OED3}}\) & K15 & & \\
\hline \(\overline{\mathrm{OED4}}\) & F14 & & \\
\hline \(\overline{\mathrm{OED5}}\) & E13 & & \\
\hline \(\overline{\mathrm{OED6}}\) & C11 & & \\
\hline \(\overline{\mathrm{OED7}}\) & A10 & & \\
\hline \(\overline{\mathrm{OED8}}\) & B6 & & \\
\hline \(\overline{\mathrm{OED9}}\) & C5 & & \\
\hline \(\overline{\mathrm{OED10}}\) & E3 & & \\
\hline \(\overline{\mathrm{OED11}}\) & F1 & & \\
\hline \(\overline{\mathrm{OED12}}\) & K2 & & \\
\hline \(\overline{\mathrm{OED13}}\) & L3 & & \\
\hline \(\overline{\mathrm{OED14}}\) & N5 & & \\
\hline \(\overline{\mathrm{OED15}}\) & R6 & & \\
\hline
\end{tabular}

Table 2. 'ACT8841 Pin Functional Description (concluded)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline SELDLS & H14 & I & When low, selects the stored, least significant data input to the main internal bus. When high, real-time data is selected. \\
\hline SELDMS & H2 & I & When low, selects the stored, most significant data input to the main internal bus. When high, real-time data is selected. \\
\hline TP0 & P9 & \multirow{2}{*}{I} & \multirow{2}{*}{Test pins. High during normal operation (see Table 8).} \\
\hline TP1 & R10 & & \\
\hline \(V_{\mathrm{CC}}\) & C7 & & \multirow{10}{*}{5-V supply} \\
\hline \(V_{\mathrm{CC}}\) & C12 & & \\
\hline \(V_{\mathrm{CC}}\) & D3 & & \\
\hline \(V_{\mathrm{CC}}\) & D8 & & \\
\hline \(V_{\mathrm{CC}}\) & H4 & & \\
\hline \(V_{\mathrm{CC}}\) & H12 & & \\
\hline \(V_{\mathrm{CC}}\) & M8 & & \\
\hline \(V_{\mathrm{CC}}\) & M13 & & \\
\hline \(V_{\mathrm{CC}}\) & N4 & & \\
\hline \(V_{\mathrm{CC}}\) & N9 & & \\
\hline \(\overline{\mathrm{WE}}\) & A6 & I & Write enable for control flip-flops, active low \\
\hline
\end{tabular}

\section*{overview}

The 64 I/O pins of the 'ACT8841 are arranged in 16 nibble (four-bit) groups where each set of four pins serves as bidirectional inputs to and outputs from a nibble multiplexer. During a switching operation, each nibble passes four bits of either stored or real-time data to the main internal 64-bit data bus. Each output multiplexer will independently select one of the 16 nibbles from this 64 -bit data bus.

Data nibbles are organized into two groups: the least significant half (D31-D0) and the most significant half (D63-D32). Stored versus real-time data inputs can be selected separately for the LSH and the MSH. Two clock inputs, LSCLK and MSCLK, are available to latch LSH and MSH data inputs, respectively, into the data register.

The pattern of output nibbles resulting from the switching operation is determined by a selectable control source, either one of eight banks of programmable control flip-flops or one of two hard-wired switching configurations. Inputs to the control flip-flops can be loaded either from the data bus or from control I/Os. A separate clock (CRCLK) is provided for loading the banks of control flip-flops.

\section*{logic symbol}


FIGURE 1
Texas
INSTRUMENTS


\section*{architecture}

The 'ACT8841 digital crossbar switch has its 64 data I/Os arranged in 16 multiplexer logic blocks, as shown in Figure 2. Each nibble multiplexer logic block handles four bits of real-time input and four bits of stored-data input, and either input can be passed to the common data bus.

Two input multiplexer controls are provided to select between stored and real-time inputs. SELDLS controls input data selection for the LSH (D31-D0) of the 64-bit data input, and SELDMS for the MSH (D63-D32). The input register clocks, LSCLK and MSCLK, are grouped in the same way and are used to clock data into the registers in the multiplexer logic blocks. The 16 data input nibbles make up the 64 data bits on the internal main bus.

This common bus supplies 16 data nibbles to a 16-to-1 output multiplexer in each multiplexer logic block (see Figure 3). As determined by one of ten selectable control sources, the 16-to-1 output multiplexer selects a data nibble to send to the outputs via the three-state output driver.

Control of the input and output multiplexers determines the input-to-output pattern for the entire crossbar switch. Many different switching combinations can be set up by programming the control flip-flop configurations to determine the outputs from the 16-to-1 multiplexers.

For example, the switch can be programmed to broadcast one data input nibble through the other 15 nibbles (60 outputs). Conversely, a 15-to-1 nibble multiplexer can be configured by programming the switch to select and output a single data nibble from the 64-bit bus. Several examples are described in more detail in a later section.

\section*{functional block diagram}


FIGURE 2





\section*{multiplexer logic group}

There are 16 multiplexer logic blocks, one for each nibble. External data flows from four data I/O pins into a logic block. A block diagram of the multiplexer logic is shown in Figure 3. The data inputs are either clocked into the data register or passed directly to the main internal bus. The 64 bits of data from the main bus are presented to a 16-to-1 multiplexer, which selects the data nibble output.

Each of the 16 nibble multiplexer logic blocks contains eight control flip-flop (CF) groups, one for each of the control banks. A control bank stores one complete switching configuration. Each CF group consists of four D-type edge-triggered flip-flops. In Figure 3, the CF groups are shown as CFXX0 to CFXX7, where \(XX\) indicates the number of the nibble multiplexer logic group (\(0 \leq XX \leq 15\)). CFXX0 represents the 16 CF groups (one from each logic block) which make up flip-flop control bank 0, CFXX1 the 16 CF groups in bank 1, etc.

In addition to the eight banks of programmable flip-flops, two hard-wired switching configurations can be selected. The MSH/LSH exchange directs the input nibbles from each half of the switch to the data outputs directly opposite. This switching pattern is shown in Table 3 below. For example, data input on D11-D8 is output on D43-D40, and data input on D43-D40 is output on D11-D8.

Table 3. MSH/LSH Exchange
\begin{tabular}{|ccc|}
\hline LSH & & MSH \\
\hline D3-D0 & \(\longleftrightarrow\) & D35-D32 \\
D7-D4 & \(\longleftrightarrow\) & D39-D36 \\
D11-D8 & \(\longleftrightarrow\) & D43-D40 \\
D15-D12 & \(\longleftrightarrow\) & D47-D44 \\
D19-D16 & \(\longleftrightarrow\) & D51-D48 \\
D23-D20 & \(\leftrightarrow\) & D55-D52 \\
D27-D24 & \(\longleftrightarrow\) & D59-D56 \\
D31-D28 & \(\longleftrightarrow\) & D63-D60 \\
\hline
\end{tabular}

The second hard-wired configuration, a read-back function, causes all 64 bits to be output on the same I/Os on which they were input. Neither of the hard-wired control configurations affects the contents of the control banks.

The control source select, CRSEL3-CRSEL0, determines which switching pattern is selected, as shown in Table 4.

Table 4. 16-to-1 Output Multiplexer Control Source Selects
\begin{tabular}{|cccc|ll|}
\hline CRSEL3 & CRSEL2 & CRSEL1 & CRSEL0 & \multicolumn{2}{c|}{CONTROL SOURCE SELECTED} \\
\hline L & L & L & L & Control bank 0 & (programmable) \\
L & L & L & H & Control bank 1 & (programmable) \\
L & L & H & L & Control bank 2 & (programmable) \\
L & L & H & H & Control bank 3 & (programmable) \\
L & H & L & L & Control bank 4 & (programmable) \\
L & H & L & H & Control bank 5 & (programmable) \\
L & H & H & L & Control bank 6 & (programmable) \\
L & H & H & H & Control bank 7 & (programmable) \\
H & X & X & L & \multicolumn{2}{l|}{MSH/LSH exchange*} \\
H & X & X & H & \multicolumn{2}{l|}{Read-back (output echoes input)*} \\
\hline
\end{tabular}

\footnotetext{
*Hard-wired switching configuration
\(X=\) don't care
}
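The selection rules of Tables 3 and 4 can be summarized as a small decoding function. The Python sketch below is a behavioral illustration only, not a description of the device logic: the function name and the representation of a bank as a list of 16 source-nibble numbers are assumptions, with nibble \(k\) standing for pins D(4k+3)-D(4k).

```python
def effective_control_words(crsel: int, banks: list[list[int]]) -> list[int]:
    """Return, for each of the 16 output nibbles, the input nibble it selects.

    crsel is the 4-bit value CRSEL3-CRSEL0; banks holds eight lists of
    sixteen 4-bit control words, one list per programmable bank.
    """
    if crsel & 0b1000:                      # CRSEL3 high: hard-wired source
        if crsel & 0b0001:                  # CRSEL0 high: read-back
            return list(range(16))
        return [n ^ 8 for n in range(16)]   # MSH/LSH exchange (Table 3)
    return list(banks[crsel & 0b0111])      # programmable control bank 0-7
```

In the exchange case, output nibble 2 (D11-D8) selects input nibble 10 (D43-D40), matching the example given with Table 3. CRSEL2 and CRSEL1 are ignored whenever CRSEL3 is high.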

POST OFFICE BOX 655012 • DALLAS, TEXAS 75265

\section*{control words}

A CF group can store a four-bit control word (CFN3-CFN0) to select the output of the 16-to-1 multiplexer for that nibble port. One control word is loaded in each CF group. A total of 16 words, one per multiplexer logic block, are loaded in a bank to configure one complete switching pattern. Table 5 lists the control words and the input data each selects.

Each control word can be stored in a CF group and sent as an internal control signal to select the output of a 16-to-1 multiplexer in a nibble logic block. For example, any CF group loaded with the word 'LHHH' will select the data input on D31-D28 as the outputs of the associated nibble. If all 16 CF groups in a bank were loaded with 'LHHH', the same output (D31-D28) would be selected by the entire switch.

Table 5. 16-to-1 Output Multiplexer Control Words
\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{4}{|c|}{INTERNAL SIGNALS} & \multirow[t]{2}{*}{INPUT DATA SELECTED AS MULTIPLEXER OUTPUT} \\
\hline CFN3 & CFN2 & CFN1 & CFN0 & \\
\hline L & L & L & L & D3-D0 \\
\hline L & L & L & H & D7-D4 \\
\hline L & L & H & L & D11-D8 \\
\hline L & L & H & H & D15-D12 \\
\hline L & H & L & L & D19-D16 \\
\hline L & H & L & H & D23-D20 \\
\hline L & H & H & L & D27-D24 \\
\hline L & H & H & H & D31-D28 \\
\hline H & L & L & L & D35-D32 \\
\hline H & L & L & H & D39-D36 \\
\hline H & L & H & L & D43-D40 \\
\hline H & L & H & H & D47-D44 \\
\hline H & H & L & L & D51-D48 \\
\hline H & H & L & H & D55-D52 \\
\hline H & H & H & L & D59-D56 \\
\hline H & H & H & H & D63-D60 \\
\hline
\end{tabular}
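The effect of a bank of control words on the data path can be modeled in a few lines. The following Python sketch is a simplified behavioral model under stated assumptions: the name `crossbar_output` is hypothetical, all output enables are treated as active, and the stored versus real-time input selection is ignored. Control word value \(w\) (CFN3-CFN0 read as a binary number) selects input nibble \(w\), as listed in Table 5.

```python
def crossbar_output(data_in: int, control_words: list[int]) -> int:
    """Route nibbles of a 64-bit word according to 16 control words.

    control_words[k] is the 4-bit word of output nibble k and names the
    input nibble driven onto pins D(4k+3)-D(4k).
    """
    data_out = 0
    for k, word in enumerate(control_words):
        nibble = (data_in >> (4 * (word & 0xF))) & 0xF   # 16-to-1 selection
        data_out |= nibble << (4 * k)
    return data_out
```

Loading every CF group with 'LHHH' (value 7) copies the nibble on D31-D28 to all 16 output nibbles, as in the example above.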

\section*{loading control configurations}

CRWRITE2-CRWRITE0 select which control bank is being loaded, as shown in Table 6.

Table 6. Control Flip-Flops Load Destination Select
\begin{tabular}{|ccc|c|}
\hline CRWRITE2 & CRWRITE1 & CRWRITE0 & DESTINATION \\
\hline L & L & L & Control bank 0 \\
L & L & H & Control bank 1 \\
L & H & L & Control bank 2 \\
L & H & H & Control bank 3 \\
H & L & L & Control bank 4 \\
H & L & H & Control bank 5 \\
H & H & L & Control bank 6 \\
H & H & H & Control bank 7 \\
\hline
\end{tabular}


The control words for a bank can be loaded either 16 bits at a time on the control I/O pins (CNTR15-CNTR0) or all 64 bits at once on the data inputs (D63-D0). If the control load source select, CRSRCE, is high, the words are loaded from the data inputs. When CRSRCE \(=\mathrm{L}\), the CNTR inputs are used.

When a control bank is loaded from the data inputs, \(\overline{\mathrm{WE}}\), CRSRCE, CRWRITE2-CRWRITE0, and the control register clock CRCLK are used in combination to load all 16 control words (64 bits) in a single cycle. An MSH/LSH exchange like that shown in Table 3 is used to load the flip-flops on a rising CRCLK clock edge. For example, data inputs D3-D0 go to the data bus and then to the CF group that selects the data outputs for D35-D32. CRWRITE2-CRWRITE0 select the control bank that is loaded (see Table 6).
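Because of the exchange on the load path, the 64-bit word presented on the data inputs must carry each control word in the opposite half from the nibble it controls. The Python sketch below illustrates that packing. The function name is hypothetical, and the nibble numbering (nibble \(k\) is D(4k+3)-D(4k)) is an assumption consistent with Table 5.

```python
def bank_load_word(control_words: list[int]) -> int:
    """64-bit value to present on D63-D0 so that CF group k receives
    control_words[k] when a bank is loaded from the data inputs."""
    word = 0
    for k, cw in enumerate(control_words):
        source_nibble = k ^ 8               # MSH/LSH exchange on the load path
        word |= (cw & 0xF) << (4 * source_nibble)
    return word
```

For instance, a control word meant for the CF group of output nibble D35-D32 is placed on data inputs D3-D0, in line with the example in the paragraph above.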

The CNTR15-CNTR0 inputs can also be used to load the control banks. The bank is selected by CRWRITE2-CRWRITE0 (see Table 6). Four control words per CRCLK cycle can be input to the CF groups (CFXX) that make up the bank. The CF groups loaded are selected by CRADR1-CRADR0, as shown in Table 7. Four CRCLK cycles are needed to load an entire control bank.

Table 7. Loading Control Flip-Flops from CNTR I/Os
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{CRADR1} & \multirow[t]{2}{*}{CRADR0} & \multirow[t]{2}{*}{\(\overline{\mathrm{WE}}\)} & \multirow[t]{2}{*}{CRCLK} & \multicolumn{4}{|l|}{CF GROUPS LOADED BY CONTROL (CNTR) I/O NUMBERS} \\
\hline & & & & 15-12 & 11-8 & 7-4 & 3-0 \\
\hline L & L & L & \(\uparrow\) & CF12 & CF8 & CF4 & CF0 \\
\hline L & H & L & \(\uparrow\) & CF13 & CF9 & CF5 & CF1 \\
\hline H & L & L & \(\uparrow\) & CF14 & CF10 & CF6 & CF2 \\
\hline H & H & L & \(\uparrow\) & CF15 & CF11 & CF7 & CF3 \\
\hline X & X & H & X & \multicolumn{4}{|c|}{Inhibit write to flip-flops} \\
\hline
\end{tabular}
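The grouping in Table 7 can be expressed as a packing routine that produces the four 16-bit values to place on CNTR15-CNTR0, one per CRADR1-CRADR0 address. This Python sketch is illustrative only. The function name is hypothetical, and it assumes that within each four-bit field the control word bits CFN3-CFN0 appear most significant bit first.

```python
def cntr_load_words(control_words: list[int]) -> list[int]:
    """Four 16-bit CNTR15-CNTR0 values, indexed by CRADR1-CRADR0."""
    words = []
    for cradr in range(4):
        value = 0
        for field in range(4):              # CNTR bits 4*field+3 .. 4*field
            cf_group = cradr + 4 * field    # CF0/4/8/12, CF1/5/9/13, ...
            value |= (control_words[cf_group] & 0xF) << (4 * field)
        words.append(value)
    return words
```

Each returned value would be clocked in on one rising CRCLK edge with \(\overline{\mathrm{WE}}\) low, so a full bank takes four cycles.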

To read out the control settings, the same address signals can be used, except that no CRCLK signal is needed and \(\overline{\mathrm{OEC}}\) is pulled low. CREAD2-CREAD0 select the bank to be read; the format is the same as for CRWRITE2-CRWRITE0, shown in Table 6.

Using the control I/Os to read the control bank settings can be valuable during debugging or diagnostics. Control settings are volatile and will be lost if the 'ACT8841 is powered off. An external program controlling switch operation may need to read the control bank settings so that it can save and restore the current switching configurations.

\section*{test pins}

TP1-TP0 test pins are provided for system testing. As Table 8 shows, these pins should be maintained high during normal operation. To force all outputs and I/Os low, low signals are placed on TP1-TP0 and all output enables (\(\overline{\mathrm{OED}}15\)-\(\overline{\mathrm{OED}}0\) and \(\overline{\mathrm{OEC}}\)). To force all outputs and I/Os high, TP1 and all output enables are pulled low, and TP0 is driven high. When TP0 is left low and a high signal is placed on TP1, all outputs on the 'ACT8841 are placed in a high-impedance state, isolating the chip from the rest of the system.

Table 8. Test Pin Inputs
\begin{tabular}{|cc|c|c|l|}
\hline TP1 & TP0 & \begin{tabular}{c}
\(\overline{\text{OED}}\)15- \\
\(\overline{\text{OED}}\)0
\end{tabular} & \(\overline{\text{OEC}}\) & \multicolumn{1}{|c|}{ RESULT } \\
\hline L & L & L & L & All outputs and I/Os forced low \\
L & H & L & L & All outputs and I/Os forced high \\
H & L & X & X & All outputs placed in a high-impedance state \\
H & H & X & X & Normal operation (default state) \\
\hline
\end{tabular}
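
For test-fixture software, Table 8 can be expressed as a small lookup. The following is a minimal sketch; the function name `decode_test_pins` and its boolean arguments are hypothetical, and combinations that Table 8 does not list are reported as unspecified rather than guessed.

```python
def decode_test_pins(tp1: bool, tp0: bool,
                     oed_all_low: bool = False, oec_low: bool = False) -> str:
    """Return the Table 8 result for the given TP1/TP0 and output-enable levels.

    True means a high level on TP1/TP0; oed_all_low and oec_low are True
    when the (active-low) output enables are driven low.
    """
    if tp1:
        # With TP1 high the output enables are don't-cares in Table 8.
        return "normal operation" if tp0 else "all outputs high-impedance"
    if oed_all_low and oec_low:
        return ("all outputs and I/Os forced high" if tp0
                else "all outputs and I/Os forced low")
    return "not specified in Table 8"
```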


\section*{examples}

Most 'ACT8841 switch configurations are straightforward to program, involving few control signals and procedures to set up the control words in the banks of flip-flops. Control signals and procedures for loading and using control words are shown in the following examples.

\section*{broadcasting a nibble}

Any of the 16 data input nibbles can be broadcast to the other 15 data nibbles for output. For ease of presentation, input nibble D63-D60 is used in this example. Example 1 presents the microcode sequence for loading flip-flop bank 0 and executing the nibble broadcast.

The low signal on CRSRCE selects CNTR15-CNTR0 as the input source, and the low signals on CRWRITE2-CRWRITE0 select flip-flop bank 0 as the destination. Table 5 shows that to select data on D63-D60 as the output nibble, the four bits in the control word CFN3-CFN0 must be high; therefore the CNTR15-CNTR0 inputs are coded high. The four microcode instructions shown in Example 1 load the same control word from CNTR15-CNTR0 into all 16 CF groups of bank 0.

Once the control flip-flops have been loaded, the switch can be used to broadcast nibble D63-D60 as programmed. The microcode instruction to execute the broadcast is shown as the last instruction in Example 1. \(\overline{\text{WE}}\) is held high and the data to be broadcast is input on D63-D60. The high signal on SELDMS selects a real-time data input for the broadcast. MSCLK and LSCLK (not shown) can be used to load the input registers if the input nibble is to be retained. No register clock signals are needed if the input data is not being stored.

The banks of control flip-flops not selected as a control source can be loaded with new control words or read out on CNTR15-CNTR0 while the switch is operating. For example, the MSH data inputs can be used to load flip-flop bank 1 of the LSH while bank 0 of the LSH is controlling data I/O.

Example 1. Programming a Nibble Broadcast
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow{2}{*}{INST. NO.} & \multirow{2}{*}{CRSRCE} & \multirow{2}{*}{CRWRITE2} & \multirow{2}{*}{CRWRITE1} & \multirow{2}{*}{CRWRITE0} & \multirow{2}{*}{CRADR1} & \multirow{2}{*}{CRADR0} & \multicolumn{4}{|c|}{CNTR I/O NUMBERS} & \multirow{2}{*}{CRSEL3} & \multirow{2}{*}{CRSEL2} & \multirow{2}{*}{CRSEL1} & \multirow{2}{*}{CRSEL0} & \multirow{2}{*}{\(\overline{\text{WE}}\)} & \multirow{2}{*}{SELDMS} & \multirow{2}{*}{SELDLS} & \multicolumn{4}{|c|}{\multirow{2}{*}{\(\overline{\text{OED}}\)15-\(\overline{\text{OED}}\)0}} & \multirow{2}{*}{\(\overline{\text{OEC}}\)} & \multirow{2}{*}{CRCLK} \\
\hline & & & & & & & 15-12 & 11-8 & 7-4 & 3-0 & & & & & & & & & & & & & \\
\hline 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1111 & 1111 & 1111 & 1111 & X & X & X & X & 0 & X & X & XXXX & XXXX & XXXX & XXXX & 1 & \(\uparrow\) \\
\hline 2 & 0 & 0 & 0 & 0 & 0 & 1 & 1111 & 1111 & 1111 & 1111 & X & X & X & X & 0 & X & X & XXXX & XXXX & XXXX & XXXX & 1 & \(\uparrow\) \\
\hline 3 & 0 & 0 & 0 & 0 & 1 & 0 & 1111 & 1111 & 1111 & 1111 & X & X & X & X & 0 & X & X & XXXX & XXXX & XXXX & XXXX & 1 & \(\uparrow\) \\
\hline 4 & 0 & 0 & 0 & 0 & 1 & 1 & 1111 & 1111 & 1111 & 1111 & X & X & X & X & 0 & X & X & XXXX & XXXX & XXXX & XXXX & 1 & \(\uparrow\) \\
\hline 5 & X & X & X & X & X & X & XXXX & XXXX & XXXX & XXXX & 0 & 0 & 0 & 0 & 1 & 1 & X & 1000 & 0000 & 0000 & 0000 & 1 & None \\
\hline
\end{tabular}
\begin{tabular}{|c|l|}
\multicolumn{2}{c|}{ Comments } \\
\hline INST. NO. & \multicolumn{1}{|c|}{ COMMENT } \\
\hline 1 & Loads CF12, CF8, CF4, CF0 of bank 0 \\
2 & Loads CF13, CF9, CF5, CF1 of bank 0 \\
3 & Loads CF14, CF10, CF6, CF2 of bank 0 \\
4 & Loads CF15, CF11, CF7, CF3 of bank 0 \\
5 & Selects bank 0 for switching control \\
 & Selects real-time data inputs \\
\hline
\end{tabular}
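
The five instructions of Example 1 can also be generated programmatically. The sketch below is illustrative only: the function name `broadcast_microcode` and the dictionary field names are hypothetical, `None` stands for a don't-care (X), and three points are assumptions rather than statements from this data sheet: that the control word equals the index of the source nibble (consistent with 1111 selecting D63-D60), that CRSEL3 = 0 with CRSEL2-CRSEL0 equal to the bank number selects a bank as control source, and that a source nibble in the LSH would use SELDLS in the way Example 1 uses SELDMS.

```python
def broadcast_microcode(source_nibble: int = 15, bank: int = 0) -> list[dict]:
    """Build four load instructions and one execute instruction for a nibble broadcast."""
    if not 0 <= source_nibble <= 15:
        raise ValueError("source_nibble must be 0..15")
    if not 0 <= bank <= 7:
        raise ValueError("bank must be 0..7")

    # Assumption: control word = source nibble index; repeat it in all four CNTR nibbles.
    cntr = source_nibble * 0x1111

    program = []
    for cradr in range(4):  # four CRCLK cycles, one per CRADR1-CRADR0 address
        program.append({
            "CRSRCE": 0, "CRWRITE": bank, "CRADR": cradr, "CNTR": cntr,
            "WE_n": 0, "OEC_n": 1, "CRCLK": "rising",
        })

    in_msh = source_nibble >= 8
    program.append({
        "CRSEL": bank,                      # assumption: CRSEL3 = 0, CRSEL2-CRSEL0 = bank
        "WE_n": 1,
        "SELDMS": 1 if in_msh else None,    # real-time input for the half holding the source
        "SELDLS": None if in_msh else 1,
        "OED_n": 1 << source_nibble,        # only the source nibble's outputs are disabled
        "OEC_n": 1, "CRCLK": None,
    })
    return program
```

With the default arguments the generated fields agree with the rows of Example 1: CNTR15-CNTR0 all high for the four load cycles, and only \(\overline{\text{OED}}\)15 high in the execute instruction.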

\section*{programming an MSH/LSH exchange}

A second, more complicated example involves programming the switch to swap corresponding nibbles between the MSH and the LSH (first nibble in the LSH for first nibble in the MSH, and so on). This swap can be implemented using the hard-wired logic circuit selected when CRSEL3 is high and CRSEL0 is low. Programming this swap without using the MSH/LSH exchange logic requires loading a different control word into each mux logic block. This is described below for purposes of illustration.

Each nibble in one half, either LSH or MSH, selects as output the registered data from the corresponding nibble in the other half. The registered data from D35-D32 is to be output on D3-D0, the registered data from D3-D0 is output on D35-D32, and so on for the remaining nibbles. As shown in Table 4, the flip-flops for D3-D0 have to be set to 1000 and the flip-flops for D35-D32 must all be low (0000). The CF groups and control words involved in this switching pattern are listed in Table 9.

Table 9. Control Words for an MSH/LSH Exchange
\begin{tabular}{|c|c|c|c|}
\hline CF GROUP & CNTR INPUTS TO LOAD FLIP-FLOPS & CONTROL WORD LOADED & RESULTS \\
\hline CF15 & \multirow{4}{*}{CNTR15-CNTR12} & 0111 & D31-D28 \(\rightarrow\) D63-D60 \\
CF14 & & 0110 & D27-D24 \(\rightarrow\) D59-D56 \\
CF13 & & 0101 & D23-D20 \(\rightarrow\) D55-D52 \\
CF12 & & 0100 & D19-D16 \(\rightarrow\) D51-D48 \\
\hline CF11 & \multirow{4}{*}{CNTR11-CNTR8} & 0011 & D15-D12 \(\rightarrow\) D47-D44 \\
CF10 & & 0010 & D11-D8 \(\rightarrow\) D43-D40 \\
CF9 & & 0001 & D7-D4 \(\rightarrow\) D39-D36 \\
CF8 & & 0000 & D3-D0 \(\rightarrow\) D35-D32 \\
\hline CF7 & \multirow{4}{*}{CNTR7-CNTR4} & 1111 & D63-D60 \(\rightarrow\) D31-D28 \\
CF6 & & 1110 & D59-D56 \(\rightarrow\) D27-D24 \\
CF5 & & 1101 & D55-D52 \(\rightarrow\) D23-D20 \\
CF4 & & 1100 & D51-D48 \(\rightarrow\) D19-D16 \\
\hline CF3 & \multirow{4}{*}{CNTR3-CNTR0} & 1011 & D47-D44 \(\rightarrow\) D15-D12 \\
CF2 & & 1010 & D43-D40 \(\rightarrow\) D11-D8 \\
CF1 & & 1001 & D39-D36 \(\rightarrow\) D7-D4 \\
CF0 & & 1000 & D35-D32 \(\rightarrow\) D3-D0 \\
\hline
\end{tabular}

With this list of control words and the signals in Table 7, the 16-bit control inputs on CNTR15-CNTR0 can be arranged to load the control flip-flops in four cycles. Example 2 shows the microcode instructions for loading the control words and executing the exchange.

In Example 2, bank 7 of flip-flops is being programmed. Bank 7 is selected by taking CRWRITE2-CRWRITE0 high and leaving CRSRCE low (see Table 6) when the control words are loaded on CNTR15-CNTR0. With \(\overline{\text{WE}}\) held low, the CRCLK is used to load the four sets of control words. Once the flip-flops are loaded, data can be input on D63-D0 and the programmed pattern of output selection can be executed. A microinstruction to select registered data inputs and bank 7 as the control source is shown as the last instruction in Example 2. The data must be clocked into the input registers, using LSCLK and MSCLK, before the last instruction is executed.
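
The arrangement of the Table 9 control words into four CNTR15-CNTR0 load words can be worked out as follows. This is a minimal sketch with hypothetical function names; it relies only on Table 9 (each CF group selects the nibble eight positions away, which is the group number with its top bit inverted) and on the group ordering of Table 7.

```python
def exchange_control_words() -> list[int]:
    """Control word for each CF group (list index 0-15) in an MSH/LSH exchange."""
    # Per Table 9, CF group g holds g with its most significant bit inverted.
    return [g ^ 0b1000 for g in range(16)]


def cntr_load_words(control_words: list[int]) -> list[int]:
    """Arrange 16 control words into the four CNTR15-CNTR0 words of Table 7."""
    words = []
    for cradr in range(4):
        word = 0
        # Order of groups on CNTR15-12, CNTR11-8, CNTR7-4, CNTR3-0.
        for group in (12 + cradr, 8 + cradr, 4 + cradr, cradr):
            word = (word << 4) | control_words[group]
        words.append(word)
    return words


if __name__ == "__main__":
    for cradr, word in enumerate(cntr_load_words(exchange_control_words())):
        print(f"CRADR1-CRADR0 = {cradr:02b}   CNTR15-CNTR0 = {word:016b}")
```

The four words produced are 0100 0000 1100 1000, 0101 0001 1101 1001, 0110 0010 1110 1010, and 0111 0011 1111 1011, the same patterns that appear on the CNTR outputs in Example 5.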

The control flip-flops could also have been loaded from the data input nibbles in one CRCLK cycle. Input nibbles from one half are mapped onto the control flip-flops of the other half. All control words to set up a switching pattern should be loaded before the bank of flip-flops is selected as control source. The microcode instructions to load bank 1 with the 16 control words in one cycle are presented in Example 3.

Example 3. Loading the MSH/LSH Exchange from Data Inputs
\begin{tabular}{|c|ccc|c|cc|c|c|}
\hline CRSRCE & CRWRITE2 & CRWRITE1 & CRWRITE0 & \(\overline{\text{WE}}\) & SELDMS & SELDLS & \(\overline{\text{OED}}\)15-\(\overline{\text{OED}}\)0 & CRCLK \\
\hline 1 & 0 & 0 & 1 & 0 & 1 & 1 & 1111111111111111 & \(\uparrow\) \\
\hline
\end{tabular}
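
The 64-bit word to place on D63-D0 for a single-cycle load can be derived from the 16 control words. The sketch below assumes that the half-to-half mapping applies nibble by nibble, so that data nibble n loads the CF group eight positions away (as in the D3-D0 to D35-D32 case described earlier); the function name `data_word_for_bank` is hypothetical.

```python
def data_word_for_bank(control_words: list[int]) -> int:
    """Return the 64-bit D63-D0 value that loads the given 16 control words.

    control_words[g] is the 4-bit word intended for CF group g.
    Assumption: CF group g is loaded from data nibble g XOR 8.
    """
    if len(control_words) != 16:
        raise ValueError("expected 16 control words")
    word = 0
    for group, cw in enumerate(control_words):
        data_nibble = group ^ 0b1000          # nibble in the opposite half
        word |= (cw & 0xF) << (4 * data_nibble)
    return word


# MSH/LSH exchange pattern of Table 9: CF group g holds g XOR 8.
exchange = [g ^ 0b1000 for g in range(16)]
print(f"{data_word_for_bank(exchange):016X}")
```

Under that assumption the exchange pattern needs each data nibble to carry its own index, so the script prints FEDCBA9876543210.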

These control nibbles may be loaded from the input as a 64-bit real-time input word or as two 32-bit words stored previously. To use stored control words, MSCLK and LSCLK are used to load the MSH and LSH input registers with the correct sequence of control nibbles. Whenever the flip-flops are loaded from the data inputs, all 64 bits of control data must be present when the CRCLK is used so that all control nibbles in a program are loaded simultaneously. Example 4 presents the three microcode instructions to load the MSH and LSH input registers and then to pass the registered data to flip-flop bank 2.

Example 4. Loading Control Flip-Flops from Input Registers
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline INST. NO. & CRSRCE & CRWRITE2 & CRWRITE1 & CRWRITE0 & \(\overline{\text{WE}}\) & SELDMS & SELDLS & \(\overline{\text{OED}}\)15-\(\overline{\text{OED}}\)0 & CRCLK & MSCLK & LSCLK & COMMENTS \\
\hline 1 & X & X & X & X & 1 & X & X & 1 & None & \(\uparrow\) & None & Load inputs D63-D32 \\
\hline 2 & X & X & X & X & 1 & X & X & 1 & None & None & \(\uparrow\) & Load inputs D31-D0 \\
\hline 3 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & \(\uparrow\) & None & None & Load control bank 2 \\
\hline
\end{tabular}

The control words in a program can also be read back from the flip-flops using the CNTR outputs. Four instructions are necessary to read the 64 bits in a bank of flip-flops out on CNTR15-CNTR0. \(\overline{\text{WE}}\) is held high and \(\overline{\text{OEC}}\) is taken low. No CRCLK signal is required. CREAD2-CREAD0 select bank 2 of flip-flops, and CRADR1-CRADR0 select in sequence the four addresses of the 16-bit words to be read out on the CNTR outputs. Example 5 shows the four microcode instructions.

Example 5. Reading Control Settings on CNTR Outputs
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow{2}{*}{INST. NO.} & \multirow{2}{*}{CREAD2} & \multirow{2}{*}{CREAD1} & \multirow{2}{*}{CREAD0} & \multirow{2}{*}{\(\overline{\text{OEC}}\)} & \multirow{2}{*}{CRADR1} & \multirow{2}{*}{CRADR0} & \multirow{2}{*}{\(\overline{\text{WE}}\)} & \multicolumn{4}{|c|}{CNTR I/O NUMBERS} & \multirow{2}{*}{COMMENT} \\
\hline & & & & & & & & 15-12 & 11-8 & 7-4 & 3-0 & \\
\hline 1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0100 & 0000 & 1100 & 1000 & Read CF12, CF8, CF4, CF0 \\
\hline 2 & 0 & 1 & 0 & 0 & 0 & 1 & 1 & 0101 & 0001 & 1101 & 1001 & Read CF13, CF9, CF5, CF1 \\
\hline 3 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0110 & 0010 & 1110 & 1010 & Read CF14, CF10, CF6, CF2 \\
\hline 4 & 0 & 1 & 0 & 0 & 1 & 1 & 1 & 0111 & 0011 & 1111 & 1011 & Read CF15, CF11, CF7, CF3 \\
\hline
\end{tabular}
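
A host program that saves and restores switch configurations has to turn the four 16-bit readback words into the 16 individual control words. The following sketch shows one way to do that, using the group ordering of Table 7; the function name `unpack_cntr_reads` is hypothetical, and the four input values are the CNTR patterns listed in Example 5.

```python
def unpack_cntr_reads(reads: list[int]) -> list[int]:
    """Convert four CNTR15-CNTR0 readback words into 16 control words.

    reads[a] is the word read with CRADR1-CRADR0 = a; the result is
    indexed by CF group number.
    """
    if len(reads) != 4:
        raise ValueError("expected one 16-bit word per CRADR1-CRADR0 address")
    control_words = [0] * 16
    for cradr, word in enumerate(reads):
        groups = (12 + cradr, 8 + cradr, 4 + cradr, cradr)  # CNTR15-12 ... CNTR3-0
        for position, group in enumerate(groups):
            shift = 12 - 4 * position
            control_words[group] = (word >> shift) & 0xF
    return control_words


# CNTR values from Example 5, instructions 1 through 4.
example5 = [0b0100_0000_1100_1000, 0b0101_0001_1101_1001,
            0b0110_0010_1110_1010, 0b0111_0011_1111_1011]
print(unpack_cntr_reads(example5))
```

Applied to the Example 5 values, the result is the MSH/LSH exchange pattern of Table 9: CF0 holds 1000, CF8 holds 0000, and so on.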

absolute maximum ratings over operating free-air temperature range (unless otherwise noted) \({ }^{\dagger}\)
\begin{tabular}{|c|c|}
\hline Supply voltage, \(\mathrm{V}_{\mathrm{CC}}\) & -0.5 V to 6 V \\
\hline Input clamp current, \(\mathrm{I}_{\mathrm{IK}}\) (\(\mathrm{V}_{\mathrm{I}}<0\) or \(\mathrm{V}_{\mathrm{I}}>\mathrm{V}_{\mathrm{CC}}\)) & \(\pm 20 \mathrm{~mA}\) \\
\hline  & \(\pm 50 \mathrm{~mA}\) \\
\hline Continuous output current, \(\mathrm{I}_{\mathrm{O}}\) (\(\mathrm{V}_{\mathrm{O}}=0\) to \(\mathrm{V}_{\mathrm{CC}}\)) & \(\pm 50 \mathrm{~mA}\) \\
\hline Continuous current through \(\mathrm{V}_{\mathrm{CC}}\) or GND pins & \(\pm 100 \mathrm{~mA}\) \\
\hline Operating free-air temperature range & \(0^{\circ} \mathrm{C}\) to \(70^{\circ} \mathrm{C}\) \\
\hline Storage temperature range & \(-65^{\circ} \mathrm{C}\) to \(150^{\circ} \mathrm{C}\) \\
\hline
\end{tabular}
\({ }^{\dagger}\)Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

\section*{recommended operating conditions}
\begin{tabular}{|c|c|c|c|c|c|}
\hline & PARAMETER & MIN & NOM & MAX & UNIT \\
\hline \(\mathrm{V}_{\mathrm{CC}}\) & Supply voltage & 4.5 & 5.0 & 5.5 & V \\
\hline \(\mathrm{V}_{\mathrm{IH}}\) & High-level input voltage & 2 & & \(\mathrm{V}_{\mathrm{CC}}\) & V \\
\hline \(\mathrm{V}_{\mathrm{IL}}\) & Low-level input voltage & 0 & & 0.8 & V \\
\hline \(\mathrm{I}_{\mathrm{OH}}\) & High-level output current & & & -8 & mA \\
\hline \(\mathrm{I}_{\mathrm{OL}}\) & Low-level output current & & & 8 & mA \\
\hline \(\mathrm{V}_{\mathrm{I}}\) & Input voltage & 0 & & \(\mathrm{V}_{\mathrm{CC}}\) & V \\
\hline \(\mathrm{V}_{\mathrm{O}}\) & Output voltage & 0 & & \(\mathrm{V}_{\mathrm{CC}}\) & V \\
\hline \(\mathrm{dt} / \mathrm{dv}\) & Input transition rise or fall rate & 0 & & 15 & \(\mathrm{ns} / \mathrm{V}\) \\
\hline \(\mathrm{T}_{\mathrm{A}}\) & Operating free-air temperature & 0 & & 70 & \({ }^{\circ} \mathrm{C}\) \\
\hline
\end{tabular}
electrical characteristics over recommended operating free-air temperature range (unless otherwise noted)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[b]{2}{*}{PARAMETER} & \multirow[b]{2}{*}{TEST CONDITIONS} & \multirow[b]{2}{*}{\(\mathrm{V}_{\mathrm{CC}}\)} & \multicolumn{3}{|c|}{\(\mathrm{T}_{\mathrm{A}}=25^{\circ} \mathrm{C}\)} & \multirow[b]{2}{*}{MIN} & \multirow[b]{2}{*}{TYP} & \multirow[b]{2}{*}{MAX} & \multirow[b]{2}{*}{UNIT} \\
\hline & & & MIN & TYP & MAX & & & & \\
\hline \multirow{4}{*}{\(\mathrm{V}_{\mathrm{OH}}\)} & \multirow[b]{2}{*}{\(\mathrm{I}_{\mathrm{OH}}=-20 \mu \mathrm{A}\)} & 4.5 V & & & & 4.4 & & & \multirow{4}{*}{V} \\
\hline & & 5.5 V & & & & 5.4 & & & \\
\hline & \multirow[b]{2}{*}{\(\mathrm{I}_{\mathrm{OH}}=-8 \mathrm{~mA}\)} & 4.5 V & & 3.8 & & 3.7 & & & \\
\hline & & 5.5 V & & 4.8 & & 4.7 & & & \\
\hline \multirow{4}{*}{\(\mathrm{V}_{\mathrm{OL}}\)} & \multirow[b]{2}{*}{\(\mathrm{I}_{\mathrm{OL}}=20 \mu \mathrm{A}\)} & 4.5 V & & & & & & 0.1 & \multirow{4}{*}{V} \\
\hline & & 5.5 V & & & & & & 0.1 & \\
\hline & \multirow[b]{2}{*}{\(\mathrm{I}_{\mathrm{OL}}=8 \mathrm{~mA}\)} & 4.5 V & & 0.32 & & & & 0.4 & \\
\hline & & 5.5 V & & 0.32 & & & & 0.4 & \\
\hline \(\mathrm{I}_{\mathrm{OZ}}\) & \(\mathrm{V}_{\mathrm{O}}=\mathrm{V}_{\mathrm{CC}}\) or 0 & 5 V & \multicolumn{3}{|r|}{\(\pm 0.5\)} & & & \(\pm 0.5\) & \(\mu \mathrm{A}\) \\
\hline \(\mathrm{I}_{\mathrm{I}}\) & \(\mathrm{V}_{\mathrm{I}}=\mathrm{V}_{\mathrm{CC}}\) or 0 & 5.5 V & \multicolumn{3}{|c|}{0.1} & & & \(\pm 1\) & \(\mu \mathrm{A}\) \\
\hline \(\mathrm{I}_{\mathrm{CC}}\) & \(\mathrm{V}_{\mathrm{I}}=\mathrm{V}_{\mathrm{CC}}\) or 0.10 & 5.5 V & & & & & & 100 & \(\mu \mathrm{A}\) \\
\hline \(\mathrm{C}_{\mathrm{I}}\) & \(\mathrm{V}_{\mathrm{I}}=\mathrm{V}_{\mathrm{CC}}\) or 0 & 5 V & & & & & & & pF \\
\hline
\end{tabular}
\({ }^{\dagger}\) This is the increase in supply current for each input that is at one of the specified TTL voltage levels rather than 0 V or \(\mathrm{V}_{\mathrm{CC}}\).

switching characteristics over recommended ranges of supply voltage and operating free-air temperature
(unless otherwise noted)
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline PARAMETER & FROM & TO & MIN & TYP\({ }^{\dagger}\) & MAX & UNIT \\
\hline \multirow{9}{*}{\(t_{\mathrm{pd}}\)} & Data in & \multirow{5}{*}{Data out} & & 7 & 14 & \multirow{9}{*}{ns} \\
\hline & MSCLK, LSCLK & & & 10 & 18 & \\
\hline & SELDMS, SELDLS & & & 9 & 15 & \\
\hline & CRCLK & & & 12 & 19 & \\
\hline & CRSEL3-CRSEL0 & & & 12 & 19 & \\
\hline & CREAD2-CREAD0 & \multirow{3}{*}{CNTR} & & 10 & 18 & \\
\hline & CRCLK & & & 10 & 18 & \\
\hline & CRADR1, CRADR0 & & & 8 & 16 & \\
\hline & TP1, TP0 & All outputs & & 10 & 19 & \\
\hline \multirow{3}{*}{\(t_{\mathrm{en}}\)} & TP1, TP0 & All outputs & & 10 & 15 & \multirow{3}{*}{ns} \\
\hline & \(\overline{\text{OED}}\) & Data out & & 7 & 12 & \\
\hline & \(\overline{\text{OEC}}\) & CNTR & & 8 & 14 & \\
\hline \multirow{3}{*}{\(t_{\mathrm{dis}}\)} & TP1, TP0 & All outputs & & 10 & 15 & \multirow{3}{*}{ns} \\
\hline & \(\overline{\text{OED}}\) & Data out & & 5 & 8 & \\
\hline & \(\overline{\text{OEC}}\) & CNTRn & & 6 & 10 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) All typical values are at \(\mathrm{V}_{\mathrm{CC}}=5 \mathrm{~V}, \mathrm{T}_{\mathrm{A}}=25^{\circ} \mathrm{C}\).
timing requirements over recommended ranges of supply voltage and operating free-air temperature (unless otherwise noted)
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{PARAMETER} & MIN & MAX & UNIT \\
\hline & Pulse duration & LSCLK, MSCLK, CRCLK high or low & 7 & & ns \\
\hline \multirow{7}{*}{\(t_{\mathrm{su}}\)} & \multirow{7}{*}{Setup time before CRCLK} & Data & 7 & & \multirow{7}{*}{ns} \\
\hline & & CNTR & 7 & & \\
\hline & & SELDMS, SELDLS & 9 & & \\
\hline & & CRADR1, CRADR0 & 8 & & \\
\hline & & CRSRCE, CRWRITE2-CRWRITE0 & 8 & & \\
\hline & & LSCLK, MSCLK & 10 & & \\
\hline & & \(\overline{\text{WE}}\) & 8 & & \\
\hline & \multicolumn{2}{|l|}{Setup time, data before LSCLK or MSCLK} & 7 & & ns \\
\hline \multirow{6}{*}{\(t_{\mathrm{h}}\)} & \multirow{6}{*}{Hold time after CRCLK} & Data & 0 & & \multirow{6}{*}{ns} \\
\hline & & CNTR & 0 & & \\
\hline & & SELDMS, SELDLS & 0 & & \\
\hline & & CRADR1, CRADR0 & 0 & & \\
\hline & & CRSRCE, CRWRITE & 0 & & \\
\hline & & \(\overline{\text{WE}}\) & 0 & & \\
\hline & \multicolumn{2}{|l|}{Hold time, data after LSCLK or MSCLK} & 0 & & ns \\
\hline
\end{tabular}

\section*{'AS8840 AND 'ACT8841 FUNCTIONAL COMPARISON}

\section*{differences between the SN74AS8840 and the SN74ACT8841}

The SN74AS8840 and the SN74ACT8841 digital crossbar switches essentially perform the same function. The SN74AS8840 and the SN74ACT8841 are based on the same 16-port architecture, differing in the number of control registers, power consumption, and pin-out.

One difference is in the number of programmable control flip-flop banks available to configure the switch. The 'AS8840 has two programmable control banks, while the 'ACT8841 has eight. Both have two selectable hard-wired switching configurations.

The increased number of control banks in the 'ACT8841 requires six additional pins not found on the 'AS8840. These are: CRWRITE2, CRWRITE1, CREAD2, CREAD1, CRSEL3, and CRSEL2. CREAD and CRWRITE on the '8840 become CREAD0 and CRWRITE0 on the '8841. On the '8840, CRSEL1 selects the hardwired control functions when high. This function is performed by the CRSEL3 signal on the '8841. Therefore, CRSEL2 and CRSEL1 are actually the added signals.

The 'ACT8841 is a low-power CMOS device requiring only \(5-\mathrm{V}\) power. Because of its STL internal logic and TTL I/Os, the 'AS8840 requires both \(2-\mathrm{V}\) and \(5-\mathrm{V}\) power.

Both the 'AS8840 and the 'ACT8841 are in 156-pin grid-array packages; however, the two devices are not pin-for-pin compatible. Control signals were added to the 'ACT8841 and the 2-V VCC pins ('AS8840 only) were assigned other functions in the 'ACT8841.

\section*{changing 'AS8840 microcode to 'ACT8841 microcode}

Since only six signals have been added to the 'ACT8841, changing existing 'AS8840 microcode to 'ACT8841 microcode is straightforward. CRSEL3 on the 'ACT8841 is functionally equivalent to CRSEL1 on the 'AS8840. CREAD2, CREAD1, CRWRITE2, CRWRITE1, CRSEL2, and CRSEL1 bits must be added. These can always be 0 if no additional control banks are needed. Additional control configurations can be stored by programming these bits.

All other signals in the 'AS8840 microcode remain the same when converting to 'ACT8841 microcode.
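
The field mapping described above can be sketched as a small conversion routine. The representation of a microinstruction as a dictionary of named fields, and the function name `convert_as8840_to_act8841`, are assumptions made for illustration; real microcode layouts depend on the system in which the switch is used.

```python
def convert_as8840_to_act8841(word: dict) -> dict:
    """Map one 'AS8840 microinstruction (field name -> bit) to 'ACT8841 fields.

    The 'AS8840 fields CREAD, CRWRITE, CRSEL1, and CRSEL0 are renamed or
    moved; the added 'ACT8841 bits are set to 0, which keeps operation
    within the first two control banks. All other fields are copied as-is.
    """
    renamed = ("CREAD", "CRWRITE", "CRSEL1", "CRSEL0")
    out = {name: value for name, value in word.items() if name not in renamed}
    out.update({
        "CREAD2": 0, "CREAD1": 0, "CREAD0": word["CREAD"],
        "CRWRITE2": 0, "CRWRITE1": 0, "CRWRITE0": word["CRWRITE"],
        "CRSEL3": word["CRSEL1"],   # hard-wired function select moves to CRSEL3
        "CRSEL2": 0, "CRSEL1": 0,   # added bank-select bits
        "CRSEL0": word["CRSEL0"],
    })
    return out
```

For example, an 'AS8840 instruction with CRSEL1 = 1 and CRSEL0 = 0 converts to CRSEL3-CRSEL0 = 1000 on the 'ACT8841.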


\title{
SN74ACT8847 64-Bit Floating Point Unit
}
- Meets IEEE Standard for Single- and Double-Precision Formats
- Performs Floating Point and Integer Add, Subtract, Multiply, Divide, Square Root, and Compare
- 64-Bit IEEE Divide in 11 Cycles, 64-Bit Square Root in 14 Cycles
- Performs Logical Operations and Logical Shifts
- Superset of TI's SN74ACT8837
- 30-ns, 40-ns and 50-ns Pipelined Performance
- Low-Power EPIC™ CMOS

The SN74ACT8847 is a high-speed, double-precision floating point and integer processor. It performs high-accuracy, scientific computations as part of a customized host processor or as a powerful stand-alone device. Its advanced math processing capabilities allow the chip to accelerate the performance of both CISC- and RISC-based systems.

High-end computer systems, such as graphics workstations, mini-computers and 32-bit personal computers, can utilize the single-chip 'ACT8847 for both floating point and integer functions.

EPIC is a trademark of Texas Instruments Incorporated.

\section*{Contents}
Overview
Understanding the 'ACT8847 Floating Point Unit
Microprogramming the 'ACT8847
Support Tools
Design Support
Design Expertise
'ACT8847 Logic Symbol
'ACT8847 Pin Descriptions
'ACT8847 Specifications
'ACT8847 Load Circuit
SN74ACT8847 64-Bit Floating Point Unit
Introduction
Major Architectural Features
Data Flow in Pipelined Architectures
Control Architectures for High-Speed Microprogrammed Architectures
Microprogram Control of an 'ACT8847 FPU Subsystem
'ACT8847 Data Formats
Status Outputs
SN74ACT8847 Architecture
Overview
Pipeline Controls
Temporary Input Register
RA and RB Input Registers
Configuration Controls
Clock Mode Settings
Operand Selection
C Register
Pipelined ALU
Pipelined Multiplier
Data Output Controls
Parity Checker/Generator
Master/Slave Comparator
Status and Exception Generation

Microprogramming the 'ACT8847
Control Inputs
Rounding Modes
FAST and IEEE Modes
Handling of Denormalized Numbers
Stalling the Device
Reset
Test Pins
Independent ALU Operations
Independent Multiplier Operations
Chained Multiplier/ALU Operations
Sample Independent ALU Microinstructions
Sample Independent Multiplier Microinstructions
Sample Chained Mode Microinstructions
Instruction Timing
Exception and Status Handling
'ACT8847 Reference Guide
Instruction Inputs
Input Configuration
Operand Source Select
Pipeline Control
Round Control
Status Output Selection
Test Pin Control
Miscellaneous Control Inputs
Glossary
SN74ACT8847 Application Notes
Sum of Products and Product of Sums

Matrix Operations
Representation of Variables
Sample Matrix Transformation
Microinstructions for Sample Matrix Manipulation
Chebyshev Routines for the SN74ACT8847 FPU
Introduction
Overview of Chebyshev's Expansion Method
Format for the Remainder of the Application Note
References
Cosine Routine Using Chebyshev's Method
Steps Required to Perform the Calculation
Algorithms for the Three Steps
Required System Intervention
Number of 'ACT8847 Cycles Required to Calculate Cosine(x)
Listing of the Chebyshev Constants (c's)
Pseudocode Table for the Cosine(x) Calculation
Microcode Table for the Cosine(x) Calculation
Sine Routine Using Chebyshev's Method
Steps Required to Perform the Calculation
Algorithms for the Three Steps
Required System Intervention
Number of 'ACT8847 Cycles Required to Calculate Sine(x)
Listing of the Chebyshev Constants (c's)
Pseudocode Table for the Sine(x) Calculation
Microcode Table for the Sine(x) Calculation

Tangent Routine Using Chebyshev's Method
Steps Required to Perform the Calculation
Algorithms for the Three Steps
Required System Intervention
Number of 'ACT8847 Cycles Required to Calculate Tangent(x)
Listing of the Chebyshev Constants (c's)
Pseudocode Table for the Tangent(x) Calculation
Microcode Table for the Tangent(x) Calculation
ArcSine and ArcCosine Routine Using Chebyshev's Method
Steps Required to Perform the Calculation
Algorithms for the Three Steps
Required System Intervention
Number of 'ACT8847 Cycles Required to Calculate ArcSine(x) and ArcCosine(x)
Listing of the Chebyshev Constants (c's)
Pseudocode Table for the ArcSine(x) and ArcCosine Calculation
Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation
ArcTangent Routine Using Chebyshev's Method
Steps Required to Perform the Calculation
Algorithms for the Three Steps
Required System Intervention
Number of 'ACT8847 Cycles Required to Calculate Arctangent(x)
Listing of the Chebyshev Constants (c's)
Pseudocode Table for the ArcTangent(x) Calculation
Microcode Table for the ArcTangent(x) Calculation

Exponential Routine Using Chebyshev's Method
Steps Required to Perform the Calculation
Algorithms for the Three Steps
Required System Intervention
Number of 'ACT8847 Cycles Required to Calculate Exp(x)
Listing of the Chebyshev Constants (c's)
Pseudocode Table for the Exp(x) Calculation
Microcode Table for the Exp(x) Calculation
High-Speed Vector Math and 3-D Graphics
Introduction
SN74ACT8837 and SN74ACT8847 Floating Point Units
Mathematical Processing Applications
Graphics Applications
Vector Arithmetic
Computational Operations on Data Vectors
Compare Operations on Data Vectors
Graphics Applications
Creating a 3-D Image
Three-Dimensional Coordinate Transforms
Three-Dimensional Clipping
Summary of Graphics Systems Performance

\section*{List of Illustrations}
1 Load Circuit
2 Timing Diagram for: SP ALU \(\rightarrow\) DP MULT \(\rightarrow\) DP ALU \(\rightarrow\) CONVERT DP TO SP
3 Timing Diagram for: DP ALU \(\rightarrow\) DP MULT \(\rightarrow\) DP ALU
4 Timing Diagram for: SP ((Scalar * Vector) + Vector)
5 High Level Block Diagram
6 Multiply/Accumulate Operation
7 Example of Fully Pipelined Operation
8 PLA Control Circuit Example
9 Microprogrammed Architecture
10 Microprogrammed Architecture with Address Register
11 IEEE Single-Precision Format
12 IEEE Double-Precision Format
13 'ACT8847 Detailed Block Diagram
14 Pipeline Controls
15 Input Register Control
16 Operand Selection Multiplexer
17 C Register Timing
18 Functional Diagram for ALU
19 Functional Diagram for Multiplier
20 Y Output Control
21 Example of Master/Slave Operation
22 Status Output Control
23 Exception Detect Mask Logic
24 Single-Precision Independent ALU Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = X)
25 Single-Precision Independent ALU Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
26 Single-Precision Independent ALU Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)

27 Single-Precision Independent ALU Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
28 Double-Precision Independent ALU Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0)
29 Double-Precision Independent ALU Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 0)
30 Double-Precision Independent ALU Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1)
31 Double-Precision Independent ALU Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0)
32 Single-Precision Independent Multiplier Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = X)
33 Single-Precision Independent Multiplier Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X)
34 Single-Precision Independent Multiplier Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X)
35 Single-Precision Independent Multiplier Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X)
36 Double-Precision Independent Multiplier Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0)
37 Double-Precision Independent Multiplier Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1)
38 Double-Precision Independent Multiplier Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 0)

\section*{List of Illustrations (Continued)}
Figure Page
39 Double-Precision Independent Multiplier Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0) ..... 7-113
40 Single-Precision Floating Point Division, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X) ..... 7-114
41 Single-Precision Floating Point Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X) ..... 7-114
42 Single-Precision Floating Point Division, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X) ..... 7-115
43 Single-Precision Floating Point Division, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X) ..... 7-115
44 Double-Precision Floating Point Division, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 0) ..... 7-116
45 Double-Precision Floating Point Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = 0) ..... 7-116
46 Double-Precision Floating Point Division, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1) ..... 7-117
47 Double-Precision Floating Point Division, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 1) ..... 7-117
48 Integer Division, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X) ..... 7-118
49 Integer Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X) ..... 7-118
50 Integer Division, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X) ..... 7-119
51 Integer Division, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X) ..... 7-119
52 Single-Precision Floating Point Square Root, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X) ..... 7-120

53 Single-Precision Floating Point Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X) ..... 7-120
54 Single-Precision Floating Point Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X) ..... 7-121
55 Single-Precision Floating Point Square Root, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X) ..... 7-121
56 Double-Precision Floating Point Square Root, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1) ..... 7-122
57 Double-Precision Floating Point Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = 0) ..... 7-122
58 Double-Precision Floating Point Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 1) ..... 7-123
59 Double-Precision Floating Point Square Root, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0) ..... 7-123
60 Integer Square Root, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = X) ..... 7-124
61 Integer Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 = 100, CLKMODE = X) ..... 7-124
62 Integer Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X) ..... 7-125
63 Integer Square Root, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X) ..... 7-125
64 Single-Precision Chained Mode Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = X) ..... 7-126
65 Single-Precision Chained Mode Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1) ..... 7-127
66 Single-Precision Chained Mode Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = X) ..... 7-128

67 Single-Precision Chained Mode Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = X) ..... 7-129
68 Double-Precision Chained Mode Operation, All Registers Disabled (PIPES2-PIPES0 = 111, CLKMODE = 0) ..... 7-130
69 Double-Precision Chained Mode Operation, Input Registers Enabled (PIPES2-PIPES0 = 110, CLKMODE = 1) ..... 7-131
70 Double-Precision Chained Mode Operation, Input and Output Registers Enabled (PIPES2-PIPES0 = 010, CLKMODE = 0) ..... 7-132
71 Double-Precision Chained Mode Operation, All Registers Enabled (PIPES2-PIPES0 = 000, CLKMODE = 0) ..... 7-133
72 Sequence of Matrix Operations ..... 7-154
73 Resultant Matrix Transformation ..... 7-161
74 SN74ACT8837 Floating Point Unit ..... 7-225
75 SN74ACT8847 Floating Point Unit ..... 7-226
76 Creating a 3-D Image ..... 7-245
77 View Volume ..... 7-246
78a Model of Procedure for Creating a 3-D Graphic ..... 7-247
78b Model of Creating and Transforming a 3-D Graphic ..... 7-247
79 Viewing Pyramid Showing Six Clipping Planes ..... 7-251


\section*{List of Tables}
Table Page
1 'ACT8847 Pin Grid Allocation ..... 7-27
2 'ACT8847 Pin Functional Description ..... 7-28
3 Sum of Products Calculation ..... 7-51
4 IEEE Floating Point Number Representations ..... 7-59
5 Pipeline Controls (PIPES2-PIPES0) ..... 7-64
6 Double-Precision Input Data Configuration Modes ..... 7-66
7 Single-Precision Input Data Configuration Mode ..... 7-66
8a Double-Precision CREG + PREG Using CLKMODE \(=0\), PIPES \(=010\) ..... 7-67
8b Double-Precision CREG + PREG Using CLKMODE \(=1\), PIPES = 010 ..... 7-67
9a Double-Precision PREG + RB Using CLKMODE \(=0\), PIPES = 010 ..... 7-68
9b Double-Precision PREG + RB Using CLKMODE \(=1\), PIPES = 010 ..... 7-68
10 Multiplier Input Selection ..... 7-68
11 ALU Input Selection ..... 7-70
12 Independent ALU Operations ..... 7-73
13 Independent Multiplier Operations ..... 7-74
14 Comparison Status Outputs ..... 7-77
15 Status Outputs ..... 7-78
16 Status Output Selection (Chained Mode) ..... 7-80
17 Control Inputs ..... 7-83
18 Rounding Modes ..... 7-84
19 Handling Wrapped Multiplier Outputs ..... 7-85
20 Test Pin Control Inputs ..... 7-86
21 Independent ALU Operations, Single Floating Point Operand (I10 = 0, I9 = 0, I7 = 0, I6 = 0) ..... 7-88
22 Independent ALU Operations, Single Integer Operand (I10 = 0, I9 = 1, I6 = 0) ..... 7-89
23 Independent ALU Operations, Two Floating Point Operands (I10 = 0, I9 = 0, I5 = 0) ..... 7-91
24 Independent ALU Operations, Two Integer Operands (I10 = 0, I9 = 1, I6 = 0) ..... 7-91
25 Loading the Exception Detect Mask Register ..... 7-92

26 NOP Instruction ..... 7-92
27 Independent Multiplier Operations ..... 7-94
28 Independent Multiply Operations Selected by I4-I2 (I10 = 0, I6 = 1, I5 = 0) ..... 7-95
29 Independent Divide/Square Root Operations Selected by I4-I2 (I10 = 0, I6 = 1, I5 = 1) ..... 7-95
30 Chained Multiplier/ALU Operations (I10 = 1) ..... 7-97
31 Number of Clocks Required to Complete an Operation ..... 7-134
32 NOPs Inserted to Guarantee that Double-Precision Results Remain Valid for Two Clock Cycles (PIPES2-PIPES0 = 000) ..... 7-135
33 Independent ALU Operations, Single Floating Point Operand ..... 7-139
34 Independent ALU Operations, Two Floating Point Operands ..... 7-140
35 Independent ALU Operations, One Integer Operand ..... 7-140
36 Independent ALU Operations, Two Integer Operands ..... 7-141
37 Independent Floating Point Multiply Operations ..... 7-141
38 Independent Floating Point Divide/Square Root Operations ..... 7-141
39 Independent Integer Multiply/Divide/Square Root Operations ..... 7-142
40 Chained Multiplier/ALU Floating Point Operations ..... 7-142
41 Chained Multiplier/ALU Integer Operations ..... 7-143
42 Double-Precision Input Data Configuration Modes ..... 7-144
43 Multiplier Input Selection ..... 7-144
44 ALU Input Selection ..... 7-145
45 Pipeline Controls (PIPES2-PIPES0) ..... 7-145
46 Rounding Modes ..... 7-145
47 Status Output Selection (Chained Mode) ..... 7-146
48 Test Pin Control Inputs ..... 7-146
49 Miscellaneous Control Inputs ..... 7-147
50 Pseudocode for Fully Pipelined Double-Precision Sum of Products (CLKMODE = 0, CONFIG1-CONFIG0 = 10, PIPES2-PIPES0 = 000) ..... 7-149

51 Pseudocode for Fully Pipelined Double-Precision Product of Sums (CLKMODE = 0, CONFIG1-CONFIG0 = 10, PIPES2-PIPES0 = 000) ..... 7-150
52 Single-Precision Matrix Multiplication (PIPES2-PIPES0 = 010) ..... 7-159
53 Microinstructions for Sample Matrix Multiplication ..... 7-160
54 Fully Pipelined Single-Precision Sum of Products (PIPES2-PIPES0 = 000) ..... 7-162
55 Cycle Count and Execution Speed for the Seven Chebyshev Functions ..... 7-163
56 Pseudocode for Chebyshev Cosine Routine (PIPES2-PIPES0 = 010, RND1-RND0 = 00) ..... 7-169
57 Pseudocode for Chebyshev Sine Routine (PIPES2-PIPES0 = 010, RND1-RND0 = 00) ..... 7-176
58 Pseudocode for Chebyshev Tangent Routine (PIPES2-PIPES0 = 010, RND1-RND0 = 00) ..... 7-184
59 Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-PIPES0 = 010, RND1-RND0 = 00) ..... 7-195
60 Pseudocode for Chebyshev ArcTangent Routine (PIPES2-PIPES0 = 010, RND1-RND0 = 00) ..... 7-205
61 Pseudocode for Chebyshev Exponential Routine (PIPES2-PIPES0 = 010, RND1-RND0 = 00) ..... 7-217
62 Data Flow for Pipelined Single-Precision Vector Add, \(N=6\) ..... 7-229
63 Program Listing for Pipelined Single-Precision Vector Add, \(\mathrm{N}=6\) ..... 7-230
64 Data Flow for Pipelined Single-Precision Vector Multiply, \(\mathrm{N}=6\) ..... 7-230
65 Program Listing for Pipelined Single-Precision Vector Multiply, \(N=6\) ..... 7-231
66 Data Flow for Unpiped Single-Precision Vector Multiply, \(\mathrm{N}=6\) ..... 7-231
67 Data Flow for Pipelined Single-Precision Sum of Products, \(\mathrm{N}=8\) ..... 7-232

68 Program Listing for Pipelined Single-Precision Sum of Products, \(N=8\) ..... 7-233
69 Data Flow for Unpiped Single-Precision Sum of Products, \(N=8\) ..... 7-233
70 Data Flow for 'ACT8837 Pipelined Single-Precision Vector Divide, \(N=1\) ..... 7-235
71 Program Listing for 'ACT8837 Pipelined Single-Precision Vector Divide, \(N=1\) ..... 7-236
72 Data Flow for 'ACT8837 Pipelined Single-Precision Interleaved Vector Divide, \(N=2\) ..... 7-236
73 Data Flow for 'ACT8837 Unpiped Single-Precision Interleaved Vector Divide, \(N=1\) ..... 7-237
74 Data Flow for 'ACT8847 Pipelined Single-Precision Vector Divide ..... 7-238
75 Program Listing for 'ACT8847 Pipelined Single-Precision Vector Divide ..... 7-238
76 Data Flow for Pipelined Single-Precision Vector MAX ..... 7-240
77 Data Flow for Pipelined Single-Precision Interleaved Vector MAX/MIN ..... 7-240
78 Program Listing for Pipelined Single-Precision Interleaved Vector MAX/MIN ..... 7-241
79 Data Flow for Unpiped Single-Precision Vector MAX ..... 7-241
80 Data Flow for Pipelined Single-Precision List MAX ..... 7-242
81 Program Listing for Pipelined Single-Precision List MAX ..... 7-243
82 Partial Data Flow for Product of [X, Y, Z, W] and General Transform Matrix ..... 7-248
83 Partial Data Flow for Product of [X, Y, Z, W] and Reduced Transform Matrix ..... 7-249
84 Data Flow for Clipping a Line Segment Against the \(Z=N\) Plane ..... 7-253
85 Program Listing for Clipping a Line Segment Against the \(Z=N\) Plane ..... 7-254
86 Data Flow for Clipping a Line Segment Against the \(Z=N\) Plane ..... 7-255

87 Data Flow for Computing t1, t2, s1, and s2 Using an SN74ACT8837 ..... 7-257
88 Program Listing for Three-Processor Clip to Compute t1, t2, s1, and s2 ..... 7-258
89 A > B Comparison Function Table ..... 7-259
90 Data Flow for Accept/Reject Testing ..... 7-260
91 Data Flow for the X Processor ..... 7-262
92 Program Listing for the X Processor ..... 7-262
93 Summary of Graphics Systems Performance ..... 7-263
94 Available Options for Graphic System Designs ..... 7-263

\section*{Overview}

Using a top-down approach, this user guide contains the following major sections:
Introduction (to Microprogrammed Architectures and the 'ACT8847)
SN74ACT8847 Architecture
Microprogramming the 'ACT8847
Easy-to-Access Reference Guide
Application Notes

The SN74ACT8847 combines a multiplier and an arithmetic-logic unit in a single microprogrammable VLSI device. The 'ACT8847 is implemented in Texas Instruments one-micron CMOS technology to offer high speed and low power consumption with exceptional flexibility and functional integration. The FPUs can be microprogrammed to operate in multiple modes to support a variety of floating point applications.

The 'ACT8847 is fully compatible with the IEEE standard for binary floating point arithmetic, STD 754-1985. This FPU performs both single- and double-precision operations, integer operations, logical operations, and division and square root operations (as single microinstructions).

\section*{Understanding the 'ACT8847 Floating Point Unit}

To support floating point processing in IEEE format, the 'ACT8847 may be configured for either single- or double-precision operation. Instruction inputs select one of three modes of operation: independent ALU operations, independent multiplier operations, or simultaneous (chained) ALU and multiplier operations.

Three levels of internal data registers are available. The device can be used in flowthrough mode (all registers disabled), pipelined mode (all registers enabled), or in other available register configurations. An instruction register, a 64-bit constant register, and a status register are also provided.
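
The register configurations described above can be sketched as a small decoder. This is only an illustration, not TI code: the function name `decode_pipes` is hypothetical, and the bit assignment (each control active low, PIPES0 for the input registers, PIPES1 for the pipeline registers, PIPES2 for the output registers) is an assumption inferred from the figure titles in this manual, such as 111 for all registers disabled and 010 for input and output registers enabled.

```python
def decode_pipes(pipes: int) -> dict:
    """Decode a 3-bit PIPES2-PIPES0 value into register-enable flags.

    Assumption (inferred from the figure titles, not stated outright):
    each control bit is active low, with PIPES0 gating the input
    registers, PIPES1 the pipeline registers, and PIPES2 the output
    registers.
    """
    if not 0 <= pipes <= 0b111:
        raise ValueError("PIPES2-PIPES0 is a 3-bit field")
    return {
        "input": (pipes & 0b001) == 0,     # PIPES0 low -> input registers enabled
        "pipeline": (pipes & 0b010) == 0,  # PIPES1 low -> pipeline registers enabled
        "output": (pipes & 0b100) == 0,    # PIPES2 low -> output registers enabled
    }


if __name__ == "__main__":
    # Walk through the five configurations named in the figure titles.
    for code in (0b111, 0b110, 0b100, 0b010, 0b000):
        print(format(code, "03b"), decode_pipes(code))
```

Under these assumptions, 111 corresponds to flowthrough mode and 000 to the fully pipelined mode.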

Each FPU can handle three types of data input formats. The ALU accepts data operands in integer format or IEEE floating point format. A third type of operand, denormalized numbers, can also be processed after the ALU has converted them to "wrapped" numbers, which are explained in detail in a later section. The 'ACT8847 multiplier operates on normalized floating point numbers, wrapped numbers, and integer operands.
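
To make the distinction between normalized and denormalized operands concrete, the following sketch classifies a value by its IEEE single-precision bit fields. It models only the standard IEEE 754 encoding on a host computer; it does not model the 'ACT8847 "wrapped" format, and the function name `classify_single` is hypothetical.

```python
import struct


def classify_single(x: float) -> str:
    """Classify x by its IEEE 754 single-precision encoding."""
    # Re-encode the Python float as a 32-bit single and read it back as an integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    if exponent == 0xFF:
        return "nan" if fraction else "infinity"
    if exponent == 0:
        # Zero exponent with a nonzero fraction is a denormalized number.
        return "denormalized" if fraction else "zero"
    return "normalized"


if __name__ == "__main__":
    for value in (1.0, 0.0, 1e-40, float("inf"), float("nan")):
        print(value, classify_single(value))
```

A value such as 1e-40 lies below the smallest normalized single-precision magnitude, so it is the kind of operand that must be converted before the multiplier can use it.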

\section*{Microprogramming the 'ACT8847}

The 'ACT8847 is a fully microprogrammable device. Each FPU operation is specified by a microinstruction or sequence of microinstructions which set up the control inputs of the FPU so that the desired operation is performed.

\section*{Support Tools}

Texas Instruments has developed functional evaluation models of the 'ACT8847 in software which permit designers to simulate operation of the FPU. To evaluate the functions of an FPU, a designer can create a microprogram with sample data inputs, and the simulator will emulate FPU operation to produce sample data output files, as well as several diagnostic displays to show specific aspects of device operation. Sample microprogram sequences are included in this section.

\section*{Design Support}

Texas Instruments Regional Technology Centers, staffed with systems-oriented engineers, offer a training course to assist users in applying TI LSI products to digital processor systems. Specific attention is given to understanding and developing design techniques that implement efficient algorithms, matching high-performance hardware capabilities to desired performance levels.

Information on VLSI devices and product support can be obtained from the following Regional Technology Centers:

\section*{Atlanta}

Texas Instruments Incorporated
3300 N.E. Expressway, Building 8
Atlanta, GA 30341
404/662-7945

\section*{Boston}

Texas Instruments Incorporated
950 Winter Street, Suite 2800
Waltham, MA 02154
617/895-9100

\section*{Northern California}

Texas Instruments Incorporated
5353 Betsy Ross Drive
Santa Clara, CA 95054
408/748-2220

\section*{Chicago}

Texas Instruments Incorporated
515 Algonquin
Arlington Heights, IL 60005
312/640-2909

\section*{Dallas}

Texas Instruments Incorporated
10001 E. Campbell Road
Richardson, TX 75081
214/680-5066

\section*{Southern California}

Texas Instruments Incorporated
17891 Cartwright Drive
Irvine, CA 92714
714/660-8140

\section*{Design Expertise}

Texas Instruments can provide in-depth technical design assistance through consultations with contract design services. Contact your local Field Sales Engineer for current information or contact VLSI Systems Engineering at 214/997-3970.

\section*{'ACT8847 Logic Symbol}


\section*{'ACT8847 Pin Descriptions}

Pin descriptions and grid allocation for the 'ACT8847 are given on the following pages. The pin at location A1 has been omitted for indexing purposes.

208 PIN . . . GB PACKAGE
(TOP VIEW)


Table 1. 'ACT8847 Pin Grid Allocation
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multicolumn{2}{|r|}{PIN} & \multicolumn{2}{|r|}{PIN} & \multicolumn{2}{|r|}{PIN} & \multicolumn{2}{|r|}{PIN} & \multicolumn{2}{|r|}{PIN} & \multicolumn{2}{|r|}{PIN} \\
\hline NO. & NAME & NO. & NAME & NO. & NAME & NO. & NAME & NO. & NAME & NO. & NAME \\
\hline A1 & missing & C2 & Y0 & E3 & FAST & J15 & FLOWC & P1 & ENRC & S1 & NC \\
\hline A2 & INF & C3 & Y3 & E4 & GND & J16 & SRCC & P2 & PIPES0 & S2 & PB0 \\
\hline A3 & Y5 & C4 & Y6 & E14 & GND & J17 & BYTEP & P3 & RESET & S3 & DB0 \\
\hline A4 & Y8 & C5 & Y9 & E15 & AGTB & K1 & SELOP3 & P4 & PB1 & S4 & DB4 \\
\hline A5 & Y11 & C6 & Y12 & E16 & AEQB & K2 & SELOP4 & P5 & DB1 & S5 & DB11 \\
\hline A6 & Y14 & C7 & Y15 & E17 & MSERR & K3 & SELOP5 & P6 & DB5 & S6 & DB12 \\
\hline A7 & Y17 & C8 & Y18 & F1 & I5 & K4 & GND & P7 & DB9 & S7 & DB15 \\
\hline A8 & Y20 & C9 & Y23 & F2 & I3 & K14 & GND & P8 & DB16 & S8 & DB19 \\
\hline A9 & Y21 & C10 & Y26 & F3 & RND0 & K15 & PA1 & P9 & DB21 & S9 & DB23 \\
\hline A10 & Y24 & C11 & Y30 & F4 & GND & K16 & PA2 & P10 & DB28 & S10 & DB26 \\
\hline A11 & Y27 & C12 & PY1 & F14 & GND & K17 & PA3 & P11 & DA0 & S11 & DB30 \\
\hline A12 & Y29 & C13 & UNDER & F15 & PERRA & L1 & SELOP6 & P12 & DA4 & S12 & DA2 \\
\hline A13 & PY0 & C14 & INEX & F16 & \(\overline{\text{OEY}}\) & L2 & SELOP7 & P13 & DA8 & S13 & DA6 \\
\hline A14 & PY3 & C15 & DENIN & F17 & \(\overline{\text{OES}}\) & L3 & CLK & P14 & DA12 & S14 & DA10 \\
\hline A15 & IVAL & C16 & SRCEX & G1 & I7 & L4 & \(V_{CC}\) & P15 & DA19 & S15 & DA14 \\
\hline A16 & NEG & C17 & CHEX & G2 & I6 & L14 & GND & P16 & DA22 & S16 & DA15 \\
\hline A17 & NC & D1 & I1 & G3 & I4 & L15 & DA30 & P17 & DA23 & S17 & DA17 \\
\hline B1 & ED & D2 & RND1 & G4 & \(V_{CC}\) & L16 & DA31 & R1 & PIPES1 & T1 & NC \\
\hline B2 & Y2 & D3 & Y1 & G14 & \(V_{CC}\) & L17 & PA0 & R2 & HALT & T2 & PB3 \\
\hline B3 & Y4 & D4 & GND & G15 & \(\overline{\text{OEC}}\) & M1 & ENRB & R3 & PB2 & T3 & DB3 \\
\hline B4 & Y7 & D5 & \(V_{CC}\) & G16 & SELMS/\(\overline{\text{LS}}\) & M2 & ENRA & R4 & DB2 & T4 & DB7 \\
\hline B5 & Y10 & D6 & GND & G17 & TEST1 & M3 & CLKC & R5 & DB6 & T5 & DB8 \\
\hline B6 & Y13 & D7 & GND & H1 & I10 & M4 & GND & R6 & DB10 & T6 & DB13 \\
\hline B7 & Y16 & D8 & \(V_{CC}\) & H2 & I9 & M14 & \(V_{CC}\) & R7 & DB14 & T7 & DB17 \\
\hline B8 & Y19 & D9 & GND & H3 & I8 & M15 & DA27 & R8 & DB18 & T8 & DB20 \\
\hline B9 & Y22 & D10 & GND & H4 & GND & M16 & DA28 & R9 & DB22 & T9 & DB24 \\
\hline B10 & Y25 & D11 & \(V_{CC}\) & H14 & GND & M17 & DA29 & R10 & DB27 & T10 & DB25 \\
\hline B11 & Y28 & D12 & GND & H15 & TEST0 & N1 & CONFIG0 & R11 & DB31 & T11 & DB29 \\
\hline B12 & Y31 & D13 & GND & H16 & SELST1 & N2 & CONFIG1 & R12 & DA3 & T12 & DA1 \\
\hline B13 & PY2 & D14 & \(V_{CC}\) & H17 & SELST0 & N3 & CLKMODE & R13 & DA7 & T13 & DA5 \\
\hline B14 & OVER & D15 & STEX1 & J1 & SELOP2 & N4 & PIPES2 & R14 & DA11 & T14 & DA9 \\
\hline B15 & RNDCO & D16 & STEX0 & J2 & SELOP1 & N14 & DA18 & R15 & DA16 & T15 & DA13 \\
\hline B16 & DENORM & D17 & UNORD & J3 & SELOP0 & N15 & DA24 & R16 & DA20 & T16 & NC \\
\hline B17 & DIVBY0 & E1 & I2 & J4 & \(V_{CC}\) & N16 & DA25 & R17 & DA21 & T17 & NC \\
\hline C1 & PERRB & E2 & I0 & J14 & \(V_{CC}\) & N17 & DA26 & & & & \\
\hline
\end{tabular}

Table 2. 'ACT8847 Pin Functional Description
\begin{tabular}{|c|c|c|c|}
\hline NAME & NO. & I/O/Z \({ }^{\dagger}\) & DESCRIPTION \\
\hline \multicolumn{4}{|r|}{DATA BUS SIGNALS (96 PINS)} \\
\hline DA0 & P11 & \multirow{32}{*}{I} & \multirow{32}{*}{DA 32-bit input data bus. Data can be latched in a 64-bit temporary register or loaded directly into an input register.} \\
\hline DA1 & T12 & & \\
\hline DA2 & S12 & & \\
\hline DA3 & R12 & & \\
\hline DA4 & P12 & & \\
\hline DA5 & T13 & & \\
\hline DA6 & S13 & & \\
\hline DA7 & R13 & & \\
\hline DA8 & P13 & & \\
\hline DA9 & T14 & & \\
\hline DA10 & S14 & & \\
\hline DA11 & R14 & & \\
\hline DA12 & P14 & & \\
\hline DA13 & T15 & & \\
\hline DA14 & S15 & & \\
\hline DA15 & S16 & & \\
\hline DA16 & R15 & & \\
\hline DA17 & S17 & & \\
\hline DA18 & N14 & & \\
\hline DA19 & P15 & & \\
\hline DA20 & R16 & & \\
\hline DA21 & R17 & & \\
\hline DA22 & P16 & & \\
\hline DA23 & P17 & & \\
\hline DA24 & N15 & & \\
\hline DA25 & N16 & & \\
\hline DA26 & N17 & & \\
\hline DA27 & M15 & & \\
\hline DA28 & M16 & & \\
\hline DA29 & M17 & & \\
\hline DA30 & L15 & & \\
\hline DA31 & L16 & & \\
\hline DB0 & S3 & \multirow{11}{*}{I} & \multirow{11}{*}{DB 32-bit input data bus. Data can be latched in a 64-bit temporary register or loaded directly into an input register.} \\
\hline DB1 & P5 & & \\
\hline DB2 & R4 & & \\
\hline DB3 & T3 & & \\
\hline DB4 & S4 & & \\
\hline DB5 & P6 & & \\
\hline DB6 & R5 & & \\
\hline DB7 & T4 & & \\
\hline DB8 & T5 & & \\
\hline DB9 & P7 & & \\
\hline DB10 & R6 & & \\
\hline
\end{tabular}

\footnotetext{
\(\dagger\) Input, output, and high-impedance state.
}


Table 2. 'ACT8847 Pin Functional Description (Continued)
\begin{tabular}{|c|c|c|c|}
\hline NAME & NO. & I/O/Z \({ }^{\dagger}\) & DESCRIPTION \\
\hline \multicolumn{4}{|r|}{DATA BUS SIGNALS (96 PINS)} \\
\hline Y21 & A9 & & \\
\hline Y22 & B9 & & \\
\hline Y23 & C9 & & \\
\hline Y24 & A10 & & \\
\hline Y25 & B10 & & \\
\hline Y26 & C10 & I/O & 32-bit Y output data bus \\
\hline Y27 & A11 & & \\
\hline Y28 & B11 & & \\
\hline Y29 & A12 & & \\
\hline Y30 & C11 & & \\
\hline Y31 & B12 & & \\
\hline \multicolumn{4}{|r|}{PARITY AND MASTER/SLAVE SIGNALS (16 PINS)} \\
\hline BYTEP & J17 & I & When high, selects parity generation for each byte of input (four parity bits for each bus). When low, selects parity generation for whole 32-bit input (one parity bit for each bus). Even parity is used. \\
\hline MSERR & E17 & O & Master/slave error output pin \\
\hline PA0 & L17 & \multirow{4}{*}{I} & \multirow{4}{*}{Parity inputs for DA data} \\
\hline PA1 & K15 & & \\
\hline PA2 & K16 & & \\
\hline PA3 & K17 & & \\
\hline PB0 & S2 & \multirow{4}{*}{I} & \multirow{4}{*}{Parity inputs for DB data} \\
\hline PB1 & P4 & & \\
\hline PB2 & R3 & & \\
\hline PB3 & T2 & & \\
\hline PERRA & F15 & O & DA data parity error output. When high, signals a byte or word has failed an even parity check. \\
\hline PERRB & C1 & O & DB data parity error output. When high, signals a byte or word has failed an even parity check. \\
\hline PY0 & A13 & \multirow{4}{*}{I/O/Z} & \multirow{4}{*}{Y port parity data} \\
\hline PY1 & C12 & & \\
\hline PY2 & B13 & & \\
\hline PY3 & A14 & & \\
\hline \multicolumn{4}{|r|}{CLOCK, CONTROL, AND INSTRUCTION SIGNALS (46 PINS)} \\
\hline CLK & L3 & I & Master clock for all registers except C register \\
\hline CLKC & M3 & I & C register clock \\
\hline CLKMODE & N3 & I & Selects whether temporary register loads only on rising clock edge (CLKMODE = L) or on falling edge (CLKMODE = H). \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) Input, output, and high-impedance state.
}
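
The two parity modes selected by BYTEP can be illustrated with a short sketch. It assumes the usual meaning of even parity (the parity bit makes the total number of ones, data plus parity, even). The function names are hypothetical, and the ordering of the returned bits (least significant byte first) is an assumption, since the mapping of bytes to the PA0-PA3 and PB0-PB3 pins is not given in this excerpt.

```python
def even_parity_bit(value: int) -> int:
    """Return the bit that makes the total count of ones even."""
    return bin(value).count("1") & 1


def parity_bits(word: int, bytep: bool) -> list:
    """Even parity for a 32-bit bus word.

    bytep=True  -> one parity bit per byte (four bits, least significant
                   byte first; this ordering is an assumption).
    bytep=False -> a single parity bit for the whole 32-bit word.
    """
    word &= 0xFFFFFFFF
    if bytep:
        return [even_parity_bit((word >> (8 * i)) & 0xFF) for i in range(4)]
    return [even_parity_bit(word)]


if __name__ == "__main__":
    print(parity_bits(0xFF000001, bytep=True))
    print(parity_bits(0x00000003, bytep=False))
```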

Table 2. 'ACT8847 Pin Functional Description (Continued)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O/Z \({ }^{\dagger}\) & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline \multicolumn{4}{|r|}{CLOCK, CONTROL, AND INSTRUCTION SIGNALS (46 PINS)} \\
\hline CONFIG0 & N1 & \multirow{2}{*}{I} & \multirow{2}{*}{Select data sources for RA and RB registers from DA bus, DB bus and temporary register} \\
\hline CONFIG1 & N2 & & \\
\hline ENRA & M2 & I & When high, enables loading of RA register on a rising clock edge if the RA register is not disabled (see PIPES0 below). \\
\hline ENRB & M1 & I & When high, enables loading of RB register on a rising clock edge if the RB register is not disabled (see PIPES0 below). \\
\hline ENRC & P1 & I & When low, enables write to C register when CLKC goes high. \\
\hline FAST & E3 & I & When low, selects gradual underflow (IEEE model). When high, selects sudden underflow, forcing all denormalized inputs and outputs to zero. \\
\hline FLOWC & J15 & I & When high, causes product or sum to bypass C register, so that product or sum appears on the C register output bus. Timing is similar to P register or S register feedback operands. C register remains unchanged. Product or sum may also be simultaneously fed back in usual manner (not through C register). \\
\hline HALT & R2 & I & Stalls operation without altering contents of instruction or data registers (except the CREG, which has a separate write enable). Active low. \\
\hline I0 & E2 & & \\
\hline I1 & D1 & & \\
\hline I2 & E1 & & \\
\hline I3 & F2 & & \\
\hline I4 & G3 & & \\
\hline I5 & F1 & I & Instruction inputs \\
\hline I6 & G2 & & \\
\hline I7 & G1 & & \\
\hline I8 & H3 & & \\
\hline I9 & H2 & & \\
\hline I10 & H1 & & \\
\hline \(\overline{\mathrm{OEC}}\) & G15 & I & Comparison status output enable. Active low. \\
\hline \(\overline{\mathrm{OES}}\) & F17 & I & Exception status and other status output enable. Active low. \\
\hline \(\overline{\mathrm{OEY}}\) & F16 & I & Y bus output enable. Active low. \\
\hline PIPES0 & P2 & I & When low, enables instruction register and, depending on setting of ENRA and ENRB, the RA and RB input registers. When high, puts instruction, RA and RB registers in flowthrough mode. \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) Input, output, and high-impedance state.
}
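
The effect of the FAST input described in the table above can be sketched on single-precision bit patterns. This is a rough model only: the function name `flush_denormal_single` is hypothetical, and keeping the sign bit when a denormalized value is forced to zero is an assumption that this excerpt does not confirm.

```python
def flush_denormal_single(bits: int, fast: bool) -> int:
    """Model sudden underflow on a 32-bit IEEE single-precision pattern.

    With fast=True a denormalized value (zero exponent, nonzero fraction)
    is forced to zero. Assumption: the sign bit is preserved. With
    fast=False (gradual underflow) the pattern is returned unchanged.
    """
    bits &= 0xFFFFFFFF
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if fast and exponent == 0 and fraction != 0:
        return bits & 0x80000000  # keep only the sign bit (assumed behavior)
    return bits


if __name__ == "__main__":
    smallest_denormal = 0x00000001
    print(hex(flush_denormal_single(smallest_denormal, fast=True)))
    print(hex(flush_denormal_single(smallest_denormal, fast=False)))
```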


Table 2. 'ACT8847 Pin Functional Description (Continued)
\begin{tabular}{|c|c|c|c|}
\hline NAME & NO. & I/O/Z \({ }^{\dagger}\) & DESCRIPTION \\
\hline \multicolumn{4}{|r|}{STATUS SIGNALS (17 PINS)} \\
\hline AGTB & E15 & I/O/Z & Comparison status pin. When high, indicates that A operand is greater than B operand. \\
\hline CHEX & C17 & I/O/Z & Status pin indicating an exception during a chained function. If I6 is low, indicates the multiplier is the source of an exception. If I6 is high, indicates the ALU is the source of an exception. \\
\hline DENIN & C15 & I/O/Z & Status pin indicating a denormal input to the multiplier. When DENIN goes high, the STEX pins indicate which port had the denormal input. \\
\hline DENORM & B16 & I/O/Z & Status pin indicating a denormal output from the ALU or a wrapped output from the multiplier. In FAST mode, causes the result to go to zero when DENORM is high. \\
\hline DIVBY0 & B17 & I/O/Z & Status pin indicating an attempted operation involved dividing by zero \\
\hline ED & B1 & I/O/Z & Exception detect status signal representing logical OR of all enabled exceptions in the exception disable register \\
\hline INEX & C14 & I/O/Z & Status pin indicating an inexact output \\
\hline INF & A2 & I/O/Z & Status pin. When high, indicates output value is infinity. \\
\hline IVAL & A15 & I/O/Z & Status pin indicating that an invalid operation or a nonnumber (NaN) has been input to the multiplier or ALU. \\
\hline NEG & A16 & I/O/Z & Status pin. When high, indicates result has negative sign. \\
\hline OVER & B14 & I/O/Z & Status pin indicating that the result is greater than the largest allowable value for specified format (exponent overflow). \\
\hline SRCEX & C16 & I/O/Z & Status pin indicating source of exception, either ALU (SRCEX = L) or multiplier (SRCEX = H). \\
\hline STEX0 & D16 & \multirow{2}{*}{I/O/Z} & \multirow{2}{*}{Status pins indicating that a nonnumber (NaN) or denormal number has been input on A port (STEX1) or B port (STEX0).} \\
\hline STEX1 & D15 & & \\
\hline UNDER & C13 & I/O/Z & Status pin indicating that a result is inexact and less than minimum allowable value for format (exponent underflow). \\
\hline UNORD & D17 & I/O/Z & Comparison status pin indicating that the two inputs are unordered because at least one of them is a nonnumber (NaN). \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) Input, output, and high-impedance state.
}

Table 2. 'ACT8847 Pin Functional Description (Concluded)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|l|}{PIN} & I/O/Z \({ }^{\dagger}\) & DESCRIPTION \\
\hline NAME & NO. & & \\
\hline \multicolumn{4}{|r|}{SUPPLY AND N/C SIGNALS (33 PINS)} \\
\hline \(V_{CC}\) & D5 & \multirow{10}{*}{I} & \multirow{10}{*}{5-V supply voltage pins} \\
\hline \(V_{CC}\) & D8 & & \\
\hline \(V_{CC}\) & D11 & & \\
\hline \(V_{CC}\) & D14 & & \\
\hline \(V_{CC}\) & G4 & & \\
\hline \(V_{CC}\) & G14 & & \\
\hline \(V_{CC}\) & J4 & & \\
\hline \(V_{CC}\) & J14 & & \\
\hline \(V_{CC}\) & L4 & & \\
\hline \(V_{CC}\) & M14 & & \\
\hline GND & D4 & \multirow{17}{*}{I} & \multirow{17}{*}{Ground pins. NOTE: All ground pins should be used and connected.} \\
\hline GND & D6 & & \\
\hline GND & D7 & & \\
\hline GND & D9 & & \\
\hline GND & D10 & & \\
\hline GND & D12 & & \\
\hline GND & D13 & & \\
\hline GND & E4 & & \\
\hline GND & E14 & & \\
\hline GND & F4 & & \\
\hline GND & F14 & & \\
\hline GND & H4 & & \\
\hline GND & H14 & & \\
\hline GND & K4 & & \\
\hline GND & K14 & & \\
\hline GND & L14 & & \\
\hline GND & M4 & & \\
\hline NC & A17 & & \multirow{5}{*}{No internal connection. Pins should be left floating.} \\
\hline NC & S1 & & \\
\hline NC & T1 & & \\
\hline NC & T16 & & \\
\hline NC & T17 & & \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) Input, output, and high-impedance state.
}

\section*{'ACT8847 Specifications}

\section*{absolute maximum ratings over operating free-air temperature range (unless otherwise noted) \({ }^{\dagger}\)}

Supply voltage, \(V_{CC}\) . . . . . -0.5 V to 6 V
Input clamp current, \(I_{IK}\) (\(V_I < 0\) or \(V_I > V_{CC}\)) . . . . . \(\pm 20\) mA
Output clamp current, \(I_{OK}\) (\(V_O < 0\) or \(V_O > V_{CC}\)) . . . . . \(\pm 50\) mA
Continuous output current, \(I_O\) (\(V_O = 0\) to \(V_{CC}\)) . . . . . \(\pm 50\) mA
Continuous current through \(V_{CC}\) or GND pins . . . . . \(\pm 100\) mA
Operating free-air temperature range . . . . . \(0^{\circ}\mathrm{C}\) to \(70^{\circ}\mathrm{C}\)
Storage temperature range . . . . . \(-65^{\circ}\mathrm{C}\) to \(150^{\circ}\mathrm{C}\)
†Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

\section*{recommended operating conditions}
\begin{tabular}{|l|l|r|r|r|c|}
\hline \multicolumn{2}{|c|}{\multirow{2}{*}{PARAMETER}} & \multicolumn{3}{c|}{SN74ACT8847} & \multirow{2}{*}{UNIT} \\
\cline{3-5} \multicolumn{2}{|c|}{} & MIN & NOM & MAX & \\
\hline \(V_{CC}\) & Supply voltage & 4.75 & 5.0 & 5.25 & V \\
\hline \(V_{IH}\) & High-level input voltage & 2 & & \(V_{CC}\) & V \\
\hline \(V_{IL}\) & Low-level input voltage & 0 & & 0.8 & V \\
\hline \(I_{OH}\) & High-level output current & & & -8 & mA \\
\hline \(I_{OL}\) & Low-level output current & & & 8 & mA \\
\hline \(V_I\) & Input voltage & 0 & & \(V_{CC}\) & V \\
\hline \(V_O\) & Output voltage & 0 & & \(V_{CC}\) & V \\
\hline dt/dv & Input transition rise or fall rate & 0 & & 15 & ns/V \\
\hline \(T_A\) & Operating free-air temperature & 0 & & 70 & \({ }^{\circ}\mathrm{C}\) \\
\hline
\end{tabular}

\section*{electrical characteristics over recommended operating free-air temperature range (unless otherwise noted)}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[b]{2}{*}{PARAMETER} & \multirow[b]{2}{*}{TEST CONDITIONS} & \multirow[b]{2}{*}{\(V_{CC}\)} & \multicolumn{2}{|l|}{\(T_A = 25^{\circ}\mathrm{C}\)} & \multicolumn{2}{|l|}{SN74ACT8847} & \multirow[b]{2}{*}{UNIT} \\
\hline & & & MIN & TYP MAX & MIN & TYP MAX & \\
\hline \multirow{4}{*}{\(V_{OH}\)} & \multirow[b]{2}{*}{\(I_{OH} = -20\ \mu\mathrm{A}\)} & 4.75 V & & 4.74 & 4.55 & & \multirow{4}{*}{V} \\
\hline & & 5.25 V & & 5.24 & 5.05 & & \\
\hline & \multirow[b]{2}{*}{\(I_{OH} = -8\ \mathrm{mA}\)} & 4.75 V & & & 3.7 & & \\
\hline & & 5.25 V & & & 4.7 & & \\
\hline \multirow{4}{*}{\(V_{OL}\)} & \multirow[b]{2}{*}{\(I_{OL} = 20\ \mu\mathrm{A}\)} & 4.75 V & & 0.01 & & 0.10 & \multirow{4}{*}{V} \\
\hline & & 5.25 V & & 0.01 & & 0.10 & \\
\hline & \multirow[b]{2}{*}{\(I_{OL} = 8\ \mathrm{mA}\)} & 4.75 V & & & & 0.45 & \\
\hline & & 5.25 V & & & & 0.45 & \\
\hline \(I_I\) & \(V_I = V_{CC}\) or 0 & 5.25 V & & & & \(\pm 5\) & \(\mu\mathrm{A}\) \\
\hline \(I_{OZ}\) & \(V_I = V_{CC}\) or 0, \(I_O\) & 5.25 V & & & & \(\pm 10\) & \(\mu\mathrm{A}\) \\
\hline \(I_{CC}\) & \(V_I = V_{CC}\) or 0, \(I_O\) & 5.25 V & & & & 200 & \(\mu\mathrm{A}\) \\
\hline \(C_i\) & \(V_I = V_{CC}\) or 0 & 5 V & & & & 10 & pF \\
\hline
\end{tabular}

\section*{switching characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multirow[t]{2}{*}{PARAMETER} & \multirow[t]{2}{*}{FROM (INPUT)} & \multirow[t]{2}{*}{TO (OUTPUT)} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|l|}{SN74ACT8847-30} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & & MIN & MAX & \\
\hline 1 & \(t_{\mathrm{pd1}}\) & DA/DB/Inst & Y OUTPUT & 111 & & \(\dagger\) & ns \\
\hline \multirow[b]{2}{*}{2} & \multirow[b]{2}{*}{\(t_{\mathrm{pd2}}\)} & INPUT REG & Y OUTPUT & 110 & & 70 & \multirow[b]{2}{*}{ns} \\
\hline & & INPUT REG & STATUS & 110 & & 70 & \\
\hline \multirow[b]{2}{*}{3} & \multirow[b]{2}{*}{\(t_{\mathrm{pd3}}\)} & PIPELN REG & Y OUTPUT & 10X & & 48 & \multirow[b]{2}{*}{ns} \\
\hline & & PIPELN REG & STATUS & 10X & & 48 & \\
\hline \multirow[t]{2}{*}{4} & \multirow[b]{2}{*}{\(t_{\mathrm{pd4}}\)} & OUTPUT REG & Y OUTPUT & 0XX & & 20 & \multirow[b]{2}{*}{ns} \\
\hline & & OUTPUT REG & STATUS & 0XX & & 20 & \\
\hline 5 & \(t_{\mathrm{pd5}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT & XXX & & 18 & ns \\
\hline 6 & \(t_{\mathrm{pd6}}\) & CLK \(\uparrow\) & Y OUTPUT INVALID & all but 111 & 3.0 & & ns \\
\hline 7 & \(t_{\mathrm{pd7}}\) & CLK \(\uparrow\) & STATUS INVALID & all but 111 & 3.0 & & ns \\
\hline 8 & \(t_{\mathrm{pd8}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT INVALID & XXX & 1.5 & & ns \\
\hline 9 & \(t_{\mathrm{d1}}{ }^{\ddagger}\) & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 & 56 & & \multirow[b]{3}{*}{ns} \\
\hline 10 & \(t_{\mathrm{d2}}{ }^{\ddagger}\) & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 & 30 & & \\
\hline 11 & \(t_{\mathrm{d3}}\) & \multicolumn{3}{|l|}{Delay time, CLKC after CLK to insure data captured in C register is data clocked into sum or product register by that clock. (PIPES2-PIPES0 \(=0\mathrm{XX}\))} & 12 & \(t_{\mathrm{d}}-0^{\S}\) & \\
\hline 12 & \(t_{\mathrm{en1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 12 & \multirow{4}{*}{ns} \\
\hline 13 & \(t_{\mathrm{en2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 12 & \\
\hline 14 & \(t_{\mathrm{dis1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 12 & \\
\hline 15 & \(t_{\mathrm{dis2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 12 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) This parameter no longer tested and will be deleted on next Data Manual revision.
\(\ddagger\) Minimum clock cycle period not guaranteed when operands are fed back using FLOWC to bypass the C register and operands are used on the same clock cycle.
\({ }^{\S} t_{\mathrm{d}}\) is the clock cycle period.

\section*{setup and hold times}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multicolumn{2}{|r|}{\multirow[t]{2}{*}{PARAMETER}} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|l|}{SN74ACT8847-30} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & MIN & MAX & \\
\hline 16 & \(t_{\mathrm{su1}}\) & Inst/control before CLK \(\uparrow\) & XX0 & 12 & & \multirow{6}{*}{ns} \\
\hline 17 & \(t_{\mathrm{su2}}\) & DA/DB before CLK \(\uparrow\) & XX0 & 11 & & \\
\hline 18 & \(t_{\mathrm{su3}}\) & DA/DB before 2nd CLK \(\uparrow\) (DP) & XX1 & 40 & & \\
\hline 19 & \(t_{\mathrm{su4}}\) & CONFIG1-0 before CLK \(\uparrow\) & XX0 & 12 & & \\
\hline 20 & \(t_{\mathrm{su5}}\) & SRCC before CLKC \(\uparrow\) & XXX & 10 & & \\
\hline 21 & \(t_{\mathrm{su6}}\) & \(\overline{\mathrm{RESET}}\) before CLK \(\uparrow\) & XX0 & 12 & & \\
\hline 22 & \(t_{\mathrm{h1}}\) & Inst/control after CLK \(\uparrow\) & XXX & 1 & & \multirow{4}{*}{ns} \\
\hline 23 & \(t_{\mathrm{h2}}\) & DA/DB after CLK \(\uparrow\) & XXX & 1 & & \\
\hline 24 & \(t_{\mathrm{h3}}\) & SRCC after CLKC \(\uparrow\) & XXX & 1 & & \\
\hline 25 & \(t_{\mathrm{h4}}\) & \(\overline{\mathrm{RESET}}\) after CLK \(\uparrow\) & XX0 & 6 & & \\
\hline
\end{tabular}

\section*{CLK/RESET requirements}
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{\multirow[b]{2}{*}{PARAMETER}} & \multicolumn{2}{|l|}{SN74ACT8847-30} & \multirow[b]{2}{*}{UNIT} \\
\hline & & & MIN & MAX & \\
\hline \multirow{3}{*}{\(t_{\mathrm{w}}\)} & \multirow{3}{*}{Pulse duration} & CLK high & 10 & & \multirow{3}{*}{ns} \\
\hline & & CLK low & 10 & & \\
\hline & & \(\overline{\mathrm{RESET}}\) & 10 & & \\
\hline
\end{tabular}

\section*{switching characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multirow[t]{2}{*}{PARAMETER} & \multirow[t]{2}{*}{FROM (INPUT)} & \multirow[t]{2}{*}{TO (OUTPUT)} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|l|}{SN74ACT8847-40} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & & MIN & MAX & \\
\hline 1 & \(t_{\mathrm{pd1}}\) & DA/DB/Inst & Y OUTPUT & 111 & & \(\dagger\) & ns \\
\hline \multirow[b]{2}{*}{2} & \multirow[b]{2}{*}{\(t_{\mathrm{pd2}}\)} & INPUT REG & Y OUTPUT & 110 & & 90 & \multirow[b]{2}{*}{ns} \\
\hline & & INPUT REG & STATUS & 110 & & 90 & \\
\hline \multirow[b]{2}{*}{3} & \multirow[b]{2}{*}{\(t_{\mathrm{pd3}}\)} & PIPELN REG & Y OUTPUT & 10X & & 60 & \multirow[b]{2}{*}{ns} \\
\hline & & PIPELN REG & STATUS & 10X & & 60 & \\
\hline \multirow[t]{2}{*}{4} & \multirow[b]{2}{*}{\(t_{\mathrm{pd4}}\)} & OUTPUT REG & Y OUTPUT & 0XX & & 24 & \multirow[t]{2}{*}{ns} \\
\hline & & OUTPUT REG & STATUS & 0XX & & 24 & \\
\hline 5 & \(t_{\mathrm{pd5}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT & XXX & & 20 & ns \\
\hline 6 & \(t_{\mathrm{pd6}}\) & CLK \(\uparrow\) & Y OUTPUT INVALID & all but 111 & 3.0 & & ns \\
\hline 7 & \(t_{\mathrm{pd7}}\) & CLK \(\uparrow\) & STATUS INVALID & all but 111 & 3.0 & & ns \\
\hline 8 & \(t_{\mathrm{pd8}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT INVALID & XXX & 1.5 & & ns \\
\hline 9 & \(t_{\mathrm{d1}}{ }^{\ddagger}\) & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 & 72 & & \multirow[b]{3}{*}{ns} \\
\hline 10 & \(t_{\mathrm{d2}}{ }^{\ddagger}\) & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 & 40 & & \\
\hline 11 & \(t_{\mathrm{d3}}\) & \multicolumn{3}{|l|}{Delay time, CLKC after CLK to insure data captured in C register is data clocked into sum or product register by that clock. (PIPES2-PIPES0 \(=0\mathrm{XX}\))} & 16 & \(t_{\mathrm{d}}-0^{\S}\) & \\
\hline 12 & \(t_{\mathrm{en1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 16 & \multirow{4}{*}{ns} \\
\hline 13 & \(t_{\mathrm{en2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 16 & \\
\hline 14 & \(t_{\mathrm{dis1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 16 & \\
\hline 15 & \(t_{\mathrm{dis2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 16 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) This parameter no longer tested and will be deleted on next Data Manual revision.
\(\ddagger\) Minimum clock cycle period not guaranteed when operands are fed back using FLOWC to bypass the \(C\) register and operands are used on the same cycle.
\({ }^{\S} t_{\mathrm{d}}\) is the clock cycle period.

\section*{setup and hold times}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multicolumn{2}{|r|}{\multirow[t]{2}{*}{PARAMETER}} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|l|}{SN74ACT8847-40} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & MIN & MAX & \\
\hline 16 & \(t_{\mathrm{su1}}\) & Inst/control before CLK \(\uparrow\) & XX0 & 14 & & \multirow{6}{*}{ns} \\
\hline 17 & \(t_{\mathrm{su2}}\) & DA/DB before CLK \(\uparrow\) & XX0 & 13 & & \\
\hline 18 & \(t_{\mathrm{su3}}\) & DA/DB before 2nd CLK \(\uparrow\) (DP) & XX1 & 52 & & \\
\hline 19 & \(t_{\mathrm{su4}}\) & CONFIG1-0 before CLK \(\uparrow\) & XX0 & 14 & & \\
\hline 20 & \(t_{\mathrm{su5}}\) & SRCC before CLKC \(\uparrow\) & XXX & 14 & & \\
\hline 21 & \(t_{\mathrm{su6}}\) & \(\overline{\mathrm{RESET}}\) before CLK \(\uparrow\) & XX0 & 14 & & \\
\hline 22 & \(t_{\mathrm{h1}}\) & Inst/control after CLK \(\uparrow\) & XXX & 3 & & \multirow{4}{*}{ns} \\
\hline 23 & \(t_{\mathrm{h2}}\) & DA/DB after CLK \(\uparrow\) & XXX & 3 & & \\
\hline 24 & \(t_{\mathrm{h3}}\) & SRCC after CLKC \(\uparrow\) & XXX & 3 & & \\
\hline 25 & \(t_{\mathrm{h4}}\) & \(\overline{\mathrm{RESET}}\) after CLK \(\uparrow\) & XX0 & 6 & & \\
\hline
\end{tabular}

\section*{CLK/RESET requirements}
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{\multirow[b]{2}{*}{PARAMETER}} & \multicolumn{2}{|l|}{SN74ACT8847-40} & \multirow[b]{2}{*}{UNIT} \\
\hline & & & MIN & MAX & \\
\hline \multirow{3}{*}{\(t_{\mathrm{w}}\)} & \multirow{3}{*}{Pulse duration} & CLK high & 15 & & \multirow{3}{*}{ns} \\
\hline & & CLK low & 15 & & \\
\hline & & \(\overline{\mathrm{RESET}}\) & 12 & & \\
\hline
\end{tabular}

\section*{switching characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multirow[t]{2}{*}{PARAMETER} & \multirow[t]{2}{*}{FROM (INPUT)} & \multirow[t]{2}{*}{TO (OUTPUT)} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|l|}{SN74ACT8847-50} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & & MIN & MAX & \\
\hline 1 & \(t_{\mathrm{pd1}}\) & DA/DB/Inst & Y OUTPUT & 111 & & \(\dagger\) & ns \\
\hline \multirow[b]{2}{*}{2} & \multirow[b]{2}{*}{\(t_{\mathrm{pd2}}\)} & INPUT REG & Y OUTPUT & 110 & & 120 & \multirow[b]{2}{*}{ns} \\
\hline & & INPUT REG & STATUS & 110 & & 120 & \\
\hline \multirow[b]{2}{*}{3} & \multirow[b]{2}{*}{\(t_{\mathrm{pd3}}\)} & PIPELN REG & Y OUTPUT & 10X & & 75 & \multirow[b]{2}{*}{ns} \\
\hline & & PIPELN REG & STATUS & 10X & & 75 & \\
\hline \multirow[t]{2}{*}{4} & \multirow[b]{2}{*}{\(t_{\mathrm{pd4}}\)} & OUTPUT REG & Y OUTPUT & 0XX & & 36 & \multirow[t]{2}{*}{ns} \\
\hline & & OUTPUT REG & STATUS & 0XX & & 36 & \\
\hline 5 & \(t_{\mathrm{pd5}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT & XXX & & 24 & ns \\
\hline 6 & \(t_{\mathrm{pd6}}\) & CLK \(\uparrow\) & Y OUTPUT INVALID & all but 111 & 3.0 & & ns \\
\hline 7 & \(t_{\mathrm{pd7}}\) & CLK \(\uparrow\) & STATUS INVALID & all but 111 & 3.0 & & ns \\
\hline 8 & \(t_{\mathrm{pd8}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT INVALID & XXX & 1.5 & & ns \\
\hline 9 & \(t_{\mathrm{d1}}{ }^{\ddagger}\) & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 & 100 & & \multirow[b]{3}{*}{ns} \\
\hline 10 & \(t_{\mathrm{d2}}{ }^{\ddagger}\) & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 & 50 & & \\
\hline 11 & \(t_{\mathrm{d3}}\) & \multicolumn{3}{|l|}{Delay time, CLKC after CLK to insure data captured in C register is data clocked into sum or product register by that clock. (PIPES2-PIPES0 \(=0\mathrm{XX}\))} & 16 & \(t_{\mathrm{d}}-0^{\S}\) & \\
\hline 12 & \(t_{\mathrm{en1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 20 & \multirow{4}{*}{ns} \\
\hline 13 & \(t_{\mathrm{en2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 20 & \\
\hline 14 & \(t_{\mathrm{dis1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 20 & \\
\hline 15 & \(t_{\mathrm{dis2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 20 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) This parameter no longer tested and will be deleted on next Data Manual revision.
\(\ddagger\) Minimum clock cycle period not guaranteed when operands are fed back using FLOWC to bypass the C register and operands are used on the same cycle.
\({ }^{\S} t_{\mathrm{d}}\) is the clock cycle period.

\section*{setup and hold times}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multicolumn{2}{|r|}{\multirow[t]{2}{*}{PARAMETER}} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|l|}{SN74ACT8847-50} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & MIN & MAX & \\
\hline 16 & \(t_{\mathrm{su1}}\) & Inst/control before CLK \(\uparrow\) & XX0 & 16 & & \multirow{6}{*}{ns} \\
\hline 17 & \(t_{\mathrm{su2}}\) & DA/DB before CLK \(\uparrow\) & XX0 & 16 & & \\
\hline 18 & \(t_{\mathrm{su3}}\) & DA/DB before 2nd CLK \(\uparrow\) (DP) & XX1 & 75 & & \\
\hline 19 & \(t_{\mathrm{su4}}\) & CONFIG1-0 before CLK \(\uparrow\) & XX0 & 18 & & \\
\hline 20 & \(t_{\mathrm{su5}}\) & SRCC before CLKC \(\uparrow\) & XXX & 16 & & \\
\hline 21 & \(t_{\mathrm{su6}}\) & \(\overline{\mathrm{RESET}}\) before CLK \(\uparrow\) & XX0 & 16 & & \\
\hline 22 & \(t_{\mathrm{h1}}\) & Inst/control after CLK \(\uparrow\) & XXX & 3 & & \multirow{4}{*}{ns} \\
\hline 23 & \(t_{\mathrm{h2}}\) & DA/DB after CLK \(\uparrow\) & XXX & 3 & & \\
\hline 24 & \(t_{\mathrm{h3}}\) & SRCC after CLKC \(\uparrow\) & XXX & 3 & & \\
\hline 25 & \(t_{\mathrm{h4}}\) & \(\overline{\mathrm{RESET}}\) after CLK \(\uparrow\) & XX0 & 6 & & \\
\hline
\end{tabular}

\section*{CLK/RESET requirements}
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{\multirow[b]{2}{*}{PARAMETER}} & \multicolumn{2}{|l|}{SN74ACT8847-50} & \multirow[b]{2}{*}{UNIT} \\
\hline & & & MIN & MAX & \\
\hline \multirow{3}{*}{\(t_{w}\)} & \multirow{3}{*}{Pulse duration} & CLK high & 15 & & \multirow{3}{*}{ns} \\
\hline & & CLK low & 15 & & \\
\hline & & \(\overline{\mathrm{RESET}}\) & 15 & & \\
\hline
\end{tabular}
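
The minimum clock cycle periods listed above translate directly into maximum clock rates. The following Python sketch tabulates \(t_{\mathrm{d1}}\) and \(t_{\mathrm{d2}}\) for the three speed grades, as given in the switching characteristics tables, and converts them to frequencies. The dictionary and function names are illustrative only and are not part of any TI tool. Note that, per the \(\ddagger\) footnote, these minimum periods are not guaranteed when operands are fed back through FLOWC and used on the same cycle.

```python
# Minimum clock cycle periods (ns) copied from the switching characteristics
# tables. t_d1 applies to PIPES2-PIPES0 = 010, t_d2 to PIPES2-PIPES0 = 000.
MIN_CLOCK_PERIOD_NS = {
    "SN74ACT8847-30": {"t_d1": 56, "t_d2": 30},
    "SN74ACT8847-40": {"t_d1": 72, "t_d2": 40},
    "SN74ACT8847-50": {"t_d1": 100, "t_d2": 50},
}


def max_clock_mhz(period_ns: float) -> float:
    """Convert a minimum clock period in ns to a maximum clock rate in MHz."""
    if period_ns <= 0:
        raise ValueError("clock period must be positive")
    return 1000.0 / period_ns


if __name__ == "__main__":
    for grade, periods in MIN_CLOCK_PERIOD_NS.items():
        for name, ns in periods.items():
            print(f"{grade} {name}: {ns} ns -> {max_clock_mhz(ns):.1f} MHz")
```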

\section*{'ACT8847 Load Circuit}

The load circuit for the 'ACT8847 is shown in Figure 1.

\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multicolumn{2}{|l|}{TIMING PARAMETER} & \(C_{L}{ }^{\dagger}\) & \(\mathrm{I}_{\mathrm{OL}}\) & \(\mathrm{I}_{\mathrm{OH}}\) & \(\mathrm{V}_{\mathrm{L}}\) & S1 \\
\hline \multirow[b]{2}{*}{\(t_{\mathrm{en}}\)} & \(t_{\mathrm{PZH}}\) & \multirow[t]{2}{*}{50 pF} & \multirow[t]{2}{*}{1 mA} & \multirow[t]{2}{*}{\(-1\) mA} & \multirow[t]{2}{*}{1.5 V} & \multirow[t]{2}{*}{CLOSED} \\
\hline & \(t_{\mathrm{PZL}}\) & & & & & \\
\hline \multirow[b]{2}{*}{\(t_{\mathrm{dis}}\)} & \(t_{\mathrm{PHZ}}\) & \multirow[t]{2}{*}{50 pF} & \multirow[t]{2}{*}{16 mA} & \multirow[t]{2}{*}{\(-16\) mA} & \multirow[t]{2}{*}{1.5 V} & \multirow[t]{2}{*}{CLOSED} \\
\hline & \(t_{\mathrm{PLZ}}\) & & & & & \\
\hline \multicolumn{2}{|l|}{\(t_{\mathrm{pd}}\)} & 50 pF & - & - & - & OPEN \\
\hline
\end{tabular}
\({ }^{\dagger} C_{L}\) includes probe and test fixture capacitance.
NOTE: All input pulses are supplied by generators having the following characteristics: \(P R R \leq 1 \mathrm{MHz}, \mathrm{Z}_{\mathrm{O}}=50 \Omega, \mathrm{t}_{\mathrm{r}} \leq 6 \mathrm{~ns}, \mathrm{t}_{\mathrm{f}} \leq 6 \mathrm{~ns}\).

Figure 1. Load Circuit



NOTES: Assume the following mixed precision operation.
Single precision OP0 + OP1 \(=\) RA + RB \(\rightarrow\) SUM1 \(\rightarrow\) CREG, where OP0 is SP and OP1 is SP.
Mixed precision OP3 * OP2 \(=\) RA * RB \(\rightarrow\) PRODUCT1, where OP3 is SP and OP2 is DP.
NOP (must be inserted).
Mixed precision (OP3 * OP2) + (OP0 + OP1) \(=\) PREG + CREG \(\rightarrow\) SUM2 (DP), and then convert to SP.
Assume valid control signals for FAST, \(\overline{\mathrm{HALT}}=1\), PIPES2-0 \(=000\) (fully pipelined mode), \(\overline{\mathrm{RESET}}=1\), RND1-0, SELST1-0 \(=11\), TP1-0 \(=11\).
Figure 2a. Timing Diagram for: SP ALU \(\rightarrow\) DP MULT \(\rightarrow\) DP ALU \(\rightarrow\) Convert DP to SP


NOTES: Assume the following mixed precision operation.
Single precision OP0 + OP1 \(=\) RA + RB \(\rightarrow\) SUM1 \(\rightarrow\) CREG, where OP0 is SP and OP1 is SP.
Mixed precision OP3 * OP2 \(=\) RA * RB \(\rightarrow\) PRODUCT1, where OP3 is SP and OP2 is DP.
NOP (must be inserted).
Mixed precision (OP3 * OP2) + (OP0 + OP1) \(=\) PREG + CREG \(\rightarrow\) SUM2 (DP), and then convert to SP.
Assume valid control signals for FAST, \(\overline{\mathrm{HALT}}=1\), PIPES2-0 \(=000\) (fully pipelined mode), \(\overline{\mathrm{RESET}}=1\), RND1-0, SELST1-0 \(=11\), TP1-0 \(=11\).
Figure 2b. Timing Diagram for: SP ALU \(\rightarrow\) DP MULT \(\rightarrow\) DP ALU \(\rightarrow\) Convert DP to SP



NOTES: Assume the following double precision operation.
OP0 + OP1 \(=\) RA + RB \(\rightarrow\) SUM1 \(\rightarrow\) CREG
(OP0 + OP1) * OP2 \(=\) SREG * RB \(\rightarrow\) PRODUCT1
[(OP0 + OP1) * OP2] + (OP0 + OP1) \(=\) PREG + CREG \(\rightarrow\) SUM2
Assume valid control signals for FAST, \(\overline{\mathrm{HALT}}=1\), PIPES2-0 \(=000\) (fully pipelined mode), \(\overline{\mathrm{RESET}}=1\), RND1-0, SELST1-0 \(=11\), TP1-0 \(=11\).
Figure 3a. Timing Diagram for: DP ALU \(\rightarrow\) DP MULT \(\rightarrow\) DP ALU


NOTES: Assume the following double precision operation.
OP0 + OP1 \(=\) RA + RB \(\rightarrow\) SUM1 \(\rightarrow\) CREG
(OP0 + OP1) * OP2 \(=\) SREG * RB \(\rightarrow\) PRODUCT1
[(OP0 + OP1) * OP2] + (OP0 + OP1) \(=\) PREG + CREG \(\rightarrow\) SUM2
Assume valid control signals for FAST, \(\overline{\mathrm{HALT}}=1\), PIPES2-0 \(=000\) (fully pipelined mode), \(\overline{\mathrm{RESET}}=1\), RND1-0, SELST1-0 \(=11\), TP1-0 \(=11\).
Figure 3b. Timing Diagram for: DP ALU \(\rightarrow\) DP MULT \(\rightarrow\) DP ALU



NOTES: Assume the following single precision operations.
(K * OP0) + OP1 \(=\) PRODUCT1 + OP1 \(\rightarrow\) SUM1
(K * OP2) + OP3 \(=\) PRODUCT2 + OP3 \(\rightarrow\) SUM2
(K * OP4) + OP5 \(=\) PRODUCT3 + OP5 \(\rightarrow\) SUM3
(K * OP6) + OP7 \(=\) PRODUCT4 + OP7 \(\rightarrow\) SUM4
Assume valid control signals for FAST, \(\overline{\mathrm{HALT}}=1\), PIPES2-0 \(=010\), \(\overline{\mathrm{RESET}}=1\), RND1-0, SELST1-0 \(=11\), TP1-0 \(=11\).

Figure 4a. Timing Diagram for: SP [(Scalar * Vector) + Vector]


NOTES: Assume the following single precision operations.
(K * OP0) + OP1 \(=\) PRODUCT1 + OP1 \(\rightarrow\) SUM1
(K * OP2) + OP3 \(=\) PRODUCT2 + OP3 \(\rightarrow\) SUM2
(K * OP4) + OP5 \(=\) PRODUCT3 + OP5 \(\rightarrow\) SUM3
(K * OP6) + OP7 \(=\) PRODUCT4 + OP7 \(\rightarrow\) SUM4
Assume valid control signals for FAST, \(\overline{\mathrm{HALT}}=1\), PIPES2-0 \(=010\), \(\overline{\mathrm{RESET}}=1\), RND1-0, SELST1-0 \(=11\), TP1-0 \(=11\).

Figure 4b. Timing Diagram for: SP [(Scalar * Vector) + Vector]

\title{
SN74ACT8847 64-Bit Floating Point Unit
}

\section*{Introduction}

Designing with the SN74ACT8847 floating point unit (FPU) requires a thorough understanding of computer architectures, microprogramming, and IEEE floating point arithmetic, as well as a detailed knowledge of the 'ACT8847 itself. This introduction presents a brief overview of the 'ACT8847 and discusses a number of issues that arise when designing and programming with this FPU.

\section*{Major Architectural Features}

The overall architecture for a floating point system is determined by a combination of design factors. The principal consideration is the set of performance targets that the floating point processor has to achieve, usually expressed in terms of clock cycle period, operating mode (vector or scalar), and operand precision (32 bit, 64 bit, or other). Of almost equal importance are design constraints of cost, complexity, chip count, power consumption, and requirements for interfacing to other processors.

The architecture of the 'ACT8847 is optimized to satisfy several processing and interface requirements. The FPU has two 32-bit input buses, the DA and DB data buses, and one 32-bit output bus, the Y bus. This three-port design provides much greater I/O bus bandwidth than can be achieved by a single-port device (one 32-bit I/O bus). Two single-precision inputs can be simultaneously loaded on the input buses while a result is being output on the Y bus.

Internally, the 'ACT8847 FPU consists of two main functional blocks: the multiplier and the ALU (see Figure 5). Either the multiplier or the ALU can operate independently, or the two functional units can be used simultaneously in "chained" mode. When operating independently, each block of the FPU performs a separate set of arithmetic or logical functions. The multiplier supports multiplication, division, and square roots. The ALU supports addition, subtraction, format conversions, logical operations, and shifts. Integer division and integer square root require both the multiplier and the ALU; the final result comes from the ALU.

In chained mode, a multiplier operation executes in parallel with an ALU operation. Possible examples include calculations of a sum of products (multiply and accumulate) or a product of sums (add and then multiply). The sum of products computation requires a total of four operands: two new inputs to be multiplied, the sum of previous products, and the current product to be added to the sum, as shown in Table 3.


Figure 5. High Level Block Diagram
Table 3. Sum of Products Calculation
\begin{tabular}{|c|c|}
\hline MULTIPLIER OPERATION & ALU OPERATION \\
\hline \(\mathrm{A} * \mathrm{~B}\) & - \\
\hline \(\mathrm{C} * \mathrm{D}\) & \((\mathrm{A} * \mathrm{~B})+0\) \\
\hline \(\mathrm{E} * \mathrm{~F}\) & \((\mathrm{C} * \mathrm{D})+(\mathrm{A} * \mathrm{~B})\) \\
\hline\(\bullet\) & \(\bullet\) \\
\(\bullet\) & \(\bullet\) \\
\hline
\end{tabular}

Because the 'ACT8847 has multiple internal data paths and data registers, this sum of products can be generated by simultaneous operations on new bus data and internal feedback, without the necessity of storing either the previous accumulation or the current product off chip. Data flow for the sum of products calculation is shown in Figure 6.
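
The schedule in Table 3 can be illustrated with a short behavioral sketch in Python. This is not a model of the device's timing or rounding; it only mirrors the ordering in which the multiplier forms a new product while the ALU adds the previous product to the running sum. The function name `sum_of_products` and the returned schedule structure are hypothetical names chosen for illustration.

```python
def sum_of_products(pairs):
    """Mirror the chained-mode schedule of Table 3 using plain Python floats.

    On each step the multiplier forms a new product while the ALU adds the
    product from the previous step to the running sum held on chip.
    Returns the final sum and a list of (multiplier operands, ALU addend).
    """
    accumulator = 0.0   # running sum fed back to the ALU
    pending = None      # product formed on the previous step
    schedule = []       # one (multiplier op, ALU op) entry per step
    for a, b in pairs:
        if pending is not None:
            accumulator += pending       # ALU: previous product + running sum
        schedule.append(((a, b), pending))
        pending = a * b                  # multiplier: new product
    if pending is not None:
        accumulator += pending           # final ALU step drains the last product
        schedule.append((None, pending))
    return accumulator, schedule


if __name__ == "__main__":
    total, steps = sum_of_products([(1, 2), (3, 4), (5, 6)])
    print(total)
    for multiplier_op, alu_addend in steps:
        print(multiplier_op, alu_addend)
```

In the first step the ALU has nothing to add, which corresponds to the "-" entry in the first row of Table 3.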


Figure 6. Multiply/Accumulate Operation

\section*{Data Flow in Pipelined Architectures}

Several levels of internal data registers are available to segment the internal data paths of the 'ACT8847. The most basic choice is whether to use the device in flowthrough mode (with no internal registers enabled) or whether to enable one or more registers. When none of the internal registers are enabled, the paths through the multiplier and the ALU are not segmented. In this case, the delay from data input to result output is the longest.

Enabling one or more registers divides the data paths so that data can be clocked into internal registers, instead of from an external source to an external destination. Enabling the input registers permits data and instruction inputs to be registered on chip. Also, the hardware division and square root operations which the 'ACT8847 performs require that the input registers be enabled.

In the main data paths, three sets of internal registers are available in the 'ACT8847: input registers, pipeline registers in the multiplier and ALU logic blocks, and output registers to capture results from the multiplier and the ALU. When all three levels of data registers are enabled, the register-to-register delay inside the device is minimized. This is the fastest operating mode, and in this configuration the 'ACT8847 is said to be "fully pipelined." While one instruction is executing, the next instruction along with its associated operands may be input to the device so that overlapped operations occur (see Figure 7).

The selection of operating mode, from flowthrough to fully pipelined, determines the latency from input to output, that is, the number of clock cycles required for inputs to be processed and results to appear. For each register level enabled in the data path, one clock cycle is added to the latency from input to output.
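
The latency rule above can be sketched in Python. The bit assignment used here is an assumption inferred from the PIPELINE CONTROLS column of the timing tables (111 for the unregistered input-to-output path, 110 for paths from the input register, 10X for the pipeline register, 0XX for the output register): a 0 in PIPES0, PIPES1, or PIPES2 is taken to enable the input, pipeline, or output registers respectively. The function names are hypothetical.

```python
def enabled_registers(pipes: int) -> list[str]:
    """Return the register levels enabled by a 3-bit PIPES2-PIPES0 value.

    Assumption (inferred from the timing tables, not stated in this section):
    a 0 in a bit position enables the corresponding register level, with
    PIPES0 -> input, PIPES1 -> pipeline, PIPES2 -> output registers.
    """
    if not 0 <= pipes <= 0b111:
        raise ValueError("PIPES2-PIPES0 is a 3-bit field")
    levels = ["input", "pipeline", "output"]  # bit 0, bit 1, bit 2
    return [name for bit, name in enumerate(levels) if not (pipes >> bit) & 1]


def latency_cycles(pipes: int) -> int:
    """One clock cycle of latency is added per enabled register level."""
    return len(enabled_registers(pipes))


if __name__ == "__main__":
    for pipes in (0b111, 0b110, 0b010, 0b000):
        print(f"{pipes:03b}", enabled_registers(pipes), latency_cycles(pipes))
```

Under this assumption, 111 corresponds to flowthrough mode and 000 to the fully pipelined mode.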


Figure 7. Example of Fully Pipelined Operation

\section*{Control Architectures for High-Speed Microprogrammed Architectures}

A separate control circuit is required to sequence the operation of the 'ACT8847. A sequencer function within the control circuit controls both the sequencer and FPU as determined by FPU status outputs. Either a standard microsequencer such as the SN74ACT8818 or a custom controller such as a PLA or gate array can be used to control the FPU. Figure 8 shows an example block diagram for a PLA control circuit.

If a standard microsequencer is used, execution addresses for routines stored in the microprogram memory are generated by the microsequencer. As its name implies, microprogram memory stores the sequences of microinstructions which control FPU execution. The 'ACT8847 can be programmed by generating all control bits in a given microinstruction to select an FPU operation.

One possible control circuit for the 'ACT8847 consists of a microsequencer, microprogram memory, and one or more microinstruction registers, together with status logic as required to support a specific floating point implementation. A control circuit without an instruction register is typically too slow for use with the 'ACT8847. At least one microinstruction register is used to hold the current instruction being executed by the FPU and sequencer (see Figure 9).

Inclusion of the microinstruction register divides the critical path from the sequencer through the program memory to the FPU control inputs, permitting much faster execution times. However, when all the internal registers of the FPU are enabled, FPU operation may be fast enough to require a second register in the control circuit. In this case, a register on the output bus of the sequencer captures each microprogram address, and the microinstruction register captures each microinstruction (see Figure 10).


Figure 8. PLA Control Circuit Example


Figure 9. Microprogrammed Architecture


Figure 10. Microprogrammed Architecture with Address Register

Introducing registers in the FPU data paths and the control circuit complicates I/O timing, status output timing, the status logic and the microprogram for the FPU and the sequencer. These timing relationships affect branches, jumps to subroutine, and other operations depending on FPU status. Some of these programming issues are discussed below.

\section*{Microprogram Control of an 'ACT8847 FPU Subsystem}

A microprogram to control the 'ACT8847 must take into account not only the FPU operation but also the sequencer operation, especially when the system is performing a branch on status or handling an exception.

Several options are available for dealing with such exceptions. The 'ACT8847 can be programmed to discard operands in invalid formats, and some exceptions caused by illegal operations. In general, though, the microprogram should be designed to handle a range of status results or exceptions. Hardware timing considerations such as pipeline delays in both control and data paths must be studied to minimize the difficulty of performing branches to status exception handlers.

Later sections of the 'ACT8847 user guide present detailed examples of microinstructions and timing waveforms, along with interpretations of status outputs and the choices involved in handling IEEE status exceptions.

\section*{'ACT8847 Data Formats}

The 'ACT8847 accepts operands as normalized IEEE floating point numbers (ANSI/IEEE Standard 754-1985), unsigned 32-bit integers, or 2's complement integers. Floating point operands may be either single precision (32 bits) or double precision (64 bits).

IEEE formats for floating point operands, both single and double precision, consist of three fields: sign, exponent, and fraction, in that order. The leftmost (most significant) bit is the sign bit. The exponent field is 8 bits long in single-precision operands and 11 bits long in double-precision operands. The fraction field is 23 bits in single precision and 52 bits in double precision. The value of the fraction contains a hidden bit, an implicit leading "1", as shown below:

\section*{1.fraction}

The representation of a normalized floating point number is:
\[
(-1)^{s} \times 1.f \times 2^{(e-\text{bias})}
\]
where the bias is either 127 for single-precision operands or 1023 for double-precision operands.
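
As an illustration of the formula above, the following Python sketch splits a 32-bit single-precision pattern into its sign, exponent, and fraction fields and evaluates \((-1)^{s} \times 1.f \times 2^{(e-127)}\). The helper names are hypothetical, and the code describes only the number format, not the behavior of the 'ACT8847 itself. It handles normalized numbers only (exponent field 01 to FE hex).

```python
import struct


def float_to_bits(x: float) -> int:
    """Return the 32-bit IEEE single-precision pattern for x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def decode_single(bits: int) -> tuple[int, int, int]:
    """Split a 32-bit pattern into (sign, exponent, fraction) fields."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent
    fraction = bits & 0x7FFFFF          # 23-bit fraction field
    return sign, exponent, fraction


def normalized_value(bits: int) -> float:
    """Evaluate (-1)^s * 1.f * 2^(e - 127) for a normalized single."""
    sign, exponent, fraction = decode_single(bits)
    if not 0x01 <= exponent <= 0xFE:
        raise ValueError("not a normalized single-precision number")
    significand = 1 + fraction / 2**23  # hidden bit plus fraction
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


if __name__ == "__main__":
    bits = float_to_bits(-6.25)
    print(hex(bits), decode_single(bits), normalized_value(bits))
```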

The formats for single-precision and double-precision numbers are shown in Figure 11 and Figure 12, respectively. Further details of IEEE formats and exceptions are provided in the IEEE Standard for Binary Floating Point Arithmetic, ANSI/IEEE Std 754-1985.


Figure 11. IEEE Single-Precision Format


Figure 12. IEEE Double-Precision Format

The 'ACT8847 also handles two other operand formats which permit operations with very small floating point numbers. The ALU accepts denormalized floating point numbers, that is, floating point numbers so small that they could not be normalized. If these denormal operands are input to the multiplier, they will cause status exceptions. Denormals can be passed through the ALU to be "wrapped," and the wrapped operands can then be input to the multiplier.

A denormalized input has the form of a floating point number with a zero exponent, a nonzero mantissa, and a zero in the leftmost bit of the mantissa (hidden or implicit bit). Using single precision, a denorm is equal to:
\[
(-1)^{s} \times 2^{-126} \times 0.f
\]

For double precision, a denorm is equal to:
\[
(-1)^{s} \times 2^{-1022} \times 0.f
\]

A denormalized number results from decrementing the biased exponent field to zero before normalization is complete. Since a denormalized number cannot be input to the multiplier, it must first be converted to a wrapped number by the ALU. A wrapped number is a number created by normalizing a denormalized number's fraction field and subtracting from the exponent the number of shift positions (minus one) required to do so. The exponent is encoded as a two's complement negative number. When the mantissa of the denormal is normalized by shifting it left, the exponent field decrements from all zeros (wraps past zero) to a negative two's complement number (except in the case of \(0.1\mathrm{XXX}\ldots\), where the exponent is not decremented).
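
The wrapping rule described above can be sketched in Python for single-precision operands. This is an arithmetic illustration of the description only, not a model of the ALU's internal implementation; the function names are hypothetical. The sketch shifts the 23-bit fraction left until a 1 reaches the hidden-bit position and encodes the exponent as an 8-bit two's complement value equal to minus (shift count minus one).

```python
def wrap_single_denormal(fraction: int) -> tuple[int, int]:
    """Wrap a single-precision denormal given its 23-bit fraction field.

    Returns (exponent_field, fraction_field): the exponent as an 8-bit
    two's complement value, and the 23-bit fraction left after the leading
    1 has been shifted into the hidden-bit position.
    """
    if not 0 < fraction < (1 << 23):
        raise ValueError("a denormal has a nonzero 23-bit fraction")
    shifts = 0
    mantissa = fraction
    while not mantissa & (1 << 23):   # until the hidden-bit position holds a 1
        mantissa <<= 1
        shifts += 1
    exponent = -(shifts - 1)          # 0 for 0.1XXX..., negative otherwise
    return exponent & 0xFF, mantissa & 0x7FFFFF


def wrapped_value(exponent_field: int, fraction_field: int) -> float:
    """Magnitude of a wrapped single: 1.f * 2^(e - 127), e in two's complement."""
    e = exponent_field - 256 if exponent_field & 0x80 else exponent_field
    return (1 + fraction_field / 2**23) * 2.0 ** (e - 127)


if __name__ == "__main__":
    exp_field, frac_field = wrap_single_denormal(0x000001)
    print(hex(exp_field), hex(frac_field), wrapped_value(exp_field, frac_field))
```

For the smallest denormal (fraction 000...001) the sketch yields an exponent field of EA hex and an all-zero fraction, which matches the "Wrapped Number (min)" single-precision entry in Table 4.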

Floating point formats handled by the 'ACT8847 are presented in Table 4.

Table 4. IEEE Floating Point Representations
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multirow{2}{*}{TYPE OF OPERAND} & \multicolumn{2}{|c|}{EXPONENT (e)} & \multirow{2}{*}{FRACTION (f) (BINARY)} & \multirow{2}{*}{HIDDEN BIT} & \multicolumn{2}{|c|}{VALUE OF NUMBER REPRESENTED} \\
\cline{2-3} \cline{6-7} & SP (HEX) & DP (HEX) & & & SP (DECIMAL) \({ }^{\dagger}\) & DP (DECIMAL) \({ }^{\dagger}\) \\
\hline Normalized Number (max) & FE & 7FE & All 1's & 1 & \((-1)^{s}\left(2^{127}\right)\left(2-2^{-23}\right)\) & \((-1)^{s}\left(2^{1023}\right)\left(2-2^{-52}\right)\) \\
\hline Normalized Number (min) & 01 & 001 & All 0's & 1 & \((-1)^{s}\left(2^{-126}\right)(1)\) & \((-1)^{s}\left(2^{-1022}\right)(1)\) \\
\hline Denormalized Number (max) & 00 & 000 & All 1's & 0 & \((-1)^{s}\left(2^{-126}\right)\left(1-2^{-23}\right)\) & \((-1)^{s}\left(2^{-1022}\right)\left(1-2^{-52}\right)\) \\
\hline Denormalized Number (min) & 00 & 000 & 000... 001 & 0 & \((-1)^{s}\left(2^{-126}\right)\left(2^{-23}\right)\) & \((-1)^{s}\left(2^{-1022}\right)\left(2^{-52}\right)\) \\
\hline Wrapped Number (max) & 00 & 000 & All 1's & 1 & \((-1)^{s}\left(2^{-127}\right)\left(2-2^{-23}\right)\) & \((-1)^{s}\left(2^{-1023}\right)\left(2-2^{-52}\right)\) \\
\hline Wrapped Number (min) & EA & 7CD & All 0's & 1 & \((-1)^{s}\left(2^{-(22+127)}\right)(1)\) & \((-1)^{s}\left(2^{-(51+1023)}\right)(1)\) \\
\hline Zero & 00 & 000 & Zero & 0 & \((-1)^{s}(0.0)\) & \((-1)^{s}(0.0)\) \\
\hline Infinity & FF & 7FF & Zero & 1 & \((-1)^{s}\) (infinity) & \((-1)^{s}\) (infinity) \\
\hline NaN (Not a Number) & FF & 7FF & Nonzero & N/A & None & None \\
\hline
\end{tabular}
\({ }^{\dagger}\) s \(=\) sign bit.
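As a reading aid for Table 4, the following sketch classifies a 32-bit single-precision word by its exponent and fraction fields. It assumes the standard IEEE 754 single-precision layout (sign in bit 31, exponent in bits 30-23, fraction in bits 22-0); the function name `classify_single` is hypothetical. Wrapped numbers are not covered, because they cannot be told apart from ordinary encodings by these fields alone.

```python
def classify_single(word):
    """Classify a 32-bit IEEE single-precision bit pattern (illustrative helper)."""
    exponent = (word >> 23) & 0xFF      # 8-bit biased exponent field
    fraction = word & 0x7FFFFF          # 23-bit fraction field
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "NaN"
    if exponent == 0x00:
        return "zero" if fraction == 0 else "denormalized"
    return "normalized"                 # exponent field 01 through FE
```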

\section*{Status Outputs}

Status flags are provided to signal both floating point and integer results. Integer status is provided using AEQB for zero, NEG for sign, and OVER for overflow/carryout.

Status exceptions can result from one or more error conditions such as overflow, underflow, operands in illegal formats, invalid operations, or rounding. Exceptions may be grouped into two classes: input exceptions resulting from invalid operations or denormal inputs to the multiplier, and output exceptions resulting from illegal formats, rounding errors, or both.

\section*{SN74ACT8847 Architecture}

\section*{Overview}

The SN74ACT8847 is a high-speed floating point unit implemented in TI's advanced 1-\(\mu\)m CMOS technology. The device is fully compatible with IEEE Standard 754-1985 for addition, subtraction, multiplication, division, square root, and comparison.

The 'ACT8847 FPU also performs integer arithmetic, logical operations, and logical shifts. Absolute value conversions, floating point to integer conversions, and integer to floating point conversions are also available. The ALU and multiplier are both included in the same device and can be operated in parallel to perform sums of products and products of sums (see Figure 13).


Figure 13. 'ACT8847 Detailed Block Diagram

IEEE formatted denormal numbers are directly handled by the ALU. Denormal numbers must be wrapped by the ALU before being used in multiplication, division, or square root operations. A fast mode in which all denormals are forced to zero is provided for applications not requiring gradual underflow.

The 'ACT8847 input buses can be configured to operate as two 32-bit data buses or as a single 64-bit bus, providing a number of system interface options. Registers are provided at the inputs, outputs, and inside the ALU and multiplier to support multilevel pipelining. These registers can be bypassed for nonpipelined operation.

A clock mode control allows the temporary input register to be clocked on the rising edge or the falling edge of the clock to support double-precision ALU operations at the same rate as single-precision operations. A feedback register (C register) with a separate clock is provided for temporary internal storage of a multiplier result, ALU result or constant.

Four multiplexers select the multiplier and ALU operands from the input registers, C register or previous multiplier or ALU result. Results are output on the 32-bit Y bus; a Y output multiplexer selects the most significant or least significant half of the result if a double-precision number is being output.

To ensure data integrity, parity checking is performed on input data, and parity is generated for output data. A master/slave comparator supports fault-tolerant system design. Two test pin control inputs allow all I/Os and outputs to be forced high, low, or placed in a high-impedance state to facilitate system testing.

\section*{Pipeline Controls}

Six data registers in the 'ACT8847 are arranged in three levels along the data paths through the multiplier and the ALU. Each level of registers can be enabled or disabled independently of the other two levels by setting the appropriate PIPES2-PIPES0 inputs. When enabled, data is latched into the register on the rising edge of the system clock (CLK). A separate instruction pipeline register stores the instruction bits corresponding to the operation being executed at each stage.

The levels of pipelining are shown in Figure 14. The first set of registers, the RA and RB input registers, are controlled by PIPES0. These registers may be used as inputs to the ALU, multiplier, or both.

The pipeline registers are the second register set. When enabled by PIPES1, these registers latch intermediate values in the multiplier or ALU.

The results of the ALU and multiplier operations may optionally be latched into two output registers by setting PIPES2 low. The P (product) register holds the result of the multiplier operation; the S (sum) register holds the ALU result.

Table 5 shows the settings of the registers controlled by PIPES2-PIPES0. Operating modes range from fully pipelined (PIPES2-PIPES0 \(=000\)) to flowthrough (PIPES2-PIPES0 \(=111\)). The instruction pipeline registers are also set accordingly.


Figure 14. Pipeline Controls

Table 5. Pipeline Controls (PIPES2-PIPES0)
\begin{tabular}{|c|c|c|l|}
\hline \multicolumn{3}{|c|}{PIPES2-PIPES0} & \multicolumn{1}{c|}{REGISTER OPERATION SELECTED} \\
\hline X & X & 0 & Enables input registers (RA, RB) \\
X & X & 1 & Makes input registers (RA, RB) transparent \\
X & 0 & X & Enables pipeline registers \\
X & 1 & X & Makes pipeline registers transparent \\
0 & X & X & Enables output registers (PREG, SREG, Status) \\
1 & X & X & Makes output registers (PREG, SREG, Status) transparent \\
\hline
\end{tabular}
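Table 5 amounts to one enable bit per register level, where a 0 enables the level and a 1 makes it transparent. The sketch below decodes a 3-bit PIPES2-PIPES0 value accordingly; the function name `decode_pipes` and the dictionary keys are illustrative names, not device terminology.

```python
def decode_pipes(pipes):
    """Decode PIPES2-PIPES0 (bit 2 = PIPES2, bit 0 = PIPES0) per Table 5.

    True means the register level is enabled; False means it is transparent.
    """
    return {
        "input_registers": (pipes & 0b001) == 0,     # RA, RB (PIPES0)
        "pipeline_registers": (pipes & 0b010) == 0,  # internal pipeline (PIPES1)
        "output_registers": (pipes & 0b100) == 0,    # PREG, SREG, status (PIPES2)
    }
```

With this model, `decode_pipes(0b000)` describes the fully pipelined mode and `decode_pipes(0b111)` the flowthrough mode.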

In flowthrough mode all three levels of registers are transparent, a circumstance which may affect some double-precision operations. Since double-precision operands require two steps to input, at least half of the data must be clocked into the temporary register before the remaining data is placed on the DA and DB buses.

When all registers (except the C register) are enabled, timing constraints can become critical for many double-precision operations. In clock mode 1, the ALU can perform a double-precision operation and output a result during every clock cycle, and both halves of the result must be read out before the end of the next cycle. Status outputs are valid only for the period during which the Y output data is valid.

Similarly, double-precision multiplication is affected by pipelining, clock mode, and sequence of operations. A double-precision multiply may require two cycles to execute and two cycles to output the result, depending on the settings of PIPES2-PIPES0.

Duration of valid outputs at the Y multiplexer depends on settings of PIPES2-PIPES0 and CLKMODE, as well as whether all operations and operands are of the same type. For example, when a double-precision multiply is followed by a single-precision operation, one clock cycle must intervene between the dissimilar operations. The instruction inputs are ignored during this clock cycle.

\section*{Temporary Input Register}

A temporary input register is provided to enable loading of two double-precision numbers on two 32-bit input buses in one clock cycle. The contents of the DA bus are loaded into the upper 32 bits of the temporary register; the contents of DB are loaded into the lower 32 bits.

A clock mode signal (CLKMODE) determines the clock edge on which the data will be stored in the temporary register. When CLKMODE is low, data is loaded on the rising edge of the clock. With CLKMODE set high, the temporary register loads on a falling edge and the RA and RB registers can then be loaded on the next rising edge. The temporary register loads during every clock cycle.

\section*{RA and RB Input Registers}

Two 64-bit registers, RA and RB, are provided to hold input data for the multiplier and ALU. Data is taken from the DA bus, DB bus, and the temporary input register. The registers are loaded on the rising edge of clock CLK if the enables ENRA and ENRB are set high. PIPES0 must be low.

Data input combinations to the 'ACT8847 vary depending on the precision of the operands and whether they are being input as A or B operands. Loading of external data operands is controlled by the settings of CLKMODE and CONFIG1-CONFIG0, which determine the clock timing for loading and the registers that are used (see Figure 15).

\section*{Configuration Controls}

Three input registers are provided to handle input of data operands, either single precision or double precision. The RA, RB, and temporary registers are each 64 bits wide. The temporary register is (ordinarily) used only during input of double-precision operands.

Double-precision operands are loaded by using the temporary register to store half of the operands prior to inputting the other half of the operands on the DA and DB buses. As shown in Table 6, four configuration modes for selecting input sources are available for loading data operands into the RA and RB registers.


Figure 15. Input Register Control

Table 6. Double-Precision Input Data Configuration Modes
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multirow{3}{*}{CONFIG1} & \multirow{3}{*}{CONFIG0} & \multicolumn{4}{c|}{LOADING SEQUENCE} \\
\cline{3-6} & & \multicolumn{2}{l|}{DATA LOADED INTO TEMP REGISTER ON FIRST CLOCK AND RA/RB REGISTERS ON SECOND CLOCK \({ }^{\dagger}\)} & \multicolumn{2}{l|}{DATA LOADED INTO RA/RB REGISTERS ON SECOND CLOCK} \\
\cline{3-6} & & DA & DB & DA & DB \\
\hline 0 & 0 & B operand (MSH) & B operand (LSH) & A operand (MSH) & A operand (LSH) \\
\hline 0 & 1 & A operand (LSH) & B operand (LSH) & A operand (MSH) & B operand (MSH) \\
\hline 1 & 0 & A operand (MSH) & B operand (MSH) & A operand (LSH) & B operand (LSH) \\
\hline 1 & 1 & A operand (MSH) & A operand (LSH) & B operand (MSH) & B operand (LSH) \\
\hline
\end{tabular}
\({ }^{\dagger}\) On the first active clock edge (see Clock Mode Settings), data in this column is loaded into the temporary register. On the next rising edge, operands in the temporary register and the DA/DB buses are loaded into the RA and RB registers.
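The loading sequences of Table 6 can be expressed as a lookup table. The sketch below is a host-side illustration of which 32-bit halves would be driven on DA and DB at each of the two clocks for a given CONFIG1-CONFIG0 setting; the names `DOUBLE_PRECISION_LOAD` and `bus_schedule` are assumptions made for this example.

```python
# (CONFIG1, CONFIG0) -> (first-clock DA, first-clock DB, second-clock DA, second-clock DB)
DOUBLE_PRECISION_LOAD = {
    (0, 0): (("B", "MSH"), ("B", "LSH"), ("A", "MSH"), ("A", "LSH")),
    (0, 1): (("A", "LSH"), ("B", "LSH"), ("A", "MSH"), ("B", "MSH")),
    (1, 0): (("A", "MSH"), ("B", "MSH"), ("A", "LSH"), ("B", "LSH")),
    (1, 1): (("A", "MSH"), ("A", "LSH"), ("B", "MSH"), ("B", "LSH")),
}

def bus_schedule(config1, config0, a, b):
    """Return ((DA, DB) on first clock, (DA, DB) on second clock) for 64-bit a, b."""
    halves = {
        ("A", "MSH"): (a >> 32) & 0xFFFFFFFF,
        ("A", "LSH"): a & 0xFFFFFFFF,
        ("B", "MSH"): (b >> 32) & 0xFFFFFFFF,
        ("B", "LSH"): b & 0xFFFFFFFF,
    }
    da1, db1, da2, db2 = DOUBLE_PRECISION_LOAD[(config1, config0)]
    return (halves[da1], halves[db1]), (halves[da2], halves[db2])
```

A table-driven form keeps the four modes side by side, which makes it easy to check a microcode word against Table 6.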

When single-precision or integer operands are loaded, the ordinary setting of CONFIG1-CONFIG0 is 01, as shown in Table 7. This setting loads each 32-bit operand in the most significant half (MSH) of its respective register. Single-precision operands are loaded into the MSHs and adjusted to double precision because the data paths internal to the device are all double precision. It is also possible to load single-precision operands with other CONFIG settings, but two clock edges are required to load both the A and B operands on the DA bus. The operands are input as the MSHs of the A and B operands (see Table 6). For example, to load single-precision operands using CONFIG1-CONFIG0 \(=10\), the A and B operands are input one active clock edge before the instruction.

Table 7. Single-Precision Input Data Configuration Mode
\begin{tabular}{|c|c|c|c|l|}
\hline \multirow{2}{*}{CONFIG1} & \multirow{2}{*}{CONFIG0} & \multicolumn{2}{c|}{DATA LOADED INTO RA/RB REGISTERS ON FIRST CLOCK} & \\
\cline{3-4} & & DA & DB & \\
\hline 0 & 1 & A operand & B operand & This mode is ordinarily used for single-precision operations. \\
\hline
\end{tabular}

\section*{Clock Mode Settings}

Timing of double-precision data inputs is determined by the clock mode setting, which allows the temporary register to be loaded on either the rising edge (CLKMODE \(=0\)) or the falling edge of the clock (CLKMODE \(=1\)). Since the temporary register is not used when single-precision operands are input, clock modes 0 and 1 are functionally equivalent for single-precision operations using CONFIG1-CONFIG0 \(=01\).

The setting of CLKMODE can be used to speed up the loading of double-precision operands. When the CLKMODE input is set high, data on the DA and DB buses are loaded on the falling edge of the clock into the MSH and LSH, respectively, of the temporary register. On the next rising edge, contents of the DA bus, DB bus, and temporary register are loaded into the RA and RB registers, and execution of the current instruction begins. The setting of CONFIG1-CONFIG0 determines the exact pattern in which operands are loaded, whether as MSH or LSH in RA or RB.

Double-precision operation in clock mode 0 is similar except that the temporary register loads only on a rising edge. For this reason, the RA and RB registers do not load until the next rising edge, when all operands are available and execution can begin.

A considerable advantage in speed can be realized by performing double-precision operations with CLKMODE set high. In this clock mode, both double-precision operands can be loaded on successive clock edges, one falling and one rising. If the instruction is an ALU operation, then the operation can be executed in the time from one rising edge of the clock to the next rising edge. Both halves of a double-precision ALU result must be read out on the \(Y\) bus within one clock cycle when the 'ACT8847 is operated in clock mode 1.

The discussion above assumes that the system is able to furnish two sets of operands in one cycle (one set on the falling edge of the clock and the other set on the next rising edge). This assumption may not be valid, since the system is required to "double pump" the input data buses.

Even for a system that is not able to double pump the input data buses, using clock mode 1 can reduce microcode size substantially resulting in increased system throughput. To illustrate, take the case of an operation where the operand(s) are furnished by one or more of the feedback registers (refer to Table 8). Since the input data buses are not being used to furnish the operands, the data on the buses at the time of the instruction is unimportant. By setting CLKMODE high, the instruction begins after the first cycle, resulting in a savings of one cycle.

Table 8a. Double-Precision CREG + PREG Using CLKMODE \(=0\), PIPES2-0 \(=010\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline CYCLE & CLKMODE & \begin{tabular}{c} 
DA \\
BUS
\end{tabular} & \begin{tabular}{c} 
DB \\
BUS
\end{tabular} & \begin{tabular}{c} 
TEMP \\
REG
\end{tabular} & \begin{tabular}{c} 
INSTR \\
BUS
\end{tabular} & \begin{tabular}{c} 
RA \\
REG
\end{tabular} & \begin{tabular}{c} 
RB \\
REG
\end{tabular} & \begin{tabular}{c} 
S \\
REG
\end{tabular} \\
\hline 1 & 0 & X & X & X & \(\mathrm{C}+\mathrm{P}\) & X & X & X \\
\hline 2 & 0 & X & X & X & \(\mathrm{C}+\mathrm{P}\) & X & X & X \\
\hline 3 & X & X & X & X & X & X & X & \(\mathrm{C}+\mathrm{P}\) \\
\hline
\end{tabular}

Table 8b. Double-Precision CREG + PREG Using CLKMODE \(=1\), PIPES2-0 \(=010\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline CYCLE & CLKMODE & \begin{tabular}{c} 
DA \\
BUS
\end{tabular} & \begin{tabular}{c} 
DB \\
BUS
\end{tabular} & \begin{tabular}{c} 
TEMP \\
REG
\end{tabular} & \begin{tabular}{c} 
INSTR \\
BUS
\end{tabular} & \begin{tabular}{c} 
RA \\
REG
\end{tabular} & \begin{tabular}{c} 
RB \\
REG
\end{tabular} & \begin{tabular}{c} 
S \\
REG
\end{tabular} \\
\hline 1 & 1 & X & X & X & \(\mathrm{C}+\mathrm{P}\) & X & X & X \\
\hline 2 & X & X & X & X & X & X & X & \(\mathrm{C}+\mathrm{P}\) \\
\hline
\end{tabular}

Going one step further, take the case of an operation where only one operand needs to be furnished by the input data buses (refer to Table 9). To take advantage of clock mode 1, set the CONFIG lines so that the external operand comes directly from the DA and DB bus, as opposed to coming from the temporary register. Since the temporary register is not used to provide an operand, the data latched into it is inconsequential. It naturally follows then that the clock edge used to load the temporary register is unimportant. So by setting CLKMODE high, a double-precision instruction will begin after one cycle, instead of two cycles.

Table 9a. Double-Precision PREG + RB Using CLKMODE \(=0\), PIPES2-0 \(=010\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline CYCLE & CLKMODE & \begin{tabular}{c} 
DA \\
BUS
\end{tabular} & \begin{tabular}{c} 
DB \\
BUS
\end{tabular} & \begin{tabular}{c} 
TEMP \\
REG
\end{tabular} & \begin{tabular}{c} 
INSTR \\
BUS
\end{tabular} & \begin{tabular}{c} 
RA \\
REG
\end{tabular} & \begin{tabular}{c} 
RB \\
REG
\end{tabular} & \begin{tabular}{c} 
S \\
REG
\end{tabular} \\
\hline 1 & 0 & X & X & X & \(\mathrm{P}+\mathrm{RB}\) & X & X & X \\
\hline 2 & 0 & \(\mathrm{RB}(\mathrm{M})\) & \(\mathrm{RB}(\mathrm{L})\) & RB & \(\mathrm{P}+\mathrm{RB}\) & X & RB & X \\
\hline 3 & X & X & X & X & X & X & X & \(\mathrm{P}+\mathrm{RB}\) \\
\hline
\end{tabular}

Table 9b. Double-Precision PREG + RB Using CLKMODE \(=1\), PIPES2-0 \(=010\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline CYCLE & CLKMODE & \begin{tabular}{c} 
DA \\
BUS
\end{tabular} & \begin{tabular}{c} 
DB \\
BUS
\end{tabular} & \begin{tabular}{c} 
TEMP \\
REG
\end{tabular} & \begin{tabular}{c} 
INSTR \\
BUS
\end{tabular} & \begin{tabular}{c} 
RA \\
REG
\end{tabular} & \begin{tabular}{c} 
RB \\
REG
\end{tabular} & \begin{tabular}{c} 
S \\
REG
\end{tabular} \\
\hline 1 & 1 & \(\mathrm{RB}(\mathrm{M})\) & \(\mathrm{RB}(\mathrm{L})\) & RB & \(\mathrm{P}+\mathrm{RB}\) & X & RB & X \\
\hline 2 & X & X & X & X & X & X & X & \(\mathrm{P}+\mathrm{RB}\) \\
\hline
\end{tabular}

\section*{Operand Selection}

Four multiplexers select the multiplier and ALU operands from the RA and RB registers, the previous multiplier or ALU result, or the C register (see Figure 16). The multiplexers are controlled by input signals SELOP7-SELOP0 as shown in Tables 10 and 11. For division and square root operations, operands must be sourced from the input registers RA and RB.

Table 10. Multiplier Input Selection
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{A1 (MUX1) INPUT} & \multicolumn{3}{c|}{B1 (MUX2) INPUT} \\
\hline SELOP7 & SELOP6 & OPERAND SOURCE \({ }^{\dagger}\) & SELOP5 & SELOP4 & OPERAND SOURCE \({ }^{\dagger}\) \\
\hline 0 & 0 & Reserved & 0 & 0 & Reserved \\
\hline 0 & 1 & C register & 0 & 1 & C register \\
\hline 1 & 0 & ALU feedback & 1 & 0 & Multiplier feedback \\
\hline 1 & 1 & RA input register & 1 & 1 & RB input register \\
\hline
\end{tabular}
\({ }^{\dagger}\) For division or square root operations, only RA and RB registers can be selected as sources.


Figure 16. Operand Selection Multiplexer

Table 11. ALU Input Selection
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{A2 (MUX3) INPUT} & \multicolumn{3}{c|}{B2 (MUX4) INPUT} \\
\hline SELOP3 & SELOP2 & OPERAND SOURCE \({ }^{\dagger}\) & SELOP1 & SELOP0 & OPERAND SOURCE \({ }^{\dagger}\) \\
\hline 0 & 0 & Reserved & 0 & 0 & Reserved \\
\hline 0 & 1 & C register & 0 & 1 & C register \\
\hline 1 & 0 & Multiplier feedback & 1 & 0 & ALU feedback \\
\hline 1 & 1 & RA input register & 1 & 1 & RB input register \\
\hline
\end{tabular}
\({ }^{\dagger}\) For division or square root operations, only RA and RB registers can be selected as sources.

As shown in Tables 10 and 11, data operands can be selected from five possible sources, including external inputs from the RA and RB registers, feedback from the P (Product) and S (Sum) registers, and a stored value in the C register. Contents of the C register may be selected as either the A or the B operand in the ALU, the multiplier, or both. When an external input is selected, the RA input always becomes the A operand, and the RB input is the B operand.

Feedback from the ALU can be selected as the A operand to the multiplier or as the B operand to the ALU. Similarly, multiplier feedback may be used as the A operand to the ALU or the B operand to the multiplier. During division or square root operations, operands may not be selected except from the RA and RB input registers (SELOP7-SELOP0 \(=11111111\)).
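The selections in Tables 10 and 11 can be summarized as four 2-bit fields of an 8-bit SELOP value. The following sketch decodes such a value into the four operand sources; `SOURCES` and `decode_selop` are names invented for this illustration, with bit 7 taken as SELOP7 and bit 0 as SELOP0.

```python
# Each entry: field name -> (shift of the 2-bit field, decoding from Tables 10 and 11)
SOURCES = {
    "multiplier A": (6, {0b01: "C register", 0b10: "ALU feedback", 0b11: "RA input register"}),
    "multiplier B": (4, {0b01: "C register", 0b10: "Multiplier feedback", 0b11: "RB input register"}),
    "ALU A": (2, {0b01: "C register", 0b10: "Multiplier feedback", 0b11: "RA input register"}),
    "ALU B": (0, {0b01: "C register", 0b10: "ALU feedback", 0b11: "RB input register"}),
}

def decode_selop(selop):
    """Map an 8-bit SELOP7-SELOP0 value to the selected operand sources."""
    return {
        name: table.get((selop >> shift) & 0b11, "Reserved")  # 00 is reserved
        for name, (shift, table) in SOURCES.items()
    }
```

For example, `decode_selop(0xFF)` selects RA and RB for every operand, which is the setting required for division and square root.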

Selection of operands also interacts with the selected operation in the ALU or the multiplier. ALU operations with one operand are performed only on the A operand (with the exception of the Pass B operation). Also, depending on the instruction selected, the B operand may optionally be forced to zero in the ALU or to one in the multiplier.

If an operation uses one or more feedback registers as operands, the unused bus(es) can be used to preload operand(s) for a later operation. The data is loaded into the RA or RB input register(s); when the data is needed as an operand, the SELOPS pins are set to select the RA or RB register(s), but the register input enables (ENRA, ENRB) are not enabled. The one restriction on preloading data is that the operation being performed during the preload MUST use the same data type (single-precision, double-precision, or integer) as the data being loaded. Operands cannot be preloaded within square root or divide instructions.

\section*{C Register}

The 64-bit constant (C) register is available for storing the result of an ALU or multiplier operation before feedback to the multiplier or ALU. The C register has a separate clock input (CLKC), input source select (SRCC), and write enable (\(\overline{\mathrm{ENRC}}\), active low).

The C register loads from the P or the S register output, depending on the setting of SRCC. SRCC \(=1\) selects the multiplier as the input source; SRCC \(=0\) selects the ALU. The SRCC input is not registered with the instruction inputs. Depending on the operation selected and the settings of PIPES2-PIPES0, an offset of one or more cycles may be necessary to load the desired result into the C register. The register loads only on a rising edge of CLKC when \(\overline{\mathrm{ENRC}}\) is low (see Figure 17).

\({ }^{\dagger}\) td is the clock cycle period.
Figure 17. C Register Timing

A separate control (FLOWC) is available to bypass the C register when feeding an operand back on the \(C\) register feedback bus. When FLOWC is high, the output of the \(P\) or \(S\) register (as selected by SRCC) bypasses the \(C\) register without affecting the \(C\) register's contents. Direct \(P\) or \(S\) feedback is unaffected by the FLOWC setting.
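The C register behavior described in this section can be modeled with a small class. This is a behavioral sketch under stated assumptions, not a timing-accurate model: the class name `CRegister`, its method names, and the use of plain integers for register contents are all introduced here for illustration.

```python
class CRegister:
    """Behavioral sketch of the 64-bit C register (illustrative only)."""

    def __init__(self):
        self.value = 0

    @staticmethod
    def _source(p_reg, s_reg, srcc):
        # SRCC = 1 selects the multiplier (P) result, SRCC = 0 the ALU (S) result.
        return p_reg if srcc else s_reg

    def clock(self, p_reg, s_reg, srcc, enrc_n):
        """Model a rising edge of CLKC; enrc_n is the active-low write enable."""
        if enrc_n == 0:
            self.value = self._source(p_reg, s_reg, srcc)

    def feedback(self, p_reg, s_reg, srcc, flowc):
        """Value seen on the C register feedback bus."""
        if flowc:
            # FLOWC high bypasses the register without changing its contents.
            return self._source(p_reg, s_reg, srcc)
        return self.value
```

Keeping the bypass in a separate method reflects the statement that FLOWC does not affect the stored contents.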

\section*{Pipelined ALU}

The pipelined ALU contains a circuit for floating point addition and/or subtraction of aligned operands, a pipeline register, an exponent adjuster and a normalizer/rounder as shown in Figure 18. An exception circuit is provided to detect denormal inputs; these can be flushed to zero if the FAST input is set high. If the FAST input is low, the ALU accepts a denormal as input. A denorm exception flag (DENORM) goes high when the ALU output is a denormal.

Integer processing in the ALU includes both arithmetic and logical operations on either two's complement numbers or unsigned integers. The ALU performs addition, subtraction, comparison, logical shifts, logical AND, logical OR, and logical XOR.

The ALU may be operated independently or in parallel with the multiplier. Possible ALU functions during independent operation are given in Table 12.


Figure 18. Functional Diagram for ALU

Table 12. Independent ALU Operations
\begin{tabular}{|l|l|}
\hline \multicolumn{1}{|c|}{ SINGLE OPERAND } & TWO OPERANDS \\
\hline Pass & Add \\
Move & Subtract \\
Format Conversions & Compare \\
Wrap Denormalized Number & AND \\
Unwrap & OR \\
Shift & XOR \\
\hline
\end{tabular}

\section*{Pipelined Multiplier}

The pipelined multiplier (see Figure 19) performs a basic multiply function, division and square root. The operands can be single-precision or double-precision floating point numbers and can be converted to absolute values before multiplication takes place. Integer operands may also be used. Independent multiplier operations are summarized in Table 13.

If the operands to the multiplier are double precision or mixed precision (i.e., one single precision and one double precision), then one extra clock cycle is required to get the product through the multiplier pipeline. This means that for PIPES1 \(=1\), one clock cycle is required for the multiplier pipeline; for PIPES1 \(=0\), two clock cycles are required for the multiplier pipeline.


Figure 19. Functional Diagram for Multiplier

Table 13. Independent Multiplier Operations
\begin{tabular}{|c|c|}
\hline SINGLE OPERAND & TWO OPERANDS \\
\hline Square Root & Multiply \\
 & Divide \\
\hline
\end{tabular}

An exception circuit is provided to detect denormalized inputs; these are indicated by a high on the DENIN signal. Denormalized inputs must be wrapped by the ALU before multiplication, division, or square root. If results are wrapped (signaled by a high on the DENORM status pin), they must be unwrapped by the ALU.

The multiplier and ALU can be operated simultaneously by setting the I10 instruction input high. Division and square root are performed as independent multiplier operations, even though both multiplier and ALU are active during divide and SQRT operations.

\section*{Data Output Controls}

Selection and duration of results from the Y output multiplexer may be affected by several factors, including the operation selected, precision of the operands, registers enabled, and the next operation to be performed. The data output controls are not registered with the data and instruction inputs. When the device is microprogrammed, the effects of pipelining and sequencing of operations should be taken into account.

Two particular conditions need to be considered. Depending on which registers are enabled, an offset of one or more cycles must be allowed before a valid result is available at the Y output multiplexer. Also, certain sequences of operations may require both halves of a double-precision result to be read out within a single clock cycle. This is done by toggling the SELMS/\(\overline{\mathrm{LS}}\) signal in the middle of the clock period.

When a single-precision result is output, the SELMS/\(\overline{\mathrm{LS}}\) signal has no effect. The SELMS/\(\overline{\mathrm{LS}}\) signal is set low only to read out the LSH of a double-precision result (see Figure 20). To read out a result on the Y bus, the output enable \(\overline{\mathrm{OEY}}\) must be low. \(\overline{\mathrm{OEY}}\) is an asynchronous signal.
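The Y output selection can be sketched as a simple function of the control signals. In this illustration the function name `y_output` is hypothetical, a single-precision result is assumed to be passed in as a 32-bit value, and a disabled bus is represented by `None`.

```python
def y_output(result, double_precision, selms_ls, oey_n):
    """Return the 32-bit value on the Y bus, or None when the bus is disabled."""
    if oey_n:                        # OEY is active low; high disables the Y bus
        return None
    if not double_precision:         # SELMS/LS has no effect on single precision
        return result & 0xFFFFFFFF
    if selms_ls:                     # high selects the most significant half
        return (result >> 32) & 0xFFFFFFFF
    return result & 0xFFFFFFFF       # low selects the least significant half
```

Reading both halves of a double-precision result within one cycle corresponds to calling the function twice with `selms_ls` toggled.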


Figure 20. Y Output Control

\section*{Parity Checker/Generator}

When BYTEP is high, internal even parity is generated for each byte of input data at the DA and DB ports and compared to the PA and PB parity inputs, respectively. A byte fails the check when the byte and its parity input together contain an odd number of high bits. A parity check can also be performed on the entire input data word by setting BYTEP low. In this mode, PA0 is the parity input for DA data and PB0 is the parity input for DB data.

Even parity is generated for the Y multiplexer output, either for each byte or for each word of output, depending on the setting of BYTEP. When BYTEP is high, the parity generator computes four parity bits, one for each byte of the Y multiplexer output. Parity bits are output on the PY3-PY0 pins; PY0 represents parity for the least significant byte. A single parity bit can also be generated for the entire output data word by setting BYTEP low. In this mode, PY0 is the parity output.
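The byte and word parity modes can be illustrated as follows. The sketch assumes the usual even-parity convention, in which the parity bit is chosen so that the data bits plus the parity bit contain an even number of 1s; the helper names `even_parity` and `y_parity` are introduced only for this example.

```python
def even_parity(value, width):
    """Parity bit that makes the count of 1s in data plus parity even (assumed convention)."""
    return bin(value & ((1 << width) - 1)).count("1") & 1

def y_parity(y_word, bytep):
    """Parity for a 32-bit Y word.

    BYTEP high: list [PY0, PY1, PY2, PY3], PY0 covering the least significant byte.
    BYTEP low:  single-element list [PY0] covering the whole word.
    """
    if bytep:
        return [even_parity((y_word >> (8 * n)) & 0xFF, 8) for n in range(4)]
    return [even_parity(y_word, 32)]
```

The same `even_parity` helper could be applied to DA or DB data to model the input check against the PA or PB inputs.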

\section*{Master/Slave Comparator}

A master/slave comparator is provided to compare data bytes from the Y output multiplexer and the status outputs with data bytes on the external Y and status ports when \(\overline{O E Y}, \overline{O E S}\) and \(\overline{O E C}\) are high. If the data bytes are not equal, a high signal is generated on the master/slave error output pin (MSERR).

Figure 21 shows an example master/slave circuit. Two 'ACT8847 slave devices verify the data/status integrity of the 'ACT8847 master.


Figure 21. Example of Master/Slave Operation

\section*{Status and Exception Generation}

A status and exception generator produces several output signals to indicate invalid operations as well as overflow, underflow, non-numerical and inexact results, in conformance with IEEE Standard 754-1985. If output registers are enabled (PIPES2 \(=0\) ), status and exception results are latched in the status register on the rising edge of the clock. Status results are valid at the same time as associated data results are valid.

Duration and availability of status results are affected by the same timing constraints that apply to data results on the \(Y\) bus. Status outputs are enabled by two signals, \(\overline{\mathrm{OEC}}\) for comparison status and \(\overline{\mathrm{OES}}\) for other status and exception outputs. Status outputs are summarized in Tables 14 and 15.

Table 14. Comparison Status Outputs
\begin{tabular}{|c|l|}
\hline SIGNAL & \multicolumn{1}{c|}{RESULT OF COMPARISON (ACTIVE HIGH)} \\
\hline AEQB & The A and B operands are equal. A high signal on the AEQB output indicates a zero result from the selected source except during a compare operation in the ALU. During integer operations, indicates zero status output. \\
\hline AGTB & The A operand is greater than the B operand. \\
\hline UNORD & The two inputs of a comparison operation are unordered, i.e., one or both of the inputs is a NaN. \\
\hline
\end{tabular}

During a compare operation in the ALU, the AEQB output goes high when the A and B operands are equal. When any operation other than a compare is performed, either by the ALU or the multiplier, the AEQB signal is used as a zero detect.

Table 15. Status Outputs
\begin{tabular}{|c|l|}
\hline SIGNAL & \multicolumn{1}{c|}{STATUS RESULT} \\
\hline CHEX & If I6 is low, indicates the multiplier is the source of an exception during a chained function. If I6 is high, indicates the ALU is the source of an exception during a chained function. \\
\hline DENIN & Input to the multiplier is a denorm. When DENIN goes high, the STEX pins indicate which port had the denormal input. \\
\hline DENORM & The multiplier output is a wrapped number or the ALU output is a denorm. In the FAST mode, this condition causes the result to go to zero. It also indicates an invalid integer operation, i.e., PASS(-A) with unsigned integer operand. \\
\hline DIVBY0 & An invalid operation involving a zero divisor has been detected by the multiplier. \\
\hline
\end{tabular}

In chained mode, results to be output are selected based on the state of the I6 (source output) pin (if I6 is low, ALU status will be selected; if I6 is high, multiplier status will be selected). If the nonselected output source generates an exception, CHEX is set high. Status of the nonselected output source can be forced using the SELST pins, as shown in Table 16.


Figure 22. Status Output Control

Table 16. Status Output Selection (Chained Mode)
\begin{tabular}{|c|l|}
\hline SELST1-SELST0 & \multicolumn{1}{c|}{STATUS SELECTED} \\
\hline 00 & Logical OR of ALU and multiplier exceptions (bit by bit) \\
01 & Selects multiplier status \\
10 & Selects ALU status \\
11 & Normal operation (selection based on result source specified by I6 input) \\
\hline
\end{tabular}
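The selection rules of Table 16 can be written as a short function. In this sketch the ALU and multiplier status are assumed to be packed into integers with one bit per exception, which is a convenience of the example rather than a statement about the device's pin order; the name `select_status` is likewise hypothetical.

```python
def select_status(selst, i6, alu_status, mult_status):
    """Choose the status word in chained mode per Table 16.

    selst: 2-bit SELST1-SELST0 value; i6: state of the I6 input.
    """
    if selst == 0b00:
        return alu_status | mult_status      # bit-by-bit OR of both sources
    if selst == 0b01:
        return mult_status
    if selst == 0b10:
        return alu_status
    # 0b11: normal operation; I6 low selects ALU status, I6 high multiplier status
    return mult_status if i6 else alu_status
```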

An exception detect mask register is available to mask out selected exceptions from the multiplier, ALU, or both. Multiply status is disabled during an independent ALU instruction, and ALU status is disabled during multiplier instructions. During chained operation, both status outputs are enabled.

When the exception mask register has been loaded with a mask, the mask is applied to the contents of the status register to disable unnecessary exceptions. Status results for enabled exceptions are then ORed together and, if true, the exception detect (ED) status output pin is set high (see Figure 23). Individual status outputs remain active and can be read independently from mask register operations.
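The mask-then-OR logic described above reduces to a single expression. The sketch below assumes that a 1 in the mask disables the corresponding exception and that status and mask share the same bit positions; both the polarity and the function name `exception_detect` are assumptions of this example and should be checked against the mask register description.

```python
def exception_detect(status, mask):
    """Return the ED output: 1 if any unmasked exception bit is set.

    Assumed polarity: a 1 bit in mask disables the matching status bit.
    """
    enabled = status & ~mask        # drop the masked-out exceptions
    return int(enabled != 0)        # OR of the remaining status bits
```

The individual status bits in `status` are left untouched, mirroring the statement that status outputs remain readable regardless of the mask.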


Figure 23. Exception Detect Mask Logic

\section*{Microprogramming the 'ACT8847}

Because the 'ACT8847 is microprogrammable, it can be configured to operate on either integer or single- or double-precision data operands, and the operations of the registers, ALU, and multiplier can be programmed to support a variety of applications. The following sections present not only control settings but the timings of the specific operations required to execute the sample instructions.

\section*{Control Inputs}

Control inputs to the 'ACT8847 are summarized in Table 17 below. Several of the inputs have already been discussed; refer to the page listed in the table for detailed information.

The remaining inputs are discussed in the following sections. All control signals and their associated tables are also listed in the 'ACT8847 Reference Guide to provide a complete, easy-to-access reference for the programmer already familiar with 'ACT8847 operation.

Table 17. Control Inputs
\begin{tabular}{|c|c|c|c|}
\hline SIGNAL & HIGH & LOW & PAGE NO. \\
\hline BYTEP & Selects byte parity generation and test & Selects single bit parity generation and test & 7-75 \\
\hline CLK & Clocks all registers (except C ) on rising edge & No effect & 7-62 \\
\hline CLKC & Clocks C register on rising edge & No effect & 7-70 \\
\hline CLKMODE & Enables temporary input register load on falling clock edge & Enables temporary input register load on rising clock edge & 7-66 \\
\hline CONFIG1-CONFIG0 & See Table 6 (RA and RB register data source selects) & See Table 6 (RA and RB register data source selects) & 7-65 \\
\hline ENRC & No effect & Enables C register load when CLKC goes high. & 7-70 \\
\hline ENRA & If register is not in flowthrough, enables clocking of RA register & If register is not in flowthrough, holds contents of RA register & 7-65 \\
\hline ENRB & If register is not in flowthrough, enables clocking of RB register & If register is not in flowthrough, holds contents of RB register & 7-65 \\
\hline FAST & Places device in FAST mode & Places device in IEEE mode & 7-84 \\
\hline FLOW_C & Causes output value to bypass C register and appear on C register output bus. & No effect & 7-72 \\
\hline \(\overline{\text { HALT }}\) & No effect & Stalls device operation but does not affect registers, internal states, or status. C register loading is not disabled & 7-85 \\
\hline \(\overline{\mathrm{OEC}}\) & Disables compare pins & Enables compare pins & 7-77 \\
\hline \(\overline{\mathrm{OES}}\) & Disables status outputs & Enables status outputs & 7-77 \\
\hline \(\overline{\mathrm{OEY}}\) & Disables Y bus & Enables Y bus & 7-74 \\
\hline PIPES2-PIPES0 & See Table 5 (Pipeline Mode Control) & See Table 5 (Pipeline Mode Control) & 7-62 \\
\hline \(\overline{\text { RESET }}\) & No effect & Clears internal states, status, internal pipeline registers, and exception disable register. Does not affect other data registers. & 7-86 \\
\hline RND1-RND0 & See Table 18 (Rounding Mode Control) & See Table 18 (Rounding Mode Control) & 7-84 \\
\hline SELOP7-SELOP0 & See Tables 10 and 11 (Multiplier/ALU operand selection) & See Tables 10 and 11 (Multiplier/ALU operand selection) & 7-68 \\
\hline SELMS/ \(\overline{L S}\) & Selects MSH of 64-bit result for output on the Y bus (no effect on single-precision operands) & Selects LSH of 64-bit result for output on the Y bus (no effect on single-precision operands) & 7-74 \\
\hline SELST1-SELST0 & See Table 16 (Status Output Selection) & See Table 16 (Status Output Selection) & 7-78 \\
\hline SRCC & Selects multiplier result for input to C register & Selects ALU result for input to C register & 7-70 \\
\hline TP1-TP0 & See Table 20 (Test Pin Control Inputs) & See Table 20 (Test Pin Control Inputs) & 7-86 \\
\hline
\end{tabular}

\section*{Rounding Modes}

The 'ACT8847 supports the four IEEE standard rounding modes: round to nearest, round towards zero (truncate), round towards infinity (round up), and round towards minus infinity (round down). The rounding function is selected by control pins RND1 and RND0, as shown in Table 18.

Table 18. Rounding Modes
\begin{tabular}{|c|l|}
\hline RND1-RND0 & \multicolumn{1}{|c|}{ ROUNDING MODE SELECTED } \\
\hline 00 & Round towards nearest \\
01 & Round towards zero (truncate) \\
10 & Round towards infinity (round up) \\
11 & Round towards minus infinity (round down) \\
\hline
\end{tabular}

Rounding mode should be selected to minimize procedural errors which may otherwise accumulate and affect the accuracy of results. Rounding to nearest introduces a procedural error not exceeding half of the least significant bit for each rounding operation. Since rounding to nearest may involve rounding either upward or downward in successive steps, rounding errors tend to cancel each other.

In contrast, directed rounding modes may introduce errors approaching one bit for each rounding operation. Since successive rounding operations in a procedure may all be similarly directed, each introducing up to a one-bit error, rounding errors may accumulate rapidly, especially in single-precision operations.
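The error bounds described above can be checked numerically. The following is a minimal sketch, not a model of the device: it uses Python's `decimal` module with three decimal places standing in for the least significant bit, and the mapping of RND1-RND0 codes to `decimal` rounding constants is an assumption (in particular, ties-to-even for round to nearest is taken from the IEEE standard rather than from this manual). The names `ROUNDING_MODES`, `round_value`, and `worst_error` are illustrative only.

```python
from decimal import (Decimal, ROUND_HALF_EVEN, ROUND_DOWN,
                     ROUND_CEILING, ROUND_FLOOR)

# RND1-RND0 codes (Table 18) mapped to decimal-module rounding constants.
# Assumption: "round to nearest" resolves ties to even, as in IEEE 754.
ROUNDING_MODES = {
    0b00: ROUND_HALF_EVEN,  # round to nearest
    0b01: ROUND_DOWN,       # round towards zero (truncate)
    0b10: ROUND_CEILING,    # round towards infinity
    0b11: ROUND_FLOOR,      # round towards minus infinity
}


def round_value(value, rnd, places=3):
    """Round a Decimal to `places` digits using the mode selected by `rnd`."""
    quantum = Decimal(1).scaleb(-places)  # one unit in the last place
    return value.quantize(quantum, rounding=ROUNDING_MODES[rnd])


def worst_error(values, rnd, places=3):
    """Largest absolute rounding error over a set of sample values."""
    return max(abs(round_value(v, rnd, places) - v) for v in values)


# Sample values from -0.0050 to 0.0050 in steps of 0.0001.
SAMPLES = [Decimal(n) / Decimal(10000) for n in range(-50, 51)]

if __name__ == "__main__":
    for code in sorted(ROUNDING_MODES):
        print(format(code, "02b"), worst_error(SAMPLES, code))
```

With one unit in the last place equal to 0.001, the worst case for round to nearest is half a unit, while each directed mode comes within one sample step of a full unit.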

\section*{FAST and IEEE Modes}

The device can be programmed to operate in FAST mode by asserting the FAST pin. In the FAST mode, all denormalized inputs and outputs are forced to zero.

Placing a zero on the FAST pin causes the chip to operate in IEEE mode. In this mode, the ALU can operate on denormalized inputs and return denormals. If a denorm is input to the multiplier, the DENIN flag will be asserted, and the result will be invalid. Denormal numbers must be wrapped before being input to the multiplier. If the multiplier result underflows, a wrapped number will be output.

\section*{Handling of Denormalized Numbers (FAST)}

The FAST input selects the mode for handling denormalized inputs and outputs. When the FAST input is set low, the ALU accepts denormalized inputs but the multiplier generates an exception when a denormal is input. When FAST is set high, the DENIN status exception is disabled and all denormalized numbers, both inputs and results, are forced to zero.

A denormalized input has the form of a floating point number with a zero exponent, a nonzero mantissa, and a zero in the leftmost bit of the mantissa (hidden or implicit bit). A denormalized number results from decrementing the biased exponent field to zero before normalization is complete. Since a denormalized number cannot be input to the multiplier, it must first be converted to a wrapped number by the ALU. When the mantissa of the denormal is normalized by shifting it left, the exponent field decrements from all zeros (wraps past zero) to a negative two's complement number (except in the case of 0.1XXX..., where the exponent is not decremented).

Exponent underflow is possible during multiplication of small operands even when the operands are not wrapped numbers. Setting FAST \(=0\) selects gradual underflow so that denormal inputs can be wrapped and wrapped results are not automatically discarded. When FAST is set high, denormal inputs and wrapped results are forced to zero immediately.
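The wrapping of a denormal described above can be illustrated arithmetically. The sketch below is an assumption-laden illustration for single-precision bit patterns only: it shows how normalizing the mantissa yields a zero or negative exponent value, but it does not claim to reproduce the device's internal representation of wrapped numbers. All function names are hypothetical.

```python
import struct


def is_denormal_sp(bits):
    """True if a 32-bit pattern is a single-precision denormal."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return exponent == 0 and fraction != 0


def wrap_sp(bits):
    """Return (sign, exponent, fraction) of the wrapped form of a denormal.

    The exponent is returned as a signed integer in biased units, so 0
    corresponds to the 0.1XXX... case and negative values to longer shifts.
    """
    if not is_denormal_sp(bits):
        raise ValueError("not a single-precision denormal")
    sign = (bits >> 31) & 1
    fraction = bits & 0x7FFFFF
    # Left shift needed to move the leading 1 into the hidden-bit position.
    shift = 24 - fraction.bit_length()
    mantissa = fraction << shift
    # A denormal has an effective biased exponent of 1; each shift lowers it.
    exponent = 1 - shift
    return sign, exponent, mantissa & 0x7FFFFF


def wrapped_value(sign, exponent, fraction):
    """Numeric value of a wrapped number (bias of 127 assumed)."""
    magnitude = (1 + fraction / 2**23) * 2.0 ** (exponent - 127)
    return -magnitude if sign else magnitude


def float_from_bits(bits):
    """Interpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


if __name__ == "__main__":
    for pattern in (0x00400000, 0x00000001, 0x80300000):
        print(hex(pattern), wrap_sp(pattern))
```

In each case the wrapped triple evaluates to the same number as the original denormal, which is the property that lets the multiplier operate on it.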

When the multiplier is in IEEE mode and produces a wrapped number as its result, the result may be passed to the ALU and unwrapped. If the wrapped number can be unwrapped to an exact denormal, it can be output without causing the underflow status flag (UNDER) to be set. UNDER goes high when a result is an inexact denormal, and a zero is output from the FPU if the wrapped result is too small to represent as a denormal (smaller than the minimum denorm). Table 19 describes the handling of wrapped multiplier results and the status flags that are set when wrapped numbers are output from the multiplier.

Table 19. Handling Wrapped Multiplier Outputs
\begin{tabular}{|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{TYPE OF RESULT} & \multicolumn{3}{|c|}{STATUS FLAGS SET} & \multirow[b]{2}{*}{NOTES} \\
\hline & DENORM & INEX & RNDCO & \\
\hline Wrapped, exact & 1 & 0 & 0 & Unwrap with 'Wrapped exact' ALU instruction \\
\hline Wrapped, inexact & 1 & 1 & 0 & Unwrap with 'Wrapped inexact' ALU instruction \\
\hline Wrapped, increased in magnitude & 1 & 1 & 1 & Unwrap with 'Wrapped rounded' ALU instruction \\
\hline
\end{tabular}
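The selection rule in Table 19 can be sketched as a lookup from the three status flags to the ALU operation field. The opcode values below are the I3-I0 codes listed for the unwrap operations in Table 21; the function name and the choice to return `None` for a result that is not wrapped are illustrative assumptions.

```python
# (DENORM, INEX, RNDCO) -> I3-I0 code of the matching unwrap instruction.
UNWRAP_OPCODES = {
    (1, 0, 0): 0b1100,  # wrapped, exact -> unwrap exact number
    (1, 1, 0): 0b1101,  # wrapped, inexact -> unwrap inexact number
    (1, 1, 1): 0b1110,  # wrapped, increased in magnitude -> unwrap rounded
}


def select_unwrap_opcode(denorm, inex, rndco):
    """Pick the unwrap opcode for a multiplier result, or None if not wrapped."""
    key = (int(bool(denorm)), int(bool(inex)), int(bool(rndco)))
    if key[0] == 0:
        return None  # DENORM not set: the result is not a wrapped number
    if key not in UNWRAP_OPCODES:
        raise ValueError("flag combination not listed in Table 19")
    return UNWRAP_OPCODES[key]


if __name__ == "__main__":
    print(format(select_unwrap_opcode(1, 1, 0), "04b"))
```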

When operating in chained mode, the multiplier may output a wrapped result to the ALU during the same clock cycle that the multiplier status is output. In such a case the ALU cannot unwrap the operand prior to using it, for example, when accumulating the results of previous multiplications. To avoid this situation, the FPU can be operated in FAST mode to simplify exception handling during chained operations. Otherwise, wrapped outputs from the multiplier may adversely affect the accuracy of the chained operation, because a wrapped number may appear to be a large normalized number instead of a very small denormalized number.

Because of the latency associated with interpreting the FPU status outputs and determining how to process the wrapped output, it is necessary that a wrapped operand be stored external to the FPU (for example, in an external register file) and reloaded to the A port of the ALU for unwrapping and further processing.

\section*{Stalling the Device}

Operation of the 'ACT8847 can be stalled nondestructively by means of the \(\overline{\text { HALT }}\) signal. Bringing the \(\overline{\text { HALT }}\) input low causes the device to inhibit the next rising clock edge. Register contents are unaltered when the device is stalled, and normal operation resumes at the next low clock period after the \(\overline{\text { HALT }}\) signal is set high.

Stalling the device does not stall the C register. If ENRC is low, CLKC will clock in data from the source selected by SRCC.

For some operations, such as a double-precision multiply with CLKMODE \(=1\), setting the \(\overline{\text { HALT }}\) input low may interrupt loading of the RA, RB, and instruction registers, as well as stalling operation. In clock mode 1, the temporary register loads on the falling edge of the clock, but the \(\overline{\text { HALT }}\) signal going low would prevent the RA, RB, and instruction registers from loading on the next rising clock edge. It is therefore necessary to have the instruction and data inputs on the pins when the \(\overline{\text { HALT }}\) signal is set high again and normal operation resumes.

\section*{RESET}

The \(\overline{\text { RESET }}\) input is an active-low signal that asynchronously clears the internal states, status, and exception disable mask. Internal pipeline registers are cleared, but the RA, RB, and C registers are not. Operation resumes when \(\overline{\text { RESET }}\) goes high again.

\section*{Test Pins}

Two pins, TP1-TP0, support system testing. These may be used, for example, to place all outputs in a high-impedance state, isolating the chip from the rest of the system (see Table 20).

Table 20. Test Pin Control Inputs
\begin{tabular}{|c|l|}
\hline TP1-TP0 & \multicolumn{1}{|c|}{ OPERATION } \\
\hline 00 & All outputs and I/Os are forced low \\
01 & All outputs and I/Os are forced high \\
10 & All outputs are placed in a high impedance state \\
11 & Normal operation \\
\hline
\end{tabular}

\section*{Independent ALU Operations}

Configuration and operation of the 'ACT8847 can be selected to perform single- or double-precision floating point and integer calculations in operating modes ranging from flowthrough to fully pipelined. Timing and sequences of operations are affected by settings of clock mode, data and status registers, input data configurations, and rounding mode, as well as the instruction inputs controlling the ALU and the multiplier.

Three modes of operation can be selected with inputs I10-I0: independent ALU operation, independent multiplier operation, or simultaneous (chained) operation of ALU and multiplier. Each of these operating modes is treated separately in the following sections.

The ALU executes single- and double-precision operations which can be divided according to the number of operands involved, one or two. Tables 21 and 22 show independent ALU operations with one operand, along with the inputs I10-I0 which select each operation. Conversions from one format to another are handled in this mode, with the exception of adjustments to precision during two-operand ALU operations. The wrapping and unwrapping of operands is also done in this mode.

Most format conversions involve double-precision timing. Conversions between single- and double-precision floating point format are treated as mixed-precision operations requiring two cycles to load the operands. A single-precision number is loaded in the upper half (MSH) of its input register. During integer to floating point conversions, the integer input should be loaded into the upper half of the RA register. If converting from integer to double precision, then two cycles are required.

Logical shifts can be performed on integer operands using the instructions shown in Table 22. The data operand to be shifted is input from any valid operand source and the number of bit positions the operand is to be shifted is input only from the DB bus. The shift number on the DB bus should be in positive 32-bit integer format, although only the lowest eight bits are used. The shift number cannot be selected from sources other than the RB register, and the shift number must be loaded on the same cycle as the instruction.

Table 21. Independent ALU Operations, Single Floating Point Operand
(I10 = 0, I9 = 0, I6 = 0, I5 = 1)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{CHAINED OPERATION I10} & \multirow[t]{2}{*}{OPERAND FORMAT I9} & \multirow[t]{2}{*}{PRECISION RA I8} & \multirow[t]{2}{*}{PRECISION RB I7} & \multirow[t]{2}{*}{OUTPUT SOURCE I6} & \multirow[t]{2}{*}{OPERAND TYPE I5} & \multirow[t]{2}{*}{ABSOLUTE VALUE A I4} & \multicolumn{2}{|c|}{ALU OPERATION} \\
\hline & & & & & & & I3-I0 & RESULT \\
\hline \multirow[t]{13}{*}{0 = Not chained} & \multirow[t]{13}{*}{0 = Floating point} & \multirow[t]{13}{*}{\begin{tabular}{l}
0 = A (SP) \\
1 = A (DP)
\end{tabular}} & \multirow[t]{13}{*}{\begin{tabular}{l}
0 = B (SP) \\
1 = B (DP) \\
must equal I8
\end{tabular}} & \multirow[t]{13}{*}{0 = ALU result} & \multirow[t]{13}{*}{1 = Single operand} & \multirow[t]{13}{*}{\begin{tabular}{l}
\(0=A\) \\
\(1=|A|\)
\end{tabular}} & 0000 & Pass A operand \\
\hline & & & & & & & 0001 & Pass \(-\)A operand \\
\hline & & & & & & & 0010 & 2's complement integer to floating point conversion \({ }^{\dagger}\) \\
\hline & & & & & & & 0011 & Floating point to 2's complement integer conversion \({ }^{\dagger}\) \\
\hline & & & & & & & 0100 & Move A operand (pass without NaN detect or exception flags active) \\
\hline & & & & & & & 0101 & Pass B operand \\
\hline & & & & & & & 0110 & Floating point to floating point conversion \({ }^{\ddagger}\) \\
\hline & & & & & & & 0111 & Floating point to unsigned integer conversion \({ }^{\dagger}\) \\
\hline & & & & & & & 1000 & Wrap (denormal) input operand \\
\hline & & & & & & & 1010 & Unsigned integer to floating point conversion \({ }^{\dagger}\) \\
\hline & & & & & & & 1100 & Unwrap exact number \\
\hline & & & & & & & 1101 & Unwrap inexact number \\
\hline & & & & & & & 1110 & Unwrap rounded input \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) The precision of the integer to floating point conversion is set by I8. If I8 = 1, the operation is timed like a double-precision operation, requiring 2 clock edges to load.
\({ }^{\ddagger}\) This converts single-precision floating point to double-precision floating point and vice versa. If the I8 pin is low to indicate a single-precision input, the result of the conversion will be double precision. If the I8 pin is high, indicating a double-precision input, the result of the conversion will be single precision. This operation is timed like a double-precision operation, requiring 2 clock edges to load.
}

Table 22. Independent ALU Operations, Single Integer Operand (I10 = 0, I9 = 1, I6 = 0, I5 = 1)
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{CHAINED OPERATION I10} & \multicolumn{3}{|l|}{OPERAND FORMAT/PRECISION} & \multirow[t]{2}{*}{OUTPUT SOURCE I6} & \multirow[t]{2}{*}{OPERAND TYPE I5} & \multicolumn{2}{|r|}{ALU OPERATION} \\
\hline & I9 & I8 & I7 & & & I4-I0 & RESULT \\
\hline \begin{tabular}{l}
\[
0=\text { Not }
\] \\
Chained
\end{tabular} & \[
\begin{gathered}
1= \\
\text { Integer }
\end{gathered}
\] & 0
1 & \begin{tabular}{l}
\[
0=S P 2 ' s
\] \\
complement
\[
1=S P
\] \\
unsigned integer
\end{tabular} & \[
\begin{gathered}
0=\mathrm{ALU} \\
\text { result }
\end{gathered}
\] & \[
\begin{gathered}
\hline 1=\text { Single } \\
\text { Operands }
\end{gathered}
\] & \[
\begin{aligned}
& \hline 00000 \\
& 00001 \\
& 00010 \\
& 00101 \\
& 01000 \\
& 01001 \\
& 01101
\end{aligned}
\] & \begin{tabular}{l}
Pass A operand \\
Pass ( - A) operand \\
Negate A operand (1's complement) \\
Pass B operand \\
Shift A operand left logical \({ }^{\dagger}\) \\
Shift A operand right logical \({ }^{\dagger}\) \\
Shift A operand right arithmetic \({ }^{\dagger}\)
\end{tabular} \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger} B\) operand is number of bit positions \(A\) is to be shifted (See instruction description for "Independent ALU Operations".) The B operand must be input on the same cycle that shift is to be performed.
}

Tables 23 and 24 present independent ALU operations with two operands. When the operands are different in precision, one single and the other double, the settings of the precision selects I8-I7 will identify the single-precision operand so that it can automatically be reformatted to double precision before the selected operation is executed, and the result of the operation will be double precision.

Precision of each data operand is indicated by the setting of instruction input I8 for single-operand ALU instructions, or the settings of I8-I7 for two-operand instructions. For single-operand instructions, I7 must be set equal to I8. When the ALU receives mixed-precision operands (one operand in single precision and the other in double precision), the single-precision data input is converted to double and the operation is executed in double precision. It is unnecessary to use the 'convert float-to-float' instruction to convert the single-precision operand prior to performing the desired operation on the mixed-precision operands. Setting I8 and I7 properly achieves the same effect without wasting an instruction cycle.

Timing for operations with mixed-precision operands is the same as for a corresponding double-precision operation. In a mixed-precision operation, the single-precision operand must be loaded into the upper half of its input register. If both operands are single precision, a single-precision result is output by the ALU. Operations on mixed-precision data inputs produce double-precision results.

Table 23. Independent ALU Operations, Two Floating-Point Operands (I10 = 0, I9 = 0, I5 = 0)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{CHAINED OPERATION I10} & \multirow[t]{2}{*}{OPERAND FORMAT I9} & \multirow[t]{2}{*}{PRECISION RA I8} & \multirow[t]{2}{*}{PRECISION RB I7} & \multirow[t]{2}{*}{OUTPUT SOURCE I6} & \multirow[t]{2}{*}{OPERAND TYPE I5} & \multirow[t]{2}{*}{ABSOLUTE VALUE A I4} & \multirow[t]{2}{*}{ABSOLUTE VALUE B I3} & \multirow[t]{2}{*}{ABSOLUTE VALUE Y I2} & \multicolumn{2}{|l|}{ALU OPERATION} \\
\hline & & & & & & & & & I1-I0 & RESULT \\
\hline \multirow[t]{4}{*}{0 = Not chained} & \multirow[t]{4}{*}{0 = Floating point} & \multirow[t]{4}{*}{\begin{tabular}{l}
0 = A (SP) \\
1 = A (DP)
\end{tabular}} & \multirow[t]{4}{*}{\begin{tabular}{l}
0 = B (SP) \\
1 = B (DP)
\end{tabular}} & \multirow[t]{4}{*}{0 = ALU result} & \multirow[t]{4}{*}{0 = Two operands} & \multirow[t]{4}{*}{\begin{tabular}{l}
\(0=A\) \\
\(1=|A|\)
\end{tabular}} & \multirow[t]{4}{*}{\begin{tabular}{l}
\(0=B\) \\
\(1=|B|\)
\end{tabular}} & \multirow[t]{4}{*}{\begin{tabular}{l}
\(0=Y\) \\
\(1=|Y|\)
\end{tabular}} & 00 & \(A+B\) \\
\hline & & & & & & & & & 01 & \(A-B\) \\
\hline & & & & & & & & & 10 & Compare \(A, B\) \\
\hline & & & & & & & & & 11 & \(B-A\) \\
\hline
\end{tabular}
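The field layout of Table 23 can be sketched as a small encoder that assembles the 11-bit instruction word I10-I0 for a two-operand floating point ALU operation. This is an illustrative sketch only: the function and parameter names are hypothetical, and the word is assumed to be packed with I10 as the most significant bit.

```python
# I1-I0 operation codes from Table 23.
ALU_OPS = {"A+B": 0b00, "A-B": 0b01, "compare": 0b10, "B-A": 0b11}


def encode_alu_two_fp(op, a_dp=False, b_dp=False,
                      abs_a=False, abs_b=False, abs_y=False):
    """Build I10-I0 for an independent two-operand floating point ALU op."""
    word = ALU_OPS[op]            # I1-I0: operation
    word |= int(abs_y) << 2       # I2: absolute value of result Y
    word |= int(abs_b) << 3       # I3: absolute value of B
    word |= int(abs_a) << 4       # I4: absolute value of A
    # I5 = 0 (two operands) and I6 = 0 (ALU result) contribute nothing.
    word |= int(b_dp) << 7        # I7: precision of RB (1 = double)
    word |= int(a_dp) << 8        # I8: precision of RA (1 = double)
    # I9 = 0 (floating point) and I10 = 0 (not chained).
    return word


def result_is_double(a_dp, b_dp):
    """Mixed or double-precision inputs give a double-precision result."""
    return a_dp or b_dp


if __name__ == "__main__":
    word = encode_alu_two_fp("A-B", a_dp=True, b_dp=True, abs_a=True)
    print(format(word, "011b"))
```

For a mixed-precision operation, only one of `a_dp` and `b_dp` is set; as the text notes, no separate float-to-float conversion instruction is needed in that case.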

Table 24. Independent ALU Operations, Two Integer Operands
(I10 = 0, I9 = 1, I6 = 0, I5 = 0)
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{CHAINED OPERATION I10} & \multicolumn{3}{|l|}{OPERAND FORMAT/PRECISION} & \multirow[t]{2}{*}{OUTPUT SOURCE I6} & \multirow[t]{2}{*}{OPERAND TYPE I5} & \multicolumn{2}{|r|}{ALU OPERATION} \\
\hline & I9 & I8 & I7 & & & I4-I0 & RESULT \\
\hline 0 = Not chained & 1 = Integer & \begin{tabular}{l}
0 \\
0
\end{tabular} & \begin{tabular}{l}
0 = SP 2's complement \\
1 = SP unsigned integer
\end{tabular} & 0 = ALU result & 0 = Two operands & \begin{tabular}{l}
00000 \\
00001 \\
00010 \\
00011 \\
01000 \\
01001 \\
01010 \\
01011 \\
01100 \\
01101
\end{tabular} & \begin{tabular}{l}
\(A+B\) \\
\(A-B\) \\
Compare A, B \\
\(B-A\) \\
Logical AND (A, B) \\
Logical AND (A, NOT B) \\
Logical AND (NOT A, B) \\
Logical AND (NOT A, NOT B) \\
Logical OR (A, B) \\
Logical XOR (A, B)
\end{tabular} \\
\hline
\end{tabular}

Two additional independent ALU operations may also be coded. The first of these is for loading the exception detect mask register.

The exception detect mask register can be loaded with a mask to enable or disable selected status exceptions. Status bits for enabled exceptions are logically ORed, and when the result is true, the ED pin goes high. During chained operations, both multiplier and ALU results are ORed. During independent operation, the nonselected status results are forced to zero.

If the FPU is reset (\(\overline{\text { RESET }}=0\)), the exception detect mask register is cleared. Table 25 describes the settings for the mask register load instruction and the status exceptions which can be enabled or disabled with the mask.

Table 25. Loading the Exception Disable Mask Register
\begin{tabular}{|c|c|}
\hline INSTRUCTION INPUTS & RESULTS \\
\hline I10-I7 = 0111 & Exception mask load instruction \\
\hline I6 & \begin{tabular}{l}
0 = Load ALU exception disable register \\
1 = Load multiplier exception disable register
\end{tabular} \\
\hline I5\({ }^{\dagger}\) & \begin{tabular}{l}
0 = IVAL exception enabled \\
1 = IVAL exception disabled
\end{tabular} \\
\hline I4 & \begin{tabular}{l}
0 = OVER exception enabled \\
1 = OVER exception disabled
\end{tabular} \\
\hline I3 & \begin{tabular}{l}
0 = UNDER exception enabled \\
1 = UNDER exception disabled
\end{tabular} \\
\hline I2 & \begin{tabular}{l}
0 = INEX exception enabled \\
1 = INEX exception disabled
\end{tabular} \\
\hline I1 & \begin{tabular}{l}
0 = DIVBY0 exception enabled \\
1 = DIVBY0 exception disabled \({ }^{\ddagger}\)
\end{tabular} \\
\hline I0 & \begin{tabular}{l}
0 = DENORM exception enabled \\
1 = DENORM exception disabled
\end{tabular} \\
\hline
\end{tabular}
\({ }^{\dagger}\) Disabling IVAL in multiplier exception mask register also disables DENIN exception.
\({ }^{\ddagger}\) Only significant when I6 = 1.
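The mask load instruction and the resulting exception detect (ED) behavior can be sketched as follows. This is a minimal illustration under stated assumptions: the instruction word is packed with I10 as the most significant bit, status flags are represented by name rather than by their positions in the status register (which are not given in this section), and the function names are hypothetical.

```python
# Bit positions of the disable bits within I5-I0 (Table 25).
MASK_BITS = {"IVAL": 5, "OVER": 4, "UNDER": 3,
             "INEX": 2, "DIVBY0": 1, "DENORM": 0}


def encode_mask_load(disabled, multiplier=False):
    """Build I10-I0 for the exception mask load instruction.

    `disabled` is an iterable of exception names to mask out. DIVBY0 is
    only significant when the multiplier register is selected (I6 = 1).
    """
    word = 0b0111 << 7              # I10-I7 = 0111: mask load instruction
    word |= int(multiplier) << 6    # I6: 0 = ALU register, 1 = multiplier
    for name in disabled:
        word |= 1 << MASK_BITS[name]
    return word


def exception_detect(status, disable_mask):
    """ED is high if any raised exception is not disabled by the mask.

    `status` is a set of raised exception names; `disable_mask` is the
    6-bit I5-I0 field loaded into the mask register.
    """
    return any(((disable_mask >> MASK_BITS[name]) & 1) == 0
               for name in status)


if __name__ == "__main__":
    word = encode_mask_load({"INEX"})
    print(format(word, "011b"), exception_detect({"INEX"}, word & 0x3F))
```

Masking INEX this way keeps ED low for results that are merely inexact while still flagging overflow, underflow, and invalid operations.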

The second additional independent ALU operation is the NOP (no operation). The table below shows the coding for the NOP instruction.

Table 26. NOP Instruction
\begin{tabular}{|c|c|}
\hline I10-I0 & Operation \\
\hline 01100000000 & NOP \\
\hline
\end{tabular}

Because NOP, in effect, just prevents loading of the P or S registers, these registers must be enabled (PIPES2 = 0) for the NOP to work correctly.

Timing of a NOP instruction is the same as any single-precision ALU operation, taking one clock cycle per pipeline stage that is enabled. For example, when the 'ACT8847 is fully pipelined (PIPES2-PIPES0 = 000), a NOP's effect (preventing the overwriting of the P and S registers) will be seen on the third cycle. To hold the results of an operation on the Y bus for an extra cycle, the NOP instruction is inserted directly after the instruction whose results are to be held.

The NOP freezes the output register's contents until new results are to be loaded into these registers.
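The relationship between the PIPES2-PIPES0 setting and the cycle on which a single-precision instruction takes effect can be sketched as a count of enabled register stages. The bit-to-register mapping used here (PIPES0 for the input registers, PIPES1 for the internal pipeline registers, PIPES2 for the P/S output registers, with 0 meaning enabled) is inferred from the PIPES2 = 0 requirement above and from the sample timing figure captions; Table 5 remains the authoritative definition. The function names are hypothetical.

```python
# Register stage controlled by each PIPES bit (index = bit number).
# Assumption: inferred mapping, a 0 on a bit enables that stage.
STAGE_NAMES = ("input (RA/RB)", "pipeline", "output (P/S)")


def enabled_registers(pipes):
    """List the register stages enabled by a 3-bit PIPES2-PIPES0 value."""
    return [name for bit, name in enumerate(STAGE_NAMES)
            if ((pipes >> bit) & 1) == 0]


def single_precision_latency(pipes):
    """Clock cycles before a single-precision result (or NOP) takes effect."""
    return len(enabled_registers(pipes))


def nop_is_effective(pipes):
    """A NOP needs the P/S output registers enabled (PIPES2 = 0)."""
    return ((pipes >> 2) & 1) == 0


if __name__ == "__main__":
    for setting in (0b111, 0b110, 0b010, 0b000):
        print(format(setting, "03b"), enabled_registers(setting))
```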

\section*{Independent Multiplier Operations}

In this mode, the multiplier operates on two of five input sources which can be either single precision, double precision, or mixed. Multiplication, division and square root may be coded as independent multiplier operations.

Operand precision is selected by I8 and I7, as for ALU operations. The multiplier can multiply the A and B operands, either operand with the absolute value of the other, or the absolute values of both operands. The result can also be negated when it is output. Operations involving absolute value or negated results are valid only when floating point format is selected. If both operands are single precision, a single-precision result is output. Operations on mixed-precision data inputs produce double-precision results.

Floating point operands may be normalized or wrapped numbers, as indicated by the settings for instruction inputs I1-I0. As shown in Table 27, the multiplier can be set to operate on the absolute value of either or both floating point operands, and the result of any operation can be negated when it is output from the multiplier. Converting a single-precision denormal number to double precision does not normalize or wrap the denormal, so it is still an invalid input to the multiplier. Independent multiplier operations are summarized in Tables 27 through 29.

Table 27. Independent Multiplier Operations
(I10 = 0, I6 = 1)

\({ }^{\dagger}\) See also Tables 13 and 14. Operations involving absolute values, negated results or wrapped numbers are valid only when floating point format is selected (I9 = 0).
\({ }^{\ddagger}\) For square root operations, I7 must be equal to I8.

Table 28. Independent Multiply Operations Selected by I4-I2 (I10 = 0, I6 = 1, I5 = 0)
\begin{tabular}{|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{ABSOLUTE VALUE A I4} & \multirow[t]{2}{*}{ABSOLUTE VALUE B I3} & \multirow[t]{2}{*}{NEGATE RESULT I2} & \multicolumn{2}{|l|}{OPERATION SELECTED} \\
\hline & & & I4-I2 & RESULTS \({ }^{\ddagger}\) \\
\hline \(0=A\) & \(0=B\) & \(0=Y\) & 000 & \(A * B\) \\
\hline \(1=|A|\) & \(1=|B|\) & \(1=-Y\) & 001 & \(-(A * B)\) \\
\hline & & & 010 & \(A *|B|\) \\
\hline & & & 011 & \(-(A *|B|)\) \\
\hline & & & 100 & \(|A| * B\) \\
\hline & & & 101 & \(-(|A| * B)\) \\
\hline & & & 110 & \(|A| *|B|\) \\
\hline & & & 111 & \(-(|A| *|B|)\) \\
\hline
\end{tabular}
\({ }^{\ddagger}\) Operations involving absolute values or negated results are valid only when floating point format is selected (I9 = 0).

Table 29. Independent Divide/Square Root Operations Selected by I4-I2 (I10 = 0, I6 = 1, I5 = 1)
\begin{tabular}{|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{ABSOLUTE VALUE A I4} & \multirow[t]{2}{*}{DIVIDE/ SQRT I3} & \multirow[t]{2}{*}{NEGATE RESULT I2} & \multicolumn{2}{|l|}{OPERATION SELECTED} \\
\hline & & & I4-I2 & RESULTS \({ }^{\dagger}\) \\
\hline \(0=A\) & 0 = Divide & \(0=Y\) & 000 & \(A / B\) \\
\hline \(1=|A|\) & 1 = SQRT & \(1=-Y\) & 001 & \(-(A / B)\) \\
\hline & & & 010 & SQRT \(A\) \\
\hline & & & 011 & \(-\)(SQRT \(A\)) \\
\hline & & & 100 & \(|A| / B\) \\
\hline & & & 101 & \(-(|A| / B)\) \\
\hline & & & 110 & SQRT \(|A|\) \\
\hline & & & 111 & \(-\)(SQRT \(|A|\)) \\
\hline
\end{tabular}
\({ }^{\dagger}\) Operations involving absolute values or negated results are valid only when floating point format is selected (I9 = 0).
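Tables 28 and 29 share the same I4-I2 field, with I5 choosing between multiply and divide/square root. A minimal decoder sketch makes the pattern explicit; the function name and the string format of the result are illustrative choices, not part of the device documentation.

```python
def describe_multiplier_op(i5, i4_i2):
    """Describe the independent multiplier operation selected by I5 and I4-I2.

    i5 = 0 selects multiply (Table 28); i5 = 1 selects divide or square
    root (Table 29). i4_i2 is the 3-bit field with I4 as its top bit.
    """
    abs_a = (i4_i2 >> 2) & 1    # I4: use |A| instead of A
    i3 = (i4_i2 >> 1) & 1       # I3: |B| (multiply) or SQRT (divide/sqrt)
    negate = i4_i2 & 1          # I2: negate the result
    a = "|A|" if abs_a else "A"
    if i5 == 0:
        b = "|B|" if i3 else "B"
        expr = f"{a} * {b}"
    elif i3 == 0:
        expr = f"{a} / B"
    else:
        expr = f"SQRT {a}"
    return f"-({expr})" if negate else expr


if __name__ == "__main__":
    for i5 in (0, 1):
        for code in range(8):
            print(i5, format(code, "03b"), describe_multiplier_op(i5, code))
```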

\section*{Chained Multiplier／ALU Operations}

In chained mode, the 'ACT8847 performs simultaneous operations in the multiplier and the ALU. Operations not only include addition, subtraction, and multiplication, but also several optional operations which increase the flexibility of the device (see Table 30). Division and square root operations are not available in chained mode. Format conversions, absolute values, and wrapping or unwrapping of denormal numbers are also not available.

The B operand to the ALU can be set to zero so that the ALU passes the A operand unaltered. The B operand to the multiplier can be forced to the value 1 so that the A operand to the multiplier is passed unaltered.

Since in chained mode there are four operands but only two bits (I8 and I7) to select the operand precision, care must be taken with mixed-precision operations. The A input to the ALU and to the multiplier must be of the same precision, just as the B input to the ALU and to the multiplier must be of the same precision.

Table 30. Chained Multiplier/ALU Operations (I10 = 1)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{CHAINED OPERATION I10} & \multicolumn{3}{|l|}{OPERAND FORMAT/PRECISION} & \multirow[b]{2}{*}{OUTPUT SOURCE I6} & \multirow[b]{2}{*}{ADD ZERO I5} & \multirow[t]{2}{*}{MULTIPLY BY ONE I4} & \multirow[t]{2}{*}{NEGATE ALU RESULT I3\({ }^{\dagger}\)} & \multirow[t]{2}{*}{NEGATE MULTIPLIER RESULT I2\({ }^{\dagger}\)} & \multicolumn{2}{|l|}{ALU OPERATIONS} \\
\hline & I9 & I8 & I7 & & & & & & I1-I0 & RESULT \\
\hline \multirow[t]{5}{*}{\[
\begin{aligned}
& 1= \\
& \text { Chained }
\end{aligned}
\]} & 0 = & \(0=A(S P)\) & \(0=B(S P)\) & \multirow[t]{5}{*}{\begin{tabular}{l}
\(0=\) \\
ALU \\
result \\
\(1=\) \\
Multi- \\
plier \\
result
\end{tabular}} & \multirow[t]{5}{*}{\(0=\) Normal operation \(1=\) Forces B2 input of ALU to zero} & \multirow[t]{5}{*}{\begin{tabular}{l}
\[
0=
\] \\
Normal operation
\[
1=
\] \\
Forces B1 input of multiplier to one
\end{tabular}} & \multirow[t]{5}{*}{\begin{tabular}{l}
\(0=\) \\
Normal operation
\[
1=
\] \\
Negate ALU result
\end{tabular}} & \multirow[t]{5}{*}{\begin{tabular}{l}
\(0=\) \\
Normal operation
\[
1=
\] \\
Negate multiplier result
\end{tabular}} & 00 & \multirow[t]{5}{*}{\[
\begin{aligned}
& A+B \\
& A-B \\
& 2-A \\
& B-A
\end{aligned}
\]} \\
\hline & floating point & \(1=A(D P)\) & 1 = B(DP) & & & & & & 01
10 & \\
\hline & & & & & & & & & 11 & \\
\hline & \[
1=
\] & \[
0
\] & \[
0=S P 2 \text { 's }
\] & & & & & & & \\
\hline & & 0 & \begin{tabular}{l}
\[
1=S P
\] \\
unsigned integer
\end{tabular} & & & & & & & \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) Operations involving negated results are valid only when floating point format is selected (I9 = 0).
}
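The chained instruction fields of Table 30 can be sketched as an encoder for the floating point case. This is an illustrative sketch only: the function and parameter names are hypothetical, the word is assumed to be packed with I10 as the most significant bit, and integer chained operations are left out because I7 has a different meaning for them.

```python
# I1-I0 ALU operation codes in chained mode (Table 30).
CHAINED_ALU_OPS = {"A+B": 0b00, "A-B": 0b01, "2-A": 0b10, "B-A": 0b11}


def encode_chained_fp(alu_op, a_dp=False, b_dp=False,
                      output_multiplier=False, add_zero=False,
                      multiply_by_one=False, negate_alu=False,
                      negate_multiplier=False):
    """Build I10-I0 for a chained floating point multiplier/ALU operation."""
    word = CHAINED_ALU_OPS[alu_op]         # I1-I0: ALU operation
    word |= int(negate_multiplier) << 2    # I2: negate multiplier result
    word |= int(negate_alu) << 3           # I3: negate ALU result
    word |= int(multiply_by_one) << 4      # I4: force multiplier B input to 1
    word |= int(add_zero) << 5             # I5: force ALU B input to 0
    word |= int(output_multiplier) << 6    # I6: 0 = ALU, 1 = multiplier result
    word |= int(b_dp) << 7                 # I7: precision of both B inputs
    word |= int(a_dp) << 8                 # I8: precision of both A inputs
    # I9 = 0 selects floating point format.
    word |= 1 << 10                        # I10 = 1: chained operation
    return word


if __name__ == "__main__":
    word = encode_chained_fp("A+B", add_zero=True)
    print(format(word, "011b"))
```

Because a single I8 bit covers the A inputs of both units and a single I7 bit covers both B inputs, the encoder cannot express different precisions for the ALU and multiplier on the same side, which mirrors the restriction stated in the text.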

\section*{Sample Independent ALU Microinstructions}

The following independent ALU timing diagram examples show four register settings, ranging from fully flowthrough to fully pipelined. X = don't care.


OUT（ 31,0\()\) ，STATUS \((18,0)\)
NOTE：Assume PIPES2－0 \(=111\), CONFIG1－0 \(=01\), ENRA \(=X, E N R B=X, \operatorname{SELMS} / \overline{L S}=X, \overline{O E Y}=0\) ， \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\mathrm{HALT}=1, \mathrm{TP} 1-0=11\)

Figure 24．Single－Precision Independent ALU Operation，All Registers Disabled （PIPES2－PIPESO \(=111\) ，CLKMODE \(=\) X）
(PIPES2-PIPESO = 110, CLKMODE = X)


NOTE：Assume PIPES2－0 \(=010\), CONFIG1－ \(0=01\) ，ENRA \(=1, ~ E N R B=1\), SELMS \(/ \overline{L S}=X, \overline{O E Y}=0\) ， \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)

Figure 26．Single－Precision Independent ALU Operation，Input and Output Registers Enabled（PIPES2－PIPES0 \(=010\), CLKMODE \(=\) X）


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=01, \mathrm{ENRA}=1, \mathrm{ENRB}=1, \mathrm{SELMS} / \overline{\mathrm{LS}}=\mathrm{X}, \overline{\mathrm{OEY}}=0, \overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)
Figure 27. Single-Precision Independent ALU Operation, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=111\), CLKMODE \(=0\), CONFIG1-0 \(=11\), ENRA \(=\mathrm{X}\), ENRB \(=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 28. Double-Precision Independent ALU Operation, All Registers Disabled (PIPES2-PIPES0 \(=111\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=110, \mathrm{CLKMODE}=0\), CONFIG1-0 \(=00, \mathrm{ENRA}=1, \mathrm{ENRB}=1, \overline{\mathrm{OEY}}=0, \overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\operatorname{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)
Figure 29. Double-Precision Independent ALU Operation, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=010\), CLKMODE \(=1\), CONFIG1-0 \(=11\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)
Figure 30. Double-Precision Independent ALU Operation, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=000\), CLKMODE \(=0\), CONFIG1-0 \(=11\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)
Figure 31. Double-Precision Independent ALU Operation, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=0\))
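
The PIPES2-PIPES0 settings named in the figure captions of this section follow a regular pattern: a 0 on PIPES0 appears wherever the input registers are enabled, a 0 on PIPES1 wherever the pipeline registers are enabled, and a 0 on PIPES2 wherever the output registers are enabled. The Python sketch below decodes a setting under that reading of the captions. The function name and dictionary keys are illustrative assumptions, not part of the device documentation.

```python
def decode_pipes(pipes: str) -> dict:
    """Decode a PIPES2-PIPES0 string such as "010" (PIPES2 first).

    Assumption read off the figure captions: each control is active low,
    PIPES2 -> output registers, PIPES1 -> pipeline registers,
    PIPES0 -> input registers.
    """
    if len(pipes) != 3 or any(c not in "01" for c in pipes):
        raise ValueError("expected three binary digits, PIPES2 first")
    pipes2, pipes1, pipes0 = pipes
    return {
        "input": pipes0 == "0",
        "pipeline": pipes1 == "0",
        "output": pipes2 == "0",
    }


# The five settings that appear in the sample figures.
for setting in ("111", "110", "100", "010", "000"):
    enabled = [name for name, on in decode_pipes(setting).items() if on]
    print(setting, enabled or ["none (flowthrough)"])
```

Under this reading, 111 is fully flowthrough and 000 is fully pipelined, matching the "All Registers Disabled" and "All Registers Enabled" captions.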

\section*{Sample Independent Multiplier Microinstructions}

The following independent multiplier timing diagram examples show five register settings, ranging through fully pipelined. Examples for divide and square root are included in this section. \(\mathrm{X}=\) don't care.


NOTE: Assume PIPES2-0 \(=111\), CONFIG1-0 \(=01\), ENRA \(=\mathrm{X}\), ENRB \(=\mathrm{X}\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 32. Single-Precision Independent Multiplier Operation, All Registers Disabled (PIPES2-PIPES0 \(=111\), CLKMODE \(=\mathrm{X}\))


Figure 33. Single-Precision Independent Multiplier Operation, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 34. Single-Precision Independent Multiplier Operation, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=01, \mathrm{ENRA}=1, \mathrm{ENRB}=1, \mathrm{SELMS} / \overline{\mathrm{LS}}=X, \overline{\mathrm{OEY}}=0, \overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)
Figure 35. Single-Precision Independent Multiplier Operation, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=111\), CLKMODE \(=0\), CONFIG1-0 \(=11\), ENRA \(=\mathrm{X}\), ENRB \(=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 36. Double-Precision Independent Multiplier Operation, All Registers Disabled (PIPES2-PIPES0 \(=111\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=11\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 37. Double-Precision Independent Multiplier Operation, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=10\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 38. Double-Precision Independent Multiplier Operation, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=01, \mathrm{ENRA}=1, \mathrm{ENRB}=1, \overline{\mathrm{OEY}}=0, \overline{\mathrm{OEC}}=0, \overline{\mathrm{OES}}=0, \overline{\operatorname{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)
Figure 39. Double-Precision Independent Multiplier Operation, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 40. Single-Precision Floating Point Division (PIPES2-PIPES0 \(=110\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=100\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 41. Single-Precision Floating Point Division (PIPES2-PIPES0 \(=100\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 42. Single-Precision Floating Point Division (PIPES2-PIPES0 \(=010\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 43. Single-Precision Floating Point Division (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))


Figure 44. Double-Precision Floating Point Division (PIPES2-PIPES0 \(=110\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=100\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 45. Double-Precision Floating Point Division (PIPES2-PIPES0 \(=100\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 46. Double-Precision Floating Point Division (PIPES2-PIPES0 \(=010\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=00\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)
Figure 47. Double-Precision Floating-Point Division, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS \(/ \overline{L S}=X, \overline{O E Y}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\). The result appears in the SREG.

Figure 48. Integer Division, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=100\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS \(/ \overline{\mathrm{LS}}=X, \overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\operatorname{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\). The result appears in the SREG.

Figure 49. Integer Division, Input and Pipeline Registers Enabled (PIPES2-PIPES0 \(=100\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), \(\mathrm{ENRB}=1, \mathrm{SELMS} / \overline{\mathrm{LS}}=\mathrm{X}, \overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\). The result appears in the SREG.

Figure 50. Integer Division, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=\) X)


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\). The result appears in the SREG.

Figure 51. Integer Division, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=01\), ENRA \(=1, \mathrm{ENRB}=1, \mathrm{SELMS} / \overline{\mathrm{LS}}=\mathrm{X}, \overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)

Figure 52. Single-Precision Floating Point Square Root, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=\) X)


NOTE: Assume PIPES2-0 \(=100\), CONFIG1-0 \(=01\), ENRA \(=1\), ENRB \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 53. Single-Precision Floating Point Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 \(=100\), CLKMODE \(=\) X)


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), SELMS \(/ \overline{L S}=X, \overline{O E Y}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)

Figure 54. Single-Precision Floating Point Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=00\), ENRA \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 55. Single-Precision Floating Point Square Root, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=11\), ENRA \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 56. Double-Precision Floating Point Square Root, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=100\), CONFIG1-0 \(=01\), ENRA \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 57. Double-Precision Floating Point Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 \(=100\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=10\), ENRA \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 58. Double-Precision Floating Point Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=00\), ENRA \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 59. Double-Precision Floating Point Square Root, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=01\), ENRA \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\). The result appears in the SREG.

Figure 60. Integer Square Root, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=100\), CONFIG1-0 \(=00\), ENRA \(=1, \mathrm{SELMS} / \overline{\mathrm{LS}}=\mathrm{X}, \overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\). The result appears in the SREG.

Figure 61. Integer Square Root, Input and Pipeline Registers Enabled (PIPES2-PIPES0 \(=100\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\). The result appears in the SREG.

Figure 62. Integer Square Root, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=00\), ENRA \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\). The result appears in the SREG.

Figure 63. Integer Square Root, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))

\section*{Sample Chained Mode Microinstructions}

The following chained mode timing diagram examples show four register settings, ranging from fully flowthrough to fully pipelined.


NOTE: Assume PIPES2-0 \(=111\), CONFIG1-0 \(=01\), ENRA \(=\mathrm{X}\), ENRB \(=\mathrm{X}\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 64. Single-Precision Chained Mode Operation, All Registers Disabled (PIPES2-PIPES0 \(=111\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=110\), CONFIG1-0 \(=11\), ENRA \(=1\), ENRB \(=1\), SELMS \(/ \overline{\mathrm{LS}}=\mathrm{X}, \overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)

Figure 65. Single-Precision Chained Mode Operation, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=01\), ENRA \(=1\), SELMS/\(\overline{\mathrm{LS}}=\mathrm{X}\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 66. Single-Precision Chained Mode Operation, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=\mathrm{X}\))


Figure 67. Single-Precision Chained Mode Operation, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=\mathrm{X}\))


NOTE: Assume PIPES2-0 \(=111\), CONFIG1-0 \(=11\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 68. Double-Precision Chained Mode Operation, All Registers Disabled (PIPES2-PIPES0 \(=111\), CLKMODE \(=0\))


Figure 69. Double-Precision Chained Mode Operation, Input Registers Enabled (PIPES2-PIPES0 \(=110\), CLKMODE \(=1\))


NOTE: Assume PIPES2-0 \(=010\), CONFIG1-0 \(=10\), ENRA \(=1\), ENRB \(=1\), \(\overline{\mathrm{OEY}}=0\), \(\overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0\), \(\overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1\), TP1-0 \(=11\)

Figure 70. Double-Precision Chained Mode Operation, Input and Output Registers Enabled (PIPES2-PIPES0 \(=010\), CLKMODE \(=0\))


NOTE: Assume PIPES2-0 \(=000\), CONFIG1-0 \(=01\), ENRA \(=1, \mathrm{ENRB}=1, \overline{\mathrm{OEY}}=0, \overline{\mathrm{OEC}}=\overline{\mathrm{OES}}=0, \overline{\mathrm{RESET}}=\overline{\mathrm{HALT}}=1, \mathrm{TP} 1-0=11\)
Figure 71. Double-Precision Chained Mode Operation, All Registers Enabled (PIPES2-PIPES0 \(=000\), CLKMODE \(=0\))

\section*{Instruction Timing}

The following table details the number of clock cycles required to complete an operation in different pipelined modes. For more detail, see the sample microinstructions shown in the previous section.

Clock duration and output delay depend on the pipeline mode selected. See the note in the table and timing parameters listed at the beginning of this document.

Table 31. Number of Clocks Required to Complete an Operation
\begin{tabular}{|c|c|c|c|c|c|}
\hline OPERATION & PIPES2-0 \(=000\) \(\left(t_{\mathrm{pd}}\right)\) & PIPES2-0 \(=100\) \(\left(t_{\mathrm{pd3}}\right)\) & PIPES2-0 \(=110\) \(\left(t_{\mathrm{pd2}}\right)\) & PIPES2-0 \(=111\) \(\left(t_{\mathrm{pd1}}\right)\) & PIPES2-0 \(=010\) \(\left(t_{\mathrm{pd4}}\right)\) \\
\hline \multicolumn{6}{|l|}{Single-Precision Floating Point} \\
\hline ALU Operation or Multiply \({ }^{\ddagger}\) & 3 & 2 & 1 & 0 & 2 \\
\hline Divide & 8 & 7 & 7 & X & 8 \\
\hline Square Root & 11 & 10 & 10 & X & 11 \\
\hline \multicolumn{6}{|l|}{Double-Precision Floating Point} \\
\hline ALU Operation \({ }^{\dagger}\) & 4 & 3 & 2 & 1 & 3 \\
\hline Multiply \({ }^{\ddagger}\) & 5 & 4 & 3 & 2 & 4 \\
\hline Divide & 14 & 13 & 13 & X & 14 \\
\hline Square Root & 17 & 16 & 16 & X & 17 \\
\hline \multicolumn{6}{|l|}{Integer} \\
\hline ALU Operation or Multiply \({ }^{\ddagger}\) & 3 & 2 & 1 & 0 & 2 \\
\hline Divide & 16 & 15 & 15 & X & 16 \\
\hline Square Root & 20 & 19 & 19 & X & 20 \\
\hline
\end{tabular}

Y output and status valid following this \(t_{\text {pd }}\) delay after the designated number of clocks
\(\dagger\) Includes every conversion involving double-precision (DP \(\leftrightarrow\) SP or DP \(\leftrightarrow\) Integer)
\(\ddagger\) Includes all chained mode operations
\(X=\) invalid
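
As a convenience when budgeting microcode cycles, the clock counts of Table 31 can be held in a small lookup structure. The following Python sketch transcribes the table as printed here; the key names (`"SP"`, `"alu_or_multiply"`, and so on) and the function `clocks_required` are illustrative assumptions, not names used by the device.

```python
# Clock counts transcribed from Table 31.
# Inner keys are PIPES2-0 settings; None marks an entry shown as X (invalid).
CLOCKS = {
    ("SP", "alu_or_multiply"): {"000": 3, "100": 2, "110": 1, "111": 0, "010": 2},
    ("SP", "divide"): {"000": 8, "100": 7, "110": 7, "111": None, "010": 8},
    ("SP", "sqrt"): {"000": 11, "100": 10, "110": 10, "111": None, "010": 11},
    ("DP", "alu"): {"000": 4, "100": 3, "110": 2, "111": 1, "010": 3},
    ("DP", "multiply"): {"000": 5, "100": 4, "110": 3, "111": 2, "010": 4},
    ("DP", "divide"): {"000": 14, "100": 13, "110": 13, "111": None, "010": 14},
    ("DP", "sqrt"): {"000": 17, "100": 16, "110": 16, "111": None, "010": 17},
    ("INT", "alu_or_multiply"): {"000": 3, "100": 2, "110": 1, "111": 0, "010": 2},
    ("INT", "divide"): {"000": 16, "100": 15, "110": 15, "111": None, "010": 16},
    ("INT", "sqrt"): {"000": 20, "100": 19, "110": 19, "111": None, "010": 20},
}


def clocks_required(fmt: str, op: str, pipes: str) -> int:
    """Return the number of clocks for an operation, per Table 31."""
    count = CLOCKS[(fmt, op)][pipes]
    if count is None:
        raise ValueError(f"{fmt} {op} is invalid with PIPES2-0 = {pipes}")
    return count


print(clocks_required("DP", "multiply", "000"))  # fully pipelined DP multiply
print(clocks_required("SP", "divide", "110"))    # input registers only
```

The output and status become valid one propagation delay after the returned number of clocks, as stated in the table footnote; that delay is not modelled here.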

When using fast cycle times and double-precision operations, two cycles may be required to output and capture both halves of a double-precision result. To ensure the result remains valid for two cycles, a NOP instruction may need to be inserted between the operations. Table 32 shows the number of NOPs necessary to insert into the instruction stream for fully pipelined operation (PIPES2-PIPES0 \(=000\)).

Table 32. NOPs Inserted to Guarantee That Double-Precision Results Remain Valid for Two Clock Cycles (PIPES2-PIPES0 \(=000\))
\begin{tabular}{|c|c|c|c|}
\hline 1ST OPERATION & FOLLOWED BY 2ND OPERATION & \# NOPs INSERTED BETWEEN OPERATIONS & \# CYCLES RESULT IS VALID \\
\hline \multirow[t]{7}{*}{DP \(\rightarrow 32 \mathrm{BIT}\)} & DP \(\rightarrow 32\) BIT & 0 & 2 \\
\hline & \(32 \mathrm{BIT} \rightarrow\) DP & 0 & 2 \\
\hline & 32 BIT OP & 0 & 1 \\
\hline & DP ALU & 0 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 0 & 2 \\
\hline & DP Divide & 0 & 2 \\
\hline \multirow[t]{7}{*}{\(32 \mathrm{BIT} \rightarrow\) DP} & DP \(\rightarrow 32 \mathrm{BIT}\) & 0 & 2 \\
\hline & \(32 \mathrm{BIT} \rightarrow\) DP & 0 & 2 \\
\hline & 32 BIT OP & 1 & 2 \\
\hline & DP ALU & 0 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 0 & 2 \\
\hline & DP Divide & 0 & 2 \\
\hline \multirow[t]{7}{*}{32 BIT OP} & DP \(\rightarrow 32 \mathrm{BIT}\) & 0 & 2 \\
\hline & \(32 \mathrm{BIT} \rightarrow\) DP & 0 & 2 \\
\hline & 32 BIT OP & 0 & 1 \\
\hline & DP ALU & 0 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 0 & 2 \\
\hline & DP Divide & 0 & 2 \\
\hline \multirow[t]{7}{*}{DP ALU} & DP \(\rightarrow 32\) BIT & 0 & 2 \\
\hline & \(32 \mathrm{BIT} \rightarrow\) DP & 0 & 2 \\
\hline & 32 BIT OP & 1 & 2 \\
\hline & DP ALU & 0 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 0 & 2 \\
\hline & DP Divide & 0 & 2 \\
\hline \multirow[t]{7}{*}{DP Multiply} & DP \(\rightarrow 32\) BIT & 1 & 2 \\
\hline & \(32 \mathrm{BIT} \rightarrow\) DP & 1 & 2 \\
\hline & 32 BIT OP & \(2^{\dagger}\) & 2 \\
\hline & DP ALU & 1 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 1 & 2 \\
\hline & DP Divide & 1 & 2 \\
\hline
\end{tabular}

NOTE: 32-bit operation refers to a single-precision floating point or integer ALU operation or multiply, except conversion to or from double-precision. This assumes the instruction following a double-precision divide may begin loading on the 12th clock cycle, and the instruction following a double-precision square root on the 15th cycle.
\({ }^{\dagger}\) The device will not load a single-precision operation on the first clock edge following this operation, so any single-precision instruction may be used. A NOP is recommended. The second instruction must be a NOP.

Table 32. NOPs Inserted to Guarantee That Double-Precision Results Remain Valid for Two Clock Cycles (PIPES2-PIPES0 \(=000\)) (Continued)
\begin{tabular}{|c|c|c|c|}
\hline 1ST OPERATION & FOLLOWED BY 2ND OPERATION & \# NOPs INSERTED BETWEEN OPERATIONS & \# CYCLES RESULT IS VALID \\
\hline \multirow[t]{7}{*}{DP Sqrt} & DP \(\rightarrow 32\) BIT & 1 & 2 \\
\hline & 32 BIT \(\rightarrow\) DP & 1 & 2 \\
\hline & 32 BIT OP & \(2^{\dagger}\) & 2 \\
\hline & DP ALU & 1 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 0 & 2 \\
\hline & DP Divide & 0 & 2 \\
\hline \multirow[t]{7}{*}{DP Divide} & DP \(\rightarrow 32\) BIT & 1 & 2 \\
\hline & 32 BIT \(\rightarrow\) DP & 1 & 2 \\
\hline & 32 BIT OP & \(2^{\dagger}\) & 2 \\
\hline & DP ALU & 1 & 2 \\
\hline & DP Multiply & 0 & 2 \\
\hline & DP Sqrt & 0 & 2 \\
\hline & DP Divide & 0 & 2 \\
\hline
\end{tabular}

NOTE: 32-bit operation refers to a single-precision floating point or integer ALU operation or multiply, except conversion to or from double-precision. This assumes the instruction following a double-precision divide may begin loading on the 12th clock cycle, and the instruction following a double-precision square root on the 15th cycle.
\({ }^{\dagger}\) The device will not load a single-precision operation on the first clock edge following this operation, so any single-precision instruction may be used. A NOP is recommended. The second instruction must be a NOP.
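
A microcode assembler could apply Table 32 mechanically when scheduling fully pipelined code. The Python sketch below is one hedged way to do so. The operation labels, the `NOPS` dictionary, and both function names are illustrative assumptions. Only the nonzero entries of the table are stored, and the DP Divide rows are taken from the second seven-row group of the continued table, on the assumption that this group is the DP Divide block.

```python
OPS = ("DP->32", "32->DP", "32OP", "DP_ALU", "DP_MUL", "DP_SQRT", "DP_DIV")

# (first operation, second operation) -> NOPs to insert between them.
# Pairs not listed need 0 NOPs.
NOPS = {
    ("32->DP", "32OP"): 1,
    ("DP_ALU", "32OP"): 1,
    ("DP_MUL", "DP->32"): 1,
    ("DP_MUL", "32->DP"): 1,
    ("DP_MUL", "32OP"): 2,
    ("DP_MUL", "DP_ALU"): 1,
    ("DP_MUL", "DP_SQRT"): 1,
    ("DP_MUL", "DP_DIV"): 1,
    ("DP_SQRT", "DP->32"): 1,
    ("DP_SQRT", "32->DP"): 1,
    ("DP_SQRT", "32OP"): 2,
    ("DP_SQRT", "DP_ALU"): 1,
    # Assumed to be the DP Divide rows (see the lead-in).
    ("DP_DIV", "DP->32"): 1,
    ("DP_DIV", "32->DP"): 1,
    ("DP_DIV", "32OP"): 2,
    ("DP_DIV", "DP_ALU"): 1,
}


def nops_between(first: str, second: str) -> int:
    """Number of NOPs Table 32 calls for between two operations."""
    if first not in OPS or second not in OPS:
        raise ValueError("unknown operation label")
    return NOPS.get((first, second), 0)


def pad_with_nops(stream: list) -> list:
    """Return a copy of the instruction stream with NOPs inserted."""
    padded = []
    for i, op in enumerate(stream):
        padded.append(op)
        if i + 1 < len(stream):
            padded.extend(["NOP"] * nops_between(op, stream[i + 1]))
    return padded


print(pad_with_nops(["DP_MUL", "32OP", "DP_ALU"]))
```

In the two-NOP cases the footnote allows the first slot to hold any single-precision instruction, since the device will not load it; this sketch simply emits NOPs in both slots, which is the recommended choice.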

\section*{Exception and Status Handling}

Exception and status flags for the 'ACT8847 were listed previously in Tables 14 and 15.
Output exception signals are provided to indicate both the source and type of the exception. DENORM, INEX, OVER, UNDER, and RNDCO indicate the exception type, and CHEX and SRCEX indicate the source of an exception. SRCEX indicates the source of a result as selected by instruction bit I6, and SRCEX is active whenever a result is output, not only when an exception is being signalled. The chained-mode exception signal CHEX indicates that an exception has been generated by the source not selected for output by I6. The exception type signalled by CHEX cannot be read unless status select controls SELST1-SELST0 are used to force status output from the deselected source.

Output exceptions may be due either to a result in an illegal format or to a procedural error. Results too large or too small to be represented in the selected precision are signalled by OVER and UNDER. When INF is high, the output is the IEEE representation of infinity. Any ALU output which has been increased in magnitude by rounding causes INEX to be set high. DENORM is set when the multiplier output is wrapped or the ALU output is denormalized. DENORM is also set high when an illegal operation on an integer is performed. Wrapped outputs from the multiplier may be inexact or increased in magnitude by rounding, which may cause the INEX and RNDCO status signals to be set high. A denormal output from the ALU (DENORM = 1) may also cause INEX to be set, in which case UNDER is also signalled.

Ordinarily, SELST1-SELST0 are set high so that status selection defaults to the output source selected by instruction input I6. The ALU is selected as the output source when I6 is low, and the multiplier when I6 is high.

When the device operates in chained mode, it may be necessary to read the status results not associated with the output source. As shown in Table 16, SELST1-SELST0 can be used to read the status of either the ALU or the multiplier regardless of the I6 setting.

Status results are registered only when the output ( \(P\) and \(S\) ) registers are enabled (PIPES2 \(=0\) ). Otherwise, the status register is transparent. In either case, to read the status outputs, the output enables ( \(\overline{\mathrm{OES}}, \overline{\mathrm{OEC}}\), or both) must be low.

Status flags are provided to signal both floating point and integer results. Integer status is provided using AEQB for zero, NEG for sign, and OVER for overflow/carryout.

Several status exceptions are generated by illegal data or instruction inputs to the FPU. Input exceptions may cause the following signals to be set high: IVAL, DIVBY0, DENIN, and STEX1-STEX0. If the IVAL flag is set, either an invalid operation, such as the square root of \(-|X|\), has been requested or a NaN (Not a Number) has been input. When DENIN is set, a denormalized number has been input to the multiplier. DIVBY0 is set when the divisor is zero. STEX1-STEX0 indicate which port (RA, RB, or both) is the source of the exception when either a denormal is input to the multiplier (DENIN \(=1\)) or a NaN (IVAL \(=1\)) is input to the multiplier or the ALU.

NaN inputs are all treated as IEEE signalling NaNs, causing the IVAL flag to be set. When output from the FPU, the fraction field from a NaN is set high (all 1s) and the sign bit is 0, regardless of the original fraction and sign fields of the input NaN.
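
To make the output NaN format concrete, the Python sketch below builds the bit pattern described above: sign bit 0 and fraction all 1s, combined with the all-1s exponent that the IEEE formats use for NaNs. The field widths (8/23 bits for single precision, 11/52 for double) come from the IEEE 754 formats rather than from this section, and the function name is an illustrative assumption.

```python
import math
import struct


def output_nan_bits(double_precision: bool) -> int:
    """Bit pattern of a NaN with sign 0, exponent all 1s, fraction all 1s."""
    exp_bits, frac_bits = (11, 52) if double_precision else (8, 23)
    exponent = (1 << exp_bits) - 1   # all-1s exponent marks NaN/infinity
    fraction = (1 << frac_bits) - 1  # all-1s fraction, as described above
    return (exponent << frac_bits) | fraction  # sign bit left at 0


sp = output_nan_bits(False)
dp = output_nan_bits(True)
print(hex(sp), hex(dp))

# Reinterpret the integers as floats to confirm the host also sees NaNs.
assert math.isnan(struct.unpack(">f", struct.pack(">I", sp))[0])
assert math.isnan(struct.unpack(">d", struct.pack(">Q", dp))[0])
```

The patterns work out to 0x7FFFFFFF for single precision and 0x7FFFFFFFFFFFFFFF for double precision.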

When the 'ACT8847 outputs a NaN, it is always in the form of a signalling NaN along with the IVAL (Invalid) and appropriate STEX flag set high (except for the MOVE A instruction, which passes any operand as is without setting exception flags).

Certain operations involving floating point zeros and infinities are invalid, causing the 'ACT8847 to set the IVAL flag and output a NaN. Operations involving zero and infinity are detailed below.

A floating point zero is represented by an all zero exponent and fraction field. The sign bit may be 0 or 1, to represent +0 or -0 respectively.

Zero divided by zero is an invalid operation. The result is a NaN with the IVAL and DIVBY0 flags set. Any other number divided by zero results in the appropriately signed infinity with the DIVBY0 flag set.

For operations with floating point zeros: \(\pm 0\) multiplied by any number is the appropriately signed 0.
\[
\begin{aligned}
& +0+(-0)=+0 \\
& +0+(+0)=+0 \\
& -0+(-0)=-0 \\
& -0+(+0)=+0 \\
& +0-(-0)=+0 \\
& +0-(+0)=+0 \\
& -0-(-0)=+0 \\
& -0-(+0)=-0
\end{aligned}
\]
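
The signed-zero sums and differences listed above agree with what IEEE 754 arithmetic in the default round-to-nearest mode produces, so they can be checked on a host machine. The Python sketch below does this with ordinary floats; it exercises the host's arithmetic, not the 'ACT8847 itself, and the helper name `zero_sign` is an illustrative assumption.

```python
import math


def zero_sign(x: float) -> str:
    """Return "+0" or "-0" for a floating point zero."""
    assert x == 0.0
    return "-0" if math.copysign(1.0, x) < 0 else "+0"


pz, nz = 0.0, -0.0  # positive and negative zero

# (expression text, computed value, result listed in the manual)
cases = [
    ("+0 + (-0)", pz + nz, "+0"),
    ("+0 + (+0)", pz + pz, "+0"),
    ("-0 + (-0)", nz + nz, "-0"),
    ("-0 + (+0)", nz + pz, "+0"),
    ("+0 - (-0)", pz - nz, "+0"),
    ("+0 - (+0)", pz - pz, "+0"),
    ("-0 - (-0)", nz - nz, "+0"),
    ("-0 - (+0)", nz - pz, "-0"),
]

for text, value, listed in cases:
    print(f"{text} = {zero_sign(value)} (manual: {listed})")
    assert zero_sign(value) == listed
```

The result is -0 only when both effective addends are -0; every other combination gives +0.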

Floating point infinity is represented by an all 1 exponent field with an all 0 fraction field. The sign bit determines positive or negative infinity (0 or 1 respectively).

Infinity divided by infinity is an invalid operation, setting the IVAL flag and resulting in a NaN output. Division of infinity by any other number results in the appropriately signed infinity. Division of any number (except infinity or zero) by infinity results in an appropriately signed zero. Infinity divided by zero results in the appropriately signed infinity with the DIVBY0 flag set.

For invalid operations with infinity listed below, the output is a signalling NaN with the IVAL flag set.
\(\pm\) infinity multiplied by \(\pm 0\)
\(\pm\) infinity divided by \(\pm\) infinity
\(+\) infinity \(+(-\) infinity\()\)
\(-\) infinity \(+(+\) infinity\()\)
\(+\) infinity \(-(+\) infinity\()\)
\(-\) infinity \(-(-\) infinity\()\)
Any other number added to or multiplied by infinity results in the appropriately signed infinity as output.
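
The division rules for zeros, infinities, and NaNs given in this section can be collected into one decision procedure. The Python sketch below is a software model of those prose rules only. The function name and the use of a Python set for the flags are illustrative assumptions, "appropriately signed" is taken to mean the exclusive-OR of the operand signs, and infinity divided by zero follows the statement that it yields a signed infinity with the divide-by-zero flag set.

```python
import math


def divide_special(a: float, b: float):
    """Model a / b with the special-case rules described in the text.

    Returns (result, flags), where flags is a set of flag names raised.
    """
    flags = set()
    if math.isnan(a) or math.isnan(b):
        flags.add("IVAL")            # NaN inputs set IVAL and produce a NaN
        return math.nan, flags
    # Assumed meaning of "appropriately signed": XOR of the operand signs.
    negative = math.copysign(1.0, a) * math.copysign(1.0, b) < 0
    signed_inf = -math.inf if negative else math.inf
    if b == 0.0:
        flags.add("DIVBY0")
        if a == 0.0:
            flags.add("IVAL")        # 0 / 0 is invalid
            return math.nan, flags
        return signed_inf, flags     # x / 0, including infinity / 0
    if math.isinf(a) and math.isinf(b):
        flags.add("IVAL")            # infinity / infinity is invalid
        return math.nan, flags
    if math.isinf(a):
        return signed_inf, flags     # infinity / finite
    if math.isinf(b):
        return (-0.0 if negative else 0.0), flags  # finite / infinity
    # Ordinary case: device rounding and range flags are not modelled here.
    return a / b, flags


print(divide_special(0.0, 0.0))
print(divide_special(1.0, -0.0))
print(divide_special(math.inf, math.inf))
print(divide_special(3.0, -math.inf))
```

The flag named DIVBY0 in the code corresponds to the divide-by-zero status output discussed above.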

\section*{'ACT8847 Reference Guide}

\section*{Instruction Inputs}

Operations are summarized in Tables 33 thru 41.
Table 33. Independent ALU Operations, Single Floating Point Operand
\begin{tabular}{|l|l|l|}
\hline ALU OPERATION ON A OPERAND & INSTRUCTION INPUTS I10-I0 & NOTES \\
\hline Pass A operand & 00x x01x 0000 & \multirow[t]{13}{*}{\begin{tabular}{l}
x = Don't care \\
I8 selects precision of A operand: \\
0 = A (SP) \\
1 = A (DP) \\
I7 selects precision of B operand and must equal I8. \\
I4 selects absolute value of A operand: \\
0 = A \\
1 = \(|A|\) \\
During integer to floating point conversion, \(|A|\) is not allowed as a result.
\end{tabular}} \\
\cline{1-2} Pass \(-\)A operand & 00x x01x 0001 & \\
\cline{1-2} Convert from 2's complement integer to floating point \({ }^{\dagger}\) & 00x x01x 0010 & \\
\cline{1-2} Convert from floating point to 2's complement integer \({ }^{\dagger}\) & 00x x01x 0011 & \\
\cline{1-2} Move A operand (pass without NaN detect or status flags active) & 00x x01x 0100 & \\
\cline{1-2} Pass B operand & 00x x01x 0101 & \\
\cline{1-2} Convert from floating point to floating point (adjusts precision of input: SP \(\rightarrow\) DP, DP \(\rightarrow\) SP) \({ }^{\ddagger}\) & 00x x01x 0110 & \\
\cline{1-2} Floating point to unsigned integer conversion \({ }^{\dagger}\) & 00x x01x 0111 & \\
\cline{1-2} Wrap denormal operand & 00x x01x 1000 & \\
\cline{1-2} Unsigned integer to floating point conversion \({ }^{\dagger}\) & 00x x01x 1010 & \\
\cline{1-2} Unwrap exact number & 00x x01x 1100 & \\
\cline{1-2} Unwrap inexact number & 00x x01x 1101 & \\
\cline{1-2} Unwrap rounded input & 00x x01x 1110 & \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) During this operation, I8 selects the precision of the result. If the conversion involves double-precision, the operation requires 2 cycles to load.
\({ }^{\ddagger}\) Requires 2 cycles to load the operation, even if input is SP.
}

Table 34. Independent ALU Operations, Two Floating Point Operands
\begin{tabular}{|l|l|l|}
\hline ALU OPERATIONS AND OPERANDS & INSTRUCTION INPUTS I10-I0 & NOTES \\
\hline Add \(A+B\) & 00x x000 0x00 & \\
Add \(|A|+B\) & 00x x001 0x00 & x = Don't care \\
Add \(A+|B|\) & 00x x000 1x00 & I8 selects precision of A \\
Add \(|A|+|B|\) & 00x x001 1x00 & operand: \\
Subtract \(A-B\) & 00x x000 0x01 & 0 = A (SP) \\
Subtract \(|A|-B\) & 00x x001 0x01 & 1 = A (DP) \\
Subtract \(A-|B|\) & 00x x000 1x01 & I7 selects precision of B \\
Subtract \(|A|-|B|\) & 00x x001 1x01 & operand: \\
Compare \(A, B\) & 00x x000 0x10 & 0 = B (SP) \\
Compare \(|A|, B\) & 00x x001 0x10 & 1 = B (DP) \\
Compare \(A,|B|\) & 00x x000 1x10 & I2 selects either Y or its \\
Compare \(|A|,|B|\) & 00x x001 1x10 & absolute value: \\
Subtract \(B-A\) & 00x x000 0x11 & 0 = Y \\
Subtract \(B-|A|\) & 00x x001 0x11 & 1 = \(|Y|\) \\
Subtract \(|B|-A\) & 00x x000 1x11 & \\
Subtract \(|B|-|A|\) & 00x x001 1x11 & \\
\hline
\end{tabular}
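
The bit patterns in Table 34 can be generated from a few fields. Comparing the rows suggests that I1-I0 select the operation, I3 selects \(|B|\), and I4 selects \(|A|\), while the notes state that I2 selects \(|Y|\), I7 the precision of B, and I8 the precision of A. The Python sketch below assembles an 11-bit instruction word on that reading. The function name, argument names, and the `OPCODES` labels are illustrative assumptions, and the I3/I4 roles are inferred from the table rather than stated in it.

```python
# I1-I0 operation codes as they appear in the last two bits of each row.
OPCODES = {"add": 0b00, "sub": 0b01, "compare": 0b10, "rsub": 0b11}  # rsub = B - A


def encode_fp_alu2(op, abs_a=False, abs_b=False, abs_y=False,
                   a_dp=False, b_dp=False) -> int:
    """Build I10-I0 for a two-operand floating point ALU operation."""
    word = OPCODES[op]           # I1-I0: operation
    word |= int(abs_y) << 2      # I2: output Y (0) or |Y| (1)
    word |= int(abs_b) << 3      # I3: use |B| (inferred from the table rows)
    word |= int(abs_a) << 4      # I4: use |A| (inferred from the table rows)
    # I6-I5 and I10-I9 stay 0 for every row of Table 34.
    word |= int(b_dp) << 7       # I7: precision of B, 1 = double
    word |= int(a_dp) << 8       # I8: precision of A, 1 = double
    return word


# Subtract |A| - B, single precision: row pattern 00x x001 0x01 with x = 0.
print(format(encode_fp_alu2("sub", abs_a=True), "011b"))
# Add A + B, both operands double precision.
print(format(encode_fp_alu2("add", a_dp=True, b_dp=True), "011b"))
```

The first call prints 00000010001 and the second 00110000000, which match the corresponding table rows once the don't-care positions are filled in.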

Table 35. Independent ALU Operations, One Integer Operand
\begin{tabular}{|c|c|c|}
\hline ALU OPERATION ON A OPERAND & INSTRUCTION INPUTS I10-I0 & NOTES \\
\hline Pass A operand & \(010 \times 100000\) & x = Don't care \\
\hline Pass \(-\)A operand (2's complement) \({ }^{\ddagger}\) & \(010 \times 100001\) & I7 selects format of A or B \\
\hline Negate A operand (1's complement) & \(010 \times 100010\) & integer operand: \\
\hline Pass B operand & \(010 \times 100101\) & 0 = Single-precision 2's complement \\
\hline Shift left logical \({ }^{\dagger}\) & \(010 \times 101000\) & 1 = Single-precision unsigned \\
\hline Shift right logical \({ }^{\dagger}\) & \(010 \times 101001\) & integer \\
\hline Shift right arithmetic \({ }^{\dagger}\) & \(010 \times 101101\) & I8 must equal I7 \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger} \mathrm{B}\) operand is number of bit positions A is to be shifted and must be input on the same cycle as the instruction．
\(\ddagger\) Pass（ \(-A\) ）of unsigned integer takes 1 ＇s complement．
}

Table 36. Independent ALU Operations, Two Integer Operands
\begin{tabular}{|l|l|l|}
\hline \begin{tabular}{l} 
ALU OPERATIONS \\
AND OPERANDS
\end{tabular} & \begin{tabular}{l} 
INSTRUCTION \\
INPUTS I10-I0
\end{tabular} & \multicolumn{1}{c|}{ NOTES } \\
\hline Add A + B & \(010 \times 0000000\) & \\
Subtract A - B & \(010 \times 0000001\) & \(x=\) Don't Care \\
Compare A, B & \(010 \times 0000010\) & I7 selects format of A and B \\
Subtract B - A & \(010 \times 0000011\) & operands: \\
Logical AND A, B & \(010 \times 0001000\) & \(0=\) Single-precision 2's \\
Logical AND A, NOT B & \(010 \times 0001001\) & complement \\
Logical AND NOT A, B & \(010 \times 0001010\) & \(1=\) Single-precision unsigned \\
Logical OR A, B & \(010 \times 0001100\) & integer \\
Logical XOR A, B & \(010 \times 0001101\) & \\
\hline
\end{tabular}

Table 37. Independent Floating Point Multiply Operations
\begin{tabular}{|l|l|l|}
\hline MULTIPLIER OPERATION AND OPERANDS & INSTRUCTION INPUTS I10-I0 & NOTES \\
\hline Multiply \(A * B\) & 00x x100 00xx & x = Don't Care \\
Multiply \(-(A * B)\) & 00x x100 01xx & I8 selects A operand precision (0 = SP, 1 = DP) \\
Multiply \(A *|B|\) & 00x x100 10xx & I7 selects B operand precision (0 = SP, 1 = DP) \\
Multiply \(-(A *|B|)\) & 00x x100 11xx & I1 selects A operand format (0 = Normal, 1 = Wrapped) \\
Multiply \(|A| * B\) & 00x x101 00xx & I0 selects B operand format (0 = Normal, 1 = Wrapped) \\
Multiply \(-(|A| * B)\) & 00x x101 01xx & \\
Multiply \(|A| *|B|\) & 00x x101 10xx & \\
Multiply \(-(|A| *|B|)\) & 00x x101 11xx & \\
\hline
\end{tabular}

Table 38. Independent Floating Point Divide/Square Root Operations
\begin{tabular}{|l|l|l|}
\hline MULTIPLIER OPERATION AND OPERANDS \({ }^{\dagger}\) & INSTRUCTION INPUTS I10-I0 & NOTES \\
\hline Divide \(A / B\) & 00x x110 0xxx & x = Don't Care \\
SQRT \(A\) & 00x x110 1xxx & I8 selects A operand precision and I7 selects B operand precision (0 = SP, 1 = DP) \\
Divide \(|A| / B\) & 00x x111 0xxx & I2 negates multiplier result (0 = Normal, 1 = Negated) \\
SQRT \(|A|\) & 00x x111 1xxx & I1 selects A operand format and I0 selects B operand format (0 = Normal, 1 = Wrapped) \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) I7 should be equal to I8 for square root operations.
}

Table 39. Independent Integer Multiply/Divide/Square Root Operations
\begin{tabular}{|l|l|l|}
\hline MULTIPLIER OPERATION AND OPERANDS & INSTRUCTION INPUTS I10-I0 & NOTES \\
\hline Multiply A * B & 010 x100 0000 & x = Don't Care \\
Divide A / B & 010 x110 0000 & I7 selects operand format: \\
SQRT A & & 0 = SP 2's complement \\
\hline
\end{tabular}
\({ }^{\dagger}\) Operations involving absolute values, wrapped operands, or negated results are valid only when floating point format is selected (I9 = 0).

Table 40. Chained Multiplier/ALU Floating Point Operations \({ }^{\ddagger}\)
\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{2}{|l|}{CHAINED OPERATIONS} & \multirow[t]{2}{*}{\begin{tabular}{l}
OUTPUT \\
SOURCE
\end{tabular}} & \multirow[t]{2}{*}{INSTRUCTION INPUTS I10-I0} & \multirow[b]{2}{*}{NOTES} \\
\hline MULTIPLIER & ALU & & & \\
\hline A * B & \(A+B\) & ALU & 10x x000 xx00 & \\
\hline A * B & \(A+B\) & Multiplier & 10x x100 xx00 & \\
\hline A * B & \(A-B\) & ALU & 10x x000 xx01 & \\
\hline A * B & \(A-B\) & Multiplier & 10x x100 xx01 & \\
\hline A * B & \(2-A\) & ALU & 10x x000 xx10 & x = Don't Care \\
\hline A * B & \(2-A\) & Multiplier & 10x x100 xx10 & I8 selects precision of \\
\hline A * B & \(B-A\) & ALU & 10x x000 xx11 & RA inputs: \\
\hline A * B & \(B-A\) & Multiplier & 10x x100 xx11 & 0 = RA (SP) \\
\hline A * B & \(A+0\) & ALU & 10x x010 xx00 & 1 = RA (DP) \\
\hline A * B & \(A+0\) & Multiplier & 10x x110 xx00 & I7 selects precision of \\
\hline A * B & \(0-A\) & ALU & 10x x010 xx11 & RB inputs: \\
\hline A * B & \(0-A\) & Multiplier & 10x x110 xx11 & 0 = RB (SP) \\
\hline A * 1 & \(A+B\) & ALU & 10x x001 xx00 & 1 = RB (DP) \\
\hline A * 1 & \(A+B\) & Multiplier & 10x x101 xx00 & I3 negates ALU result: \\
\hline A * 1 & \(A-B\) & ALU & 10x x001 xx01 & 0 = Normal \\
\hline A * 1 & \(A-B\) & Multiplier & 10x x101 xx01 & 1 = Negated \\
\hline A * 1 & \(2-A\) & ALU & 10x x001 xx10 & I2 negates multiplier \\
\hline A * 1 & \(2-A\) & Multiplier & 10x x101 xx10 & result: \\
\hline A * 1 & \(B-A\) & ALU & 10x x001 xx11 & 0 = Normal \\
\hline A * 1 & \(B-A\) & Multiplier & 10x x101 xx11 & 1 = Negated \\
\hline A * 1 & \(A+0\) & ALU & 10x x011 xx00 & \\
\hline A * 1 & \(A+0\) & Multiplier & 10x x111 xx00 & \\
\hline A * 1 & \(0-A\) & ALU & 10x x011 xx11 & \\
\hline A * 1 & \(0-A\) & Multiplier & 10x x111 xx11 & \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\ddagger}\) The I10-I0 setting 1xx xx1x xx10 is invalid, since it attempts to force the B operand of the ALU to both 0 and 2 simultaneously.
}

Table 41. Chained Multiplier/ALU Integer Operations
\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{2}{|l|}{CHAINED OPERATIONS} & \multirow[t]{2}{*}{\begin{tabular}{l}
OUTPUT \\
SOURCE
\end{tabular}} & \multirow[t]{2}{*}{INSTRUCTION INPUTS I10-I0} & \multirow[b]{2}{*}{NOTES} \\
\hline MULTIPLIER & ALU & & & \\
\hline A * B & \(A+B\) & ALU & \(110 \times 0000000\) & \\
\hline A * B & \(A+B\) & Multiplier & \(110 \times 1000000\) & \\
\hline A * B & \(A-B\) & ALU & \(110 \times 0000001\) & \\
\hline A * B & A-B & Multiplier & \(110 \times 1000001\) & \\
\hline A * B & \(2-A\) & ALU & \(110 \times 0000010\) & \\
\hline A * B & \(2-A\) & Multiplier & \(110 \times 1000010\) & \\
\hline A * B & B - A & ALU & \(110 \times 0000011\) & \\
\hline A * B & \(B-A\) & Multiplier & \(110 \times 1000011\) & \\
\hline A * B & \(A+0\) & ALU & \(110 \times 0100000\) & \(\mathrm{x}=\) Don't Care \\
\hline A * B & \(A+0\) & Multiplier & \(110 \times 1100000\) & I7 selects format of A and B operands: \\
\hline A * B & \(0-A\) & ALU & \(110 \times 0100011\) & 0 = SP 2's \\
\hline A * B & \(0-A\) & Multiplier & \(110 \times 1100011\) & complement \\
\hline A * 1 & \(A+B\) & ALU & \(110 \times 0010000\) & 1 = SP unsigned \\
\hline A * 1 & \(A+B\) & Multiplier & \(110 \times 1010000\) & integer \\
\hline A * 1 & A - B & ALU & \(110 \times 0010001\) & \\
\hline A * 1 & A-B & Multiplier & \(110 \times 1010001\) & \\
\hline A * 1 & \(2-\mathrm{A}\) & ALU & \(110 \times 0010010\) & \\
\hline A * 1 & \(2-A\) & Multiplier & \(110 \times 1010010\) & \\
\hline A * 1 & B - A & ALU & \(110 \times 0010011\) & \\
\hline A * 1 & B - A & Multiplier & \(110 \times 1010011\) & \\
\hline A * 1 & \(A+0\) & ALU & \(110 \times 0110000\) & \\
\hline A * 1 & \(A+0\) & Multiplier & \(110 \times 1110000\) & \\
\hline A * 1 & \(0-A\) & ALU & \(110 \times 0110011\) & \\
\hline A * 1 & \(0-\mathrm{A}\) & Multiplier & \(110 \times 111\) xx11 & \\
\hline
\end{tabular}

\section*{Input Configuration}

CONFIG1-CONFIG0 control the order in which double-precision operands are loaded, as shown in Table 42.

Table 42. Double-Precision Input Data Configuration Modes
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multirow[b]{3}{*}{CONFIG1} & \multirow[b]{3}{*}{CONFIG0} & \multicolumn{4}{|c|}{LOADING SEQUENCE} \\
\hline & & \multicolumn{2}{|l|}{DATA LOADED INTO TEMP REGISTER ON FIRST CLOCK AND RA/RB REGISTERS ON SECOND CLOCK \({ }^{\dagger}\)} & \multicolumn{2}{|l|}{DATA LOADED INTO RA/RB REGISTERS ON SECOND CLOCK} \\
\hline & & DA & DB & DA & DB \\
\hline 0 & 0 & B operand (MSH) & B operand (LSH) & A operand (MSH) & A operand (LSH) \\
\hline 0 & \(1^{\ddagger}\) & A operand (LSH) & B operand (LSH) & A operand (MSH) & B operand (MSH) \\
\hline 1 & 0 & A operand (MSH) & B operand (MSH) & A operand (LSH) & B operand (LSH) \\
\hline 1 & 1 & A operand (MSH) & A operand (LSH) & B operand (MSH) & B operand (LSH) \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) On the first active clock edge (see CLKMODE), data in this column is loaded into the temporary register. On the next rising edge, operands in the temporary register and the DA/DB buses are loaded into the RA and RB registers.
\(\ddagger\) Use CONFIG1-0 \(=01\) as normal single-precision input configuration.
}

\section*{Operand Source Select}

Multiplier and ALU operands are selected by SELOP7-SELOP0 as shown in Tables 43 and 44.

Table 43. Multiplier Input Selection
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{A1 (MUX1) INPUT} & \multicolumn{3}{|r|}{B1 (MUX2) INPUT} \\
\hline SELOP7 & SELOP6 & OPERAND SOURCE \({ }^{\dagger}\) & SELOP5 & SELOP4 & OPERAND SOURCE \({ }^{\dagger}\) \\
\hline 0 & 0 & Reserved & 0 & 0 & Reserved \\
\hline 0 & 1 & C register & 0 & 1 & C register \\
\hline 1 & 0 & ALU feedback & 1 & 0 & Multiplier feedback \\
\hline 1 & 1 & RA input register & 1 & 1 & RB input register \\
\hline
\end{tabular}
\({ }^{\dagger}\) For division or square root operations, only RA and RB registers can be selected as sources.

Table 44. ALU Input Selection
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{A2 (MUX3) INPUT} & \multicolumn{3}{|r|}{B2 (MUX4) INPUT} \\
\hline SELOP3 & SELOP2 & OPERAND SOURCE \({ }^{\dagger}\) & SELOP1 & SELOP0 & OPERAND SOURCE \({ }^{\dagger}\) \\
\hline 0 & 0 & Reserved & 0 & 0 & Reserved \\
\hline 0 & 1 & C register & 0 & 1 & C register \\
\hline 1 & 0 & Multiplier feedback & 1 & 0 & ALU feedback \\
\hline 1 & 1 & RA input register & 1 & 1 & RB input register \\
\hline
\end{tabular}
\({ }^{\dagger}\) For division or square root operations, only RA and RB registers can be selected as sources.

\section*{Pipeline Control}

Pipelining levels are turned on by PIPES2-PIPES0 as shown below.

Table 45. Pipeline Controls (PIPES2-PIPES0)
\begin{tabular}{|ccc|l|}
\hline \multicolumn{3}{|c|}{PIPES2-PIPES0} & \multicolumn{1}{c|}{REGISTER OPERATION SELECTED} \\
\hline X & X & 0 & Enables input registers (RA, RB) \\
X & X & 1 & Makes input registers (RA, RB) transparent \\
X & 0 & X & Enables pipeline registers \\
X & 1 & X & Makes pipeline registers transparent \\
0 & X & X & Enables output registers (PREG, SREG, Status) \\
1 & X & X & Makes output registers (PREG, SREG, Status) transparent \\
\hline
\end{tabular}

\section*{Round Control}

RND1-RND0 select the rounding mode as shown in Table 46.

Table 46. Rounding Modes
\begin{tabular}{|c|c|}
\hline RND1-RND0 & ROUNDING MODE SELECTED \\
\hline 00 & Round towards nearest \\
\hline 01 & Round towards zero (truncate) \\
\hline 10 & Round towards infinity (round up) \\
\hline 11 & Round towards negative infinity (round down) \\
\hline
\end{tabular}

\section*{Status Output Selection}

SELST1-SELST0 choose the status output as shown below.

Table 47. Status Output Selection (Chained Mode)
\begin{tabular}{|c|l|}
\hline \begin{tabular}{c} 
SELST1- \\
SELST0
\end{tabular} & \multicolumn{1}{|c|}{ STATUS SELECTED } \\
\hline 00 & Logical OR of ALU and multiplier exceptions (bit by bit) \\
01 & Selects multiplier status \\
10 & Selects ALU status \\
11 & Normal operation (selection based on result source specified by I6 input) \\
\hline
\end{tabular}

\section*{Test Pin Control}

Testing is controlled by TP1-TP0 as shown below.

Table 48. Test Pin Control Inputs
\begin{tabular}{|c|c|l|}
\hline TP1 & TP0 & \multicolumn{1}{c|}{OPERATION} \\
\hline 0 & 0 & All outputs and I/Os are forced low \\
0 & 1 & All outputs and I/Os are forced high \\
1 & 0 & \\
1 & 1 & Normal operation \\
\hline
\end{tabular}

\section*{Miscellaneous Control Inputs}

The remaining control inputs are shown in Table 49.

Table 49. Miscellaneous Control Inputs
\begin{tabular}{|c|c|c|}
\hline SIGNAL & HIGH & LOW \\
\hline BYTEP & Selects byte parity generation and test & Selects single bit parity generation and test \\
\hline CLKMODE & Enables temporary input register load on falling clock edge & Enables temporary input register load on rising clock edge \\
\hline \(\overline{\text { ENRC }}\) & No effect & Enables C register load when CLKC goes high. \\
\hline ENRA & If register is not in flowthrough, enables clocking of RA register & If register is not in flowthrough, holds contents of RA register \\
\hline ENRB & If register is not in flowthrough, enables clocking of RB register & If register is not in flowthrough, holds contents of RB register \\
\hline FAST & Places device in FAST mode & Places device in IEEE mode \\
\hline FLOW_C & Causes output value to bypass C register and appear on C register output bus. & No effect \\
\hline \(\overline{\text { HALT }}\) & No effect & Stalls device operation but does not affect registers, internal states, or status \\
\hline \(\overline{\mathrm{OEC}}\) & Disables compare pins & Enables compare pins \\
\hline \(\overline{\mathrm{OES}}\) & Disables status outputs & Enables status outputs \\
\hline OEY & Disables Y bus & Enables Y bus \\
\hline RESET & No effect & Clears internal states, status, internal pipeline registers, and exception disable register. Does not affect other data registers. \\
\hline SELMS/ \(\overline{L S}\) & Selects MSH of 64-bit result for output on the \(Y\) bus (no effect on single-precision operands) & Selects LSH of 64-bit result for output on the Y bus (no effect on single-precision operands) \\
\hline SRCC & Selects multiplier result for input to C register & Selects ALU result for input to C register \\
\hline
\end{tabular}

\section*{Glossary}

Biased exponent - The true exponent of a floating point number plus a constant called the exponent field's excess. In IEEE data format, the excess or bias is 127 for single-precision numbers and 1023 for double-precision numbers.

Denormalized number (denorm) - A number with an exponent equal to zero and a nonzero fraction field, with the implicit leading (leftmost) bit of the fraction field being 0 .

NaN (not a number) - Data that has no mathematical value. The 'ACT8847 produces a NaN whenever an invalid operation such as \(0 * \infty\) is executed. The output format for a NaN is an exponent field of all ones, a fraction field of all ones, and a zero sign bit. Any number with an exponent of all ones and a nonzero fraction is treated as a NaN on the input.

Normalized number - A number in which the exponent field is between 1 and 254 (single precision) or 1 and 2046 (double precision). The implicit leading bit is 1.

Wrapped number - A number created by normalizing a denormalized number's fraction field and subtracting from the exponent the number of shift positions required to do so. The exponent is encoded as a two's complement negative number.
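
The glossary definitions for single-precision data can be checked with a short script. This is a minimal sketch rather than anything from the manual: the function names are hypothetical, and the "zero" and "infinity" cases are standard IEEE 754 classes added here as an assumption so that every bit pattern receives a label.

```python
import struct


def float_to_bits(x):
    """Return the 32-bit IEEE single-precision pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def classify_single(value_bits):
    """Split a 32-bit pattern into its fields and classify it.

    Returns (sign, biased_exponent, fraction, kind, true_exponent).
    true_exponent is only reported for normalized numbers.
    """
    sign = (value_bits >> 31) & 0x1
    biased_exp = (value_bits >> 23) & 0xFF
    fraction = value_bits & 0x7FFFFF

    if biased_exp == 0:
        # Exponent of zero: denormalized when the fraction is nonzero.
        kind = "zero" if fraction == 0 else "denormalized"
    elif biased_exp == 0xFF:
        # Exponent of all ones: NaN when the fraction is nonzero.
        kind = "infinity" if fraction == 0 else "NaN"
    else:
        # Biased exponent between 1 and 254, implicit leading bit of 1.
        kind = "normalized"

    true_exp = biased_exp - 127 if kind == "normalized" else None
    return sign, biased_exp, fraction, kind, true_exp


if __name__ == "__main__":
    for sample in (1.0, -0.75):
        print(sample, classify_single(float_to_bits(sample)))
    print(hex(0x00000001), classify_single(0x00000001))
    print(hex(0x7FFFFFFF), classify_single(0x7FFFFFFF))
```

The last pattern printed, 0x7FFFFFFF, matches the NaN output format described above: zero sign bit, exponent of all ones, fraction of all ones.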

\section*{SN74ACT8847 Application Notes}

\section*{Sum of Products and Product of Sums}

Performing fully pipelined double-precision operations requires a detailed understanding of timing constraints imposed by the multiplier. In particular, sum of products and product of sums operations can be executed very quickly, mostly in chained mode, assuming that timing relationships between the ALU and the multiplier are coded properly.

Pseudocode tables for these sequences are provided (Tables 50 and 51), showing how data and instructions are input in relation to the system clock. The overall patterns of calculations for an extended sum of products and an extended product of sums are presented. These examples assume FPU operation in CLKMODE 0, with the CONFIG setting 10 to load operands by MSH and LSH, all registers enabled (PIPES2-PIPES0 \(=000\)), and the \(C\) register clock tied to the system clock.

In the sum of products timing table, the two initial products are generated in independent multiplier mode. Several timing relationships should be noted in the table. The first chained instruction loads and begins to execute following the sixth rising edge of the clock, after the first product P1 has already been held in the P register for one clock. For this reason, P1 is loaded into the C register so that P1 will be stable for two clocks.

On the seventh clock, the ALU pipeline register loads with an unwanted sum, \(\mathrm{P} 1+\mathrm{P} 1\). However, because the ALU timing is constrained by the multiplier, the S register will not load until the rising edge of CLK9, when the ALU pipe contains the desired sum, P1 + P2. The remaining sequence of chained operations then executes in the desired manner.
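
The order of operations described above can be sketched as a behavioral model. This is an illustration under stated assumptions, not a cycle-accurate simulation of the device: the function name is hypothetical, Python floats stand in for the double-precision datapath, and register timing, rounding mode, and status outputs are ignored.

```python
def sum_of_products(a, b):
    """Accumulate a[0]*b[0] + a[1]*b[1] + ... in the chained-mode order.

    The first two products come from independent multiplies. The first
    chained instruction adds the P register to the C register (PR + CR);
    every later chained instruction adds the P register to the S register
    (PR + SR) while the multiplier forms the next product.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length operand lists with at least two pairs")

    c_reg = a[0] * b[0]        # P1, copied to the C register so it stays stable
    p_reg = a[1] * b[1]        # P2, held in the P register
    s_reg = p_reg + c_reg      # first chained ALU operation: PR + CR

    for a_k, b_k in zip(a[2:], b[2:]):
        p_reg = a_k * b_k      # multiplier half of the chained instruction
        s_reg = p_reg + s_reg  # ALU half of the chained instruction: PR + SR

    return s_reg


if __name__ == "__main__":
    print(sum_of_products([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

The model makes the role of the C register visible: it is used only once, to hold P1 until P2 is available.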

Table 50. Pseudocode for Fully Pipelined Double-Precision Sum of Products \({ }^{\dagger}\)
(CLKMODE \(=0\), CONFIG1-CONFIG0 \(=10\), PIPES2-PIPES0 \(=000\))
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & TEMP REG & INS BUS & INS REG & RA REG & RB REG & MUL PIPE & P REG & C REG & ALU PIPE & S REG & Y BUS \\
\hline 1 & A1 MSH & B1 MSH & A1, B1 MSH & A1 * B1 & & & & & & & & & \\
\hline 2 & A1 LSH & B1 LSH & A1, B1 LSH & A1 * B1 & A1 * B1 & A1 & B1 & & & & & & \\
\hline 3 & A2 MSH & B2 MSH & A2, B2 MSH & A2 * B2 & A1 * B1 & A1 & B1 & A1 * B1 & & & & & \\
\hline 4 & A2 LSH & B2 LSH & A2, B2 LSH & A2 * B2 & A2 * B2 & A2 & B2 & A1 * B1 & & & & & \\
\hline 5 & A3 MSH & B3 MSH & A3, B3 MSH & PR + CR, A3 * B3 & A2 * B2 & A2 & B2 & A2 * B2 & P1 & & & & \\
\hline 6 & A3 LSH & B3 LSH & A3, B3 LSH & PR + CR, A3 * B3 & PR + CR, A3 * B3 & A3 & B3 & A2 * B2 & P1 & P1 & & & \\
\hline 7 & A4 MSH & B4 MSH & A4, B4 MSH & PR + SR, A4 * B4 & PR + SR, A3 * B3 & A3 & B3 & A3 * B3 & P2 & P1 & P1 + P1 & & \\
\hline 8 & A4 LSH & B4 LSH & A4, B4 LSH & PR + SR, A4 * B4 & PR + SR, A4 * B4 & A4 & B4 & A3 * B3 & P2 & P1 & P1 + P2 & & \\
\hline 9 & A5 MSH & B5 MSH & A5, B5 MSH & PR + SR, A5 * B5 & PR + SR, A4 * B4 & A4 & B4 & A4 * B4 & P3 & P2 & S1 + P2 & S1 & \\
\hline 10 & A5 LSH & B5 LSH & A5, B5 LSH & PR + SR, A5 * B5 & PR + SR, A5 * B5 & A5 & B5 & A4 * B4 & P3 & P2 & S1 + P3 & S1 & \\
\hline 11 & A6 MSH & B6 MSH & A6, B6 MSH & PR + SR, A6 * B6 & PR + SR, A5 * B5 & A5 & B5 & A5 * B5 & P4 & P2 & XXXXX & S2 & \\
\hline 12 & & & & & & & & & P4 & P2 & & S2 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) PR = Product Register
SR = Sum Register
CR = Constant (C) Register


Table 51. Pseudocode for Fully Pipelined Double-Precision Product of Sums \({ }^{\dagger}\)
(CLKMODE \(=0\), CONFIG1-CONFIG0 \(=10\), PIPES2-PIPES0 \(=000\))
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & TEMP REG & INS BUS & INS REG & RA REG & RB REG & MUL PIPE & P REG & C REG & ALU PIPE & S REG & Y BUS \\
\hline 1 & A1 MSH & B1 MSH & A1, B1 MSH & A1 + B1 & & & & & & & & & \\
\hline 2 & A1 LSH & B1 LSH & A1, B1 LSH & A1 + B1 & A1 + B1 & A1 & B1 & & & & & & \\
\hline 3 & A2 MSH & B2 MSH & A2, B2 MSH & A2 + B2 & A1 + B1 & A1 & B1 & & & & A1 + B1 & & \\
\hline 4 & A2 LSH & B2 LSH & A2, B2 LSH & A2 + B2 & A2 + B2 & A2 & B2 & & & & A1 + B1 & S1 & \\
\hline 5 & A3 MSH & B3 MSH & A3, B3 MSH & CR * SR, A3 + B3 & A2 + B2 & A2 & B2 & & & \(\overline{\mathrm{ENRC}}=0\), S1 & A2 + B2 & S1 & \\
\hline 6 & A3 LSH & B3 LSH & A3, B3 LSH & CR * SR, A3 + B3 & CR * SR, A3 + B3 & A3 & B3 & & & S1 & A2 + B2 & S2 & \\
\hline 7 & XXX & XXX & XXX & NOP & CR * SR, A3 + B3 & A3 & B3 & S1 * S2 & & S1 & A3 + B3 & S2 & \\
\hline 8 & A4 MSH & B4 MSH & A4, B4 MSH & PR * SR, A4 + B4 & NOP & ENRA = 0, A3 & ENRB = 0, B3 & S1 * S2 & & S1 & & XXX & \\
\hline 9 & A4 LSH & B4 LSH & A4, B4 LSH & PR * SR, A4 + B4 & PR * SR, A4 + B4 & A4 & B4 & XXX & P1 & S1 & XXX & S3 & \\
\hline 10 & XXX & XXX & XXX & NOP & PR * SR, A4 + B4 & A4 & B4 & P1 * S3 & P1 & S1 & A4 + B4 & S3 & \\
\hline 11 & A5 MSH & B5 MSH & A5, B5 MSH & PR * SR, A5 + B5 & NOP & ENRA = 0, A4 & ENRB = 0, B4 & P1 * S3 & XXX & S1 & A4 + B4 & XXX & \\
\hline 12 & A5 LSH & B5 LSH & A5, B5 LSH & PR * SR, A5 + B5 & PR * SR, A5 + B5 & A5 & B5 & XXX & P2 & S1 & X & S4 & \\
\hline
\end{tabular}

NOTE: NOP instruction is 01100000000.
\({ }^{\dagger}\) PR = Product Register
SR = Sum Register
CR = Constant (C) Register

\section*{Matrix Operations}

The 'ACT8847 floating point unit can also be used to perform matrix manipulations involved in graphics processing or digital signal processing. The FPU multiplies and adds data elements, executing sequences of microprogrammed calculations to form new matrices.

\section*{Representation of Variables}

In state representations of control systems, an \(n\)-th order linear differential equation with constant coefficients can be represented as a sequence of \(n\) first-order linear differential equations expressed in terms of state variables:
\[
\frac{d x_{1}}{d t}=x_{2}, \ldots, \quad \frac{d x_{n-1}}{d t}=x_{n}
\]

For example, in vector-matrix form the equations of an nth-order system can be represented as follows:
\[
\begin{aligned}
& \frac{d}{d t}\left[\begin{array}{c}
x_{1} \\
x_{2} \\
\vdots \\
x_{n}
\end{array}\right]=\left[\begin{array}{cccc}
a_{11} & a_{12} & \cdots & a_{1 n} \\
\vdots & \vdots & & \vdots \\
a_{n 1} & a_{n 2} & \cdots & a_{n n}
\end{array}\right]\left[\begin{array}{c}
x_{1} \\
x_{2} \\
\vdots \\
x_{n}
\end{array}\right]+\left[\begin{array}{ccc}
b_{11} & \cdots & b_{1 n} \\
\vdots & & \vdots \\
b_{n 1} & \cdots & b_{n n}
\end{array}\right]\left[\begin{array}{c}
u_{1} \\
u_{2} \\
\vdots \\
u_{n}
\end{array}\right] \\
& \text { or, } \dot{x}=a x+b u
\end{aligned}
\]

Expanding the matrix equation for one state variable, \(\mathrm{dx}_{1} / \mathrm{dt}\), results in the following expression:
\[
\dot{x}_{1}=\left(a_{11} * x_{1}+\ldots+a_{1 n} * x_{n}\right)+\left(b_{11} * u_{1}+\ldots+b_{1 n} * u_{n}\right)
\]
where \(\dot{x}_{1}=\mathrm{d} x_{1} / \mathrm{dt}\).

Sequences of multiplications and additions are required when such state space transformations are performed, and the 'ACT8847 has been designed to support such sum-of-products operations. An \(n \times n\) matrix \(A\) multiplied by an \(n \times n\) matrix \(X\) yields an \(n \times n\) matrix \(C\) whose elements \(c_{i j}\) are given by this equation:
\[
c_{i j}=\sum_{k=1}^{n} a_{i k} * x_{k j} \text { for } i=1, \ldots, n \quad j=1, \ldots, n
\]

For the \(c_{i j}\) elements to be calculated by the 'ACT8847, the corresponding elements \(a_{i k}\) and \(x_{k j}\) must be stored outside the 'ACT8847 and fed to it in the order required to effect a matrix multiplication such as the state space system representation just discussed.
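
The element ordering implied by the \(c_{i j}\) equation can be sketched in software. The code below is an illustrative model only: the function name is hypothetical, and the inner loop simply mirrors the multiply-accumulate stream that an external sequencer would feed to the FPU for each output element.

```python
def matmul_sum_of_products(a, x):
    """Multiply two n x n matrices (lists of rows) one element at a time.

    For each c[i][j], the pairs (a[i][k], x[k][j]) are visited in order of k,
    which is the operand order a sum-of-products sequence would require.
    """
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i][k] * x[k][j]   # one multiply and one add per term
            c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    x = [[5.0, 6.0], [7.0, 8.0]]
    for row in matmul_sum_of_products(a, x):
        print(row)
```

Each output element needs n multiplications and n - 1 useful additions, so an n x n product requires n cubed operand pairs in total.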

\section*{Sample Matrix Transformation}

The matrix manipulations commonly performed in graphics systems can be regarded as geometrical transformations of graphic objects. A matrix operation on another matrix representing a graphic object may result in scaling, rotating, transforming, distorting, or generating a perspective view of the image. By performing a matrix operation on the position vectors which define the vertices of an image surface, the shape and position of the surface can be manipulated.

The generalized \(4 \times 4\) matrix for transforming a three-dimensional object with homogeneous coordinates is shown below:
\[
T=\left[\begin{array}{ccc|c}
a & b & c & d \\
e & f & g & h \\
i & j & k & l \\
\hline m & n & o & p
\end{array}\right]
\]

The matrix \(T\) can be partitioned into four component matrices, each of which produces a specific effect on the resultant image.

The \(3 \times 3\) matrix produces linear transformation in the form of scaling, shearing and rotation. The \(1 \times 3\) row matrix produces translation, while the \(3 \times 1\) column matrix produces perspective transformation with multiple vanishing points. The final single element \(1 \times 1\) produces overall scaling. Overall operation of the transformation matrix T on the position vectors of a graphic object produces a combination of shearing, rotation, reflection, translation, perspective, and overall scaling.

The rotation of an object about an arbitrary axis in a three-dimensional space can be carried out by first translating the object such that the desired axis of rotation passes through the origin of the coordinate system, then rotating the object about the axis through the origin, and finally translating the rotated object such that the axis of rotation resumes its initial position. If the axis of rotation passes through the point \(P=[a \; b \; c \; 1]\), then the transformation matrix is representable in this form:
\[
[x \; y \; z \; h]=[x \; y \; z \; 1] \underbrace{\left[\begin{array}{rrrr}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-a & -b & -c & 1
\end{array}\right]}_{\text {translation to origin }} \underbrace{[R]}_{\text {rotation about origin }} \underbrace{\left[\begin{array}{rrrr}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
a & b & c & 1
\end{array}\right]}_{\text {translation back to initial position }} \tag{2}
\]
where R may be expressed as:
\[
R=\left[\begin{array}{cccc}
n1^{2}+\left(1-n1^{2}\right) \cos \phi & n1 \, n2(1-\cos \phi)+n3 \sin \phi & n1 \, n3(1-\cos \phi)-n2 \sin \phi & 0 \\
n1 \, n2(1-\cos \phi)-n3 \sin \phi & n2^{2}+\left(1-n2^{2}\right) \cos \phi & n2 \, n3(1-\cos \phi)+n1 \sin \phi & 0 \\
n1 \, n3(1-\cos \phi)+n2 \sin \phi & n2 \, n3(1-\cos \phi)-n1 \sin \phi & n3^{2}+\left(1-n3^{2}\right) \cos \phi & 0 \\
0 & 0 & 0 & 1
\end{array}\right]
\]
and
\[
\mathrm{n} 1=\mathrm{q} 1 /\left(\mathrm{q} 1^{2}+\mathrm{q} 2^{2}+\mathrm{q} 3^{2}\right)^{1 / 2}=\underset{\text { direction cosine for } x \text {-axis of }}{\text { rotation }}
\]
\(\mathrm{n} 2=\mathrm{q} 2 /\left(\mathrm{q} 1^{2}+\mathrm{q} 2^{2}+\mathrm{q} 3^{2}\right)^{1 / 2}=\) direction cosine for \(y\)-axis of rotation
\(n 3=q 3 /\left(q 1^{2}+q 2^{2}+q 3^{2}\right)^{1 / 2}=\) direction cosine for \(z\)-axis of rotation
\(\bar{n}=(\mathrm{n} 1 \mathrm{n} 2 \mathrm{n} 3) \quad=\) unit vector for \(\overline{\mathrm{Q}}\)
\(\overline{\mathrm{Q}}=\) vector defining axis of rotation \(=[\mathrm{q} 1 \mathrm{q} 2 \mathrm{q} 3]\)
\(\phi=\) the rotation angle about \(\overline{\mathrm{Q}}\)
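
As an illustration, the rotation matrix can be evaluated numerically from the direction cosines and the angle. This is a sketch under stated assumptions rather than part of the manual: the function names are hypothetical, the row-vector convention of equation (2) is used, and the diagonal terms are written in the standard axis-angle form \(n^{2}+(1-n^{2}) \cos \phi\).

```python
import math


def direction_cosines(q1, q2, q3):
    """Normalize the axis vector Q = [q1 q2 q3] into (n1, n2, n3)."""
    length = math.sqrt(q1 * q1 + q2 * q2 + q3 * q3)
    return q1 / length, q2 / length, q3 / length


def rotation_matrix(n1, n2, n3, phi):
    """Return the 4 x 4 matrix R for a rotation of phi radians about (n1, n2, n3).

    Row-vector convention: a point [x y z 1] is multiplied on the right by R.
    """
    c = math.cos(phi)
    s = math.sin(phi)
    v = 1.0 - c
    return [
        [n1 * n1 + (1.0 - n1 * n1) * c, n1 * n2 * v + n3 * s, n1 * n3 * v - n2 * s, 0.0],
        [n1 * n2 * v - n3 * s, n2 * n2 + (1.0 - n2 * n2) * c, n2 * n3 * v + n1 * s, 0.0],
        [n1 * n3 * v + n2 * s, n2 * n3 * v - n1 * s, n3 * n3 + (1.0 - n3 * n3) * c, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


if __name__ == "__main__":
    # Direction cosines and angle used in the tetrahedron example of this section.
    for row in rotation_matrix(0.866, 0.5, 0.707, math.pi / 2):
        print(["%.3f" % value for value in row])
```

With a 90 degree angle the cosine terms vanish, so the diagonal reduces to the squared direction cosines and the off-diagonal terms to a product plus or minus one direction cosine.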

A general rotation using equation (2) is effected by determining the \([x \; y \; z]\) coordinates of a point \(A\) to be rotated on the object, the direction cosines of the axis of rotation [\(\mathrm{n} 1, \mathrm{n} 2, \mathrm{n} 3\)], and the angle \(\phi\) of rotation about the axis, all of which are needed to define matrix [R]. Suppose, for example, that a tetrahedron ABCD, represented by the coordinate matrix below, is to be rotated about an axis of rotation RX which passes through a point \(P=[5 \; -6 \; 3 \; 1]\) and whose direction cosines are given by the unit vector [\(\mathrm{n} 1=0.866, \mathrm{n} 2=0.5, \mathrm{n} 3=0.707\)]. The angle of rotation \(\phi\) is 90 degrees (see Figure 72). The coordinate matrix of the tetrahedron and the resulting rotation matrix [R] are:
\[
\begin{gathered}
\left[\begin{array}{rrrr}
2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1
\end{array}\right] \begin{array}{l}
\mathrm{A} \\
\mathrm{B} \\
\mathrm{C} \\
\mathrm{D}
\end{array} \\
R=\left[\begin{array}{rrrr}
0.750 & 1.140 & 0.112 & 0 \\
-0.274 & 0.250 & 1.220 & 0 \\
1.112 & -0.513 & 0.500 & 0 \\
0 & 0 & 0 & 1
\end{array}\right]
\end{gathered}
\]

(1) This arrow depicts the first translation.
(2) This arrow depicts the \(90^{\circ}\) rotation.
(3) This arrow depicts the back translation.

Figure 72. Sequence of Matrix Operations

The point transformation equation (2) can be expanded to include all the vertices of the tetrahedron as follows:
\begin{tabular}{|llll|}
\hline\(x a\) & \(y a\) & \(z a\) & \(h 1\) \\
\(x b\) & \(y b\) & \(z b\) & \(h 2\) \\
\(x c\) & \(y c\) & \(z c\) & \(h 3\) \\
\(x d\) & \(y d\) & \(z d\) & \(h 4\) \\
\hline
\end{tabular}


The 'ACT8847 floating point unit can perform matrix manipulation involving multiplications and additions such as those represented by equation (1). The matrix equation (3) can be solved by using the 'ACT8847 to compute, as a first step, the product of the coordinate matrix and the first translation matrix on the right-hand side of equation (3), in that order. The second step postmultiplies that product matrix by the rotation matrix. The third step implements the back-translation by postmultiplying the matrix result from the second step by the second translation matrix of equation (3). Details of the procedure to produce a three-dimensional rotation about an arbitrary axis are explained in the following steps:

Step 1
Translate the tetrahedron so that the axis of rotation passes through the origin. This process can be accomplished by multiplying the coordinate matrix by the translation matrix as follows:
\begin{tabular}{|llll|}
\hline 2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1 \\
\hline
\end{tabular}
\begin{tabular}{|rrrr|}
\hline 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-5 & 6 & -3 & 1 \\
\hline
\end{tabular}
translation to origin
\(=\)\begin{tabular}{|llll|}
\hline\((2-5)\) & \((-3+6)\) & \((3-3)\) & 1 \\
\((1-5)\) & \((-2+6)\) & \((2-3)\) & 1 \\
\((2-5)\) & \((-1+6)\) & \((2-3)\) & 1 \\
\((2-5)\) & \((-2+6)\) & \((2-3)\) & 1 \\
\hline
\end{tabular}
vertices of translated tetrahedron
\[
=\left[\begin{array}{rrrr}
-3 & +3 & 0 & 1 \\
-4 & +4 & -1 & 1 \\
-3 & +5 & -1 & 1 \\
-3 & +4 & -1 & 1
\end{array}\right] \quad \begin{array}{l}
\mathrm{AT} \\
\mathrm{BT} \\
\mathrm{CT} \\
\mathrm{DT}
\end{array}
\]

The 'ACT8847 could compute the translated coordinates AT, BT, CT, DT as indicated above. However, an alternative method resulting in a more compact solution is presented below.

Step 2
Rotate the tetrahedron about the axis of rotation which passes through the origin after the translation of Step 1. To implement the rotation of the tetrahedron, postmultiply the rotation matrix [R] by the translated coordinate matrix from Step 1. The resultant matrix represents the rotated coordinates of the tetrahedron about the origin as follows:

rotation about origin
\[
=\left[\begin{array}{rrrr}
-3.072 & -2.670 & 3.324 & 1 \\
-5.208 & -3.047 & 3.932 & 1 \\
-4.732 & -1.657 & 5.264 & 1 \\
-4.458 & -1.907 & 4.044 & 1
\end{array}\right]
\]
rotated coordinates

\section*{Step 3}

Translate the rotated tetrahedron back to the original coordinate space. This is done by premultiplying the resultant matrix of Step 2 by the translation matrix. The following calculation produces the final coordinate matrix of the transformed object:
\begin{tabular}{|llll|}
\hline-3.072 & -2.670 & 3.324 & 1 \\
-5.208 & -3.047 & 3.932 & 1 \\
-4.732 & -1.657 & 5.264 & 1 \\
-4.458 & -1.907 & 4.044 & 1 \\
\hline
\end{tabular}
\begin{tabular}{|cccc|}
\hline 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
5 & -6 & 3 & 1 \\
\hline
\end{tabular}\(=\)
translate back

final rotated coordinates
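The three steps can be checked numerically with a short script. The following is a minimal sketch in Python, used here only for illustration (it is not 'ACT8847 microcode). The helper name `matmul` and the variable names are assumptions introduced for this sketch; the matrix values are those given in Steps 1 through 3.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Vertices A, B, C, D as homogeneous row vectors.
coords = [[2, -3, 3, 1],
          [1, -2, 2, 1],
          [2, -1, 2, 1],
          [2, -2, 2, 1]]

# Translation that moves the point P = [5 -6 3 1] to the origin.
to_origin = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-5, 6, -3, 1]]

# Rotation matrix [R] as given in the text.
rotation = [[0.750, 1.140, 0.112, 0],
            [-0.274, 0.250, 1.220, 0],
            [1.112, -0.513, 0.500, 0],
            [0, 0, 0, 1]]

# Translation back to the original coordinate space.
back = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, -6, 3, 1]]

translated = matmul(coords, to_origin)   # Step 1
rotated = matmul(translated, rotation)   # Step 2
final = matmul(rotated, back)            # Step 3

for name, row in zip("ABCD", final):
    print(name + "'", [round(v, 3) for v in row])
```

Each row of `final` is one transformed vertex; the intermediate lists `translated` and `rotated` correspond to the matrices shown in Steps 1 and 2.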

A more compact solution to these transformation matrices is a product matrix that combines the two translation matrices and the rotation matrix in the order shown in equation (3). Equation (3) will then take the following form:
```

xa ya za h1
xb yb zb h2
xc yc zc h3
xd yd zd h4

```
\begin{tabular}{|llll|}
\hline 2 & -3 & 3 & 1 \\
1 & -2 & 2 & 1 \\
2 & -1 & 2 & 1 \\
2 & -2 & 2 & 1 \\
\hline
\end{tabular}

transformation matrix

The newly transformed coordinates resulting from the postmultiplication of the transformation matrix by the coordinate matrix of the tetrahedron can be computed using equation (1), which was cited previously:
\[
c_{i j}=\sum_{k=1}^{n} a_{i k} * x_{k j} \quad \text { for } i=1, \ldots, n ; \quad j=1, \ldots, n \tag{1}
\]

For example, the coordinates may be computed as follows:
\[
\begin{aligned}
& x a=c_{11}=a_{11} * x_{11}+a_{12} * x_{21}+a_{13} * x_{31}+a_{14} * x_{41} \\
& =2 * 0.750+(-3) *(-0.274)+3 * 1.112+1 *(-3.73) \\
& =1.5+0.822+3.336-3.73 \\
& =1.928 \\
& y a=c_{12}=a_{11} * x_{12}+a_{12} * x_{22}+a_{13} * x_{32}+a_{14} * x_{42} \\
& =2 * 1.140+(-3) * 0.250+3 *(-0.513)+1 *(-8.661) \\
& =2.28-0.75-1.539-8.661 \\
& =-8.67 \\
& z a=c_{13}=a_{11} * x_{13}+a_{12} * x_{23}+a_{13} * x_{33}+a_{14} * x_{43} \\
& =2 * 0.112+(-3) * 1.220+3 * 0.500+1 * 8.260 \\
& =0.224-3.66+1.5+8.260 \\
& =6.324 \\
& h 1=c_{14}=a_{11} * x_{14}+a_{12} * x_{24}+a_{13} * x_{34}+a_{14} * x_{44} \\
& =2 * 0+(-3) * 0+3 * 0+1 * 1 \\
& =0+0+0+1 \\
& =1 \\
& A^{\prime}=\left[\begin{array}{llll}
1.928 & -8.67 & 6.324 & 1
\end{array}\right]
\end{aligned}
\]
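The worked example for vertex \(A^{\prime}\) can be reproduced by forming the combined transformation matrix first and then applying equation (1) to a single row vector. This is a minimal sketch in Python, chosen only for illustration; `matmul` and `transform_vertex` are hypothetical helper names, and the matrices are the two translation matrices and the rotation matrix [R] quoted earlier in this example.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows (hypothetical helper)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transform_vertex(vertex, matrix):
    """Equation (1) for one row vector: c_j = sum over k of a_k * x_kj."""
    return [sum(vertex[k] * matrix[k][j] for k in range(len(vertex)))
            for j in range(len(matrix[0]))]

to_origin = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-5, 6, -3, 1]]
rotation = [[0.750, 1.140, 0.112, 0],
            [-0.274, 0.250, 1.220, 0],
            [1.112, -0.513, 0.500, 0],
            [0, 0, 0, 1]]
back = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, -6, 3, 1]]

# Combined matrix: first translation, then rotation, then back-translation.
transform = matmul(matmul(to_origin, rotation), back)

# Vertex A as a homogeneous row vector.
a_prime = transform_vertex([2, -3, 3, 1], transform)
print([round(v, 3) for v in a_prime])
```

The last row of `transform` holds the terms \(x_{41}\), \(x_{42}\), and \(x_{43}\) used in the hand calculation (-3.73, -8.661, and 8.260).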

The other rotated vertices are computed in a similar manner:
\[
\begin{aligned}
& B^{\prime}=\left[\begin{array}{llll}
-0.208 & -9.047 & 6.932 & 1
\end{array}\right] \\
& C^{\prime}=\left[\begin{array}{llll}
0.268 & -7.657 & 8.264 & 1
\end{array}\right] \\
& D^{\prime}=\left[\begin{array}{llll}
0.542 & -7.907 & 7.044 & 1
\end{array}\right]
\end{aligned}
\]

\section*{Microinstructions for Sample Matrix Manipulation}

The 'ACT8847 FPU can compute the coordinates for graphic objects over a broad dynamic range. Also, the homogeneous scalar factors h1, h2, h3, and h4 may be made unity due to the availability of large dynamic range. In the example presented below, some of the calculations pertaining to vertex \(A^{\prime}\) are shown, but the same approach can be applied to any number of points and any vector space.

The calculations below show the sequence of operations for generating two coordinates, \(x a\) and ya, of the vertex \(A^{\prime}\) after rotation. The same sequence could be continued to generate the remaining two coordinates for \(A^{\prime}\) (za and h1). The other vertices of the tetrahedron, \(\mathrm{B}^{\prime}, \mathrm{C}^{\prime}\), and \(\mathrm{D}^{\prime}\), can be calculated in a similar way.

Table 52 presents a pseudocode description of the operations, clock cycles, and register contents for a single-precision matrix multiplication using the sum-of-products sequence presented in an earlier section. Registers used include the RA and RB input registers and the product (P) and sum (S) registers.

Table 52. Single-Precision Matrix Multiplication (PIPES2-PIPES0 \(=010\))
\begin{tabular}{|c|l|l|}
\hline CLOCK CYCLE & MULTIPLIER/ALU OPERATIONS & PSEUDOCODE \\
\hline 1 & \begin{tabular}{l}
Load a11, x11 \\
SP Multiply
\end{tabular} & \begin{tabular}{l}
a11 \(\rightarrow\) RA, x11 \(\rightarrow\) RB \\
p1 = a11 * x11
\end{tabular} \\
\hline 2 & \begin{tabular}{l}
Load a12, x21 \\
SP Multiply \\
Pass P to S
\end{tabular} & \begin{tabular}{l}
a12 \(\rightarrow\) RA, x21 \(\rightarrow\) RB \\
p2 = a12 * x21 \\
p1 \(\rightarrow\) P(p1)
\end{tabular} \\
\hline 3 & \begin{tabular}{l}
Load a13, x31 \\
SP Multiply \\
Add P to S
\end{tabular} & \begin{tabular}{l}
a13 \(\rightarrow\) RA, x31 \(\rightarrow\) RB \\
p3 = a13 * x31, p2 \(\rightarrow\) P(p2) \\
P(p1) + 0 \(\rightarrow\) S(p1)
\end{tabular} \\
\hline 4 & \begin{tabular}{l}
Load a14, x41 \\
SP Multiply \\
Add P to S
\end{tabular} & \begin{tabular}{l}
a14 \(\rightarrow\) RA, x41 \(\rightarrow\) RB \\
p4 = a14 * x41, p3 \(\rightarrow\) P(p3) \\
P(p2) + S(p1) \(\rightarrow\) S(p1 + p2)
\end{tabular} \\
\hline 5 & \begin{tabular}{l}
Load a11, x12 \\
SP Multiply \\
Add P to S
\end{tabular} & \begin{tabular}{l}
a11 \(\rightarrow\) RA, x12 \(\rightarrow\) RB \\
p5 = a11 * x12, p4 \(\rightarrow\) P(p4) \\
P(p3) + S(p1 + p2) \(\rightarrow\) S(p1 + p2 + p3)
\end{tabular} \\
\hline 6 & \begin{tabular}{l}
Load a12, x22 \\
SP Multiply \\
Pass P to S \\
Output S
\end{tabular} & \begin{tabular}{l}
a12 \(\rightarrow\) RA, x22 \(\rightarrow\) RB \\
p6 = a12 * x22, p5 \(\rightarrow\) P(p5) \\
P(p4) + S(p1 + p2 + p3) \(\rightarrow\) S(p1 + p2 + p3 + p4)
\end{tabular} \\
\hline 7 & \begin{tabular}{l}
Load a13, x32 \\
SP Multiply \\
Add P to S
\end{tabular} & \begin{tabular}{l}
a13 \(\rightarrow\) RA, x32 \(\rightarrow\) RB \\
p7 = a13 * x32, p6 \(\rightarrow\) P(p6) \\
P(p5) + 0 \(\rightarrow\) S(p5)
\end{tabular} \\
\hline 8 & \begin{tabular}{l}
Load a14, x42 \\
SP Multiply \\
Add P to S
\end{tabular} & \begin{tabular}{l}
a14 \(\rightarrow\) RA, x42 \(\rightarrow\) RB \\
p8 = a14 * x42, p7 \(\rightarrow\) P(p7) \\
P(p6) + S(p5) \(\rightarrow\) S(p5 + p6)
\end{tabular} \\
\hline 9 & \begin{tabular}{l}
Next operands \\
Next instruction \\
Add P to S
\end{tabular} & \begin{tabular}{l}
A \(\rightarrow\) RA, B \(\rightarrow\) RB \\
pi = A * B, p8 \(\rightarrow\) P(p8) \\
P(p7) + S(p5 + p6) \(\rightarrow\) S(p5 + p6 + p7)
\end{tabular} \\
\hline 10 & \begin{tabular}{l}
Next operands \\
Next instruction \\
Output S
\end{tabular} & \begin{tabular}{l}
C \(\rightarrow\) RA, D \(\rightarrow\) RB \\
pj = C * D, pi \(\rightarrow\) P(pi) \\
P(p8) + S(p5 + p6 + p7) \(\rightarrow\) S(p5 + p6 + p7 + p8)
\end{tabular} \\
\hline
\end{tabular}

A microcode sequence to generate this matrix multiplication is shown in Table 53.

Table 53. Microinstructions for Sample Matrix Multiplication


Six cycles are required to complete calculation of \(x a\), the first coordinate, and after four more cycles the second coordinate ya is output. Each subsequent coordinate can be calculated in four cycles, so the 4-tuple for vertex \(A^{\prime}\) requires a total of 18 cycles to complete.

Calculations for vertices \(\mathrm{B}^{\prime}, \mathrm{C}^{\prime}\), and \(\mathrm{D}^{\prime}\) can be executed in 48 cycles, 16 cycles for each vertex. Processing time improves when the transformation matrix is reduced, i.e., when the last column has the form shown below:


The h-scalars h1, h2, h3, and h4 are equal to 1. The number of clock cycles to generate each 4-tuple can then be decreased from 16 to 13 cycles. Total number of clock cycles to calculate all four vertices is reduced from 66 to 54 clocks. Figure 73 summarizes the overall matrix transformation.
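The cycle totals quoted above follow from simple arithmetic, sketched here in Python for illustration. The constant and variable names are assumptions; the per-coordinate and per-vertex figures are taken from the text, and the first-vertex figure for the reduced matrix is not stated in the text but is inferred from the quoted totals.

```python
FIRST_COORDINATE = 6   # cycles until xa is output (from the text)
NEXT_COORDINATE = 4    # cycles for each subsequent coordinate (from the text)
VERTICES = 4

# Full 4 x 4 transformation matrix.
first_vertex = FIRST_COORDINATE + 3 * NEXT_COORDINATE      # 18 cycles
later_vertex = 4 * NEXT_COORDINATE                         # 16 cycles
total_full = first_vertex + (VERTICES - 1) * later_vertex  # 66 cycles

# Reduced transformation matrix: 13 cycles per later vertex and 54 cycles in
# total are quoted in the text; the first-vertex count is inferred from them.
later_vertex_reduced = 13
total_reduced = 54
first_vertex_reduced = total_reduced - (VERTICES - 1) * later_vertex_reduced

print(first_vertex, total_full, first_vertex_reduced)
```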


Figure 73. Resultant Matrix Transformation

This microprogram can also be written to calculate sums of products with all pipeline registers enabled so that the FPU can operate in its fastest mode. Because of timing relationships, the \(C\) register is used in some steps to hold the intermediate sum of products. Latency due to pipelining and chained data manipulation is 11 cycles for calculation of the first coordinate, and four cycles each for the other three coordinates.

After calculation of the first vertex, 16 cycles are required to calculate the four coordinates of each subsequent vertex. Table 54 presents the sequence of calculations for the first two coordinates, xA and yA.

Products in Table 54 are numbered according to the clock cycle in which the operands and instruction were loaded into the RA, RB, and I register, and execution of the instruction began. Sums indicated in Table 54 are listed below:
\[
\begin{array}{lll}
s 1=p 1+0 & s 5=p 5+p 7 & s 9=p 10+p 12 \\
s 2=p 1+p 3 & s 6=p 6+p 8 & x A=p 1+p 2+p 3+p 4 \\
s 3=p 2+p 4 & s 7=p 9+0 & y A=p 5+p 6+p 7+p 8 \\
s 4=p 5+0 & s 8=p 9+p 11 &
\end{array}
\]

Table 54. Fully Pipelined Single-Precision Sum of Products (PIPES2-PIPES0 \(=000\))
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLOCK CYCLE & I BUS & DA BUS & DB BUS & I REG & RA REG & RB REG & MUL PIPE & ALU PIPE & P REG & S REG & C REG & Y BUS \\
\hline 0 & Mul & x11 & a11 & & & & & & & & & \\
\hline 1 & Mul & x21 & a12 & Mul & x11 & a11 & & & & & & \\
\hline 2 & Chn & x31 & a13 & Mul & x21 & a12 & p1 & & & & & \\
\hline 3 & Mul & x41 & a14 & Chn & x31 & a13 & p2 & & p1 & & & \\
\hline 4 & Chn & x12 & a11 & Mul & x41 & a14 & p3 & s1 & p2 & & & \\
\hline 5 & Chn & x22 & a12 & Chn & x12 & a11 & p4 & \(\dagger\) & p3 & s1 & p2 & \\
\hline 6 & Chn & x32 & a13 & Chn & x22 & a12 & p5 & s2 & p4 & \(\dagger\) & p2 & \\
\hline 7 & Chn & x42 & a14 & Chn & x32 & a13 & p6 & s3 & p5 & s2 & p2 & \\
\hline 8 & Chn & x13 & a11 & Chn & x42 & a14 & p7 & s4 & p6 & s3 & s2 & \\
\hline 9 & Chn & x23 & a12 & Chn & x13 & a11 & p8 & xA & p7 & s4 & p6 & \\
\hline 10 & Chn & x33 & a13 & Chn & x23 & a12 & p9 & s5 & p8 & xA & p6 & xA \\
\hline 11 & Chn & x43 & a14 & Chn & x33 & a13 & p10 & s6 & p9 & s5 & p6 & \\
\hline 12 & Chn & x14 & a11 & Chn & x43 & a14 & p11 & s7 & p10 & s6 & s5 & \\
\hline 13 & Chn & x24 & a12 & Chn & x14 & a11 & p12 & yA & p11 & s7 & p10 & \\
\hline 14 & Chn & x34 & a13 & Chn & x24 & a12 & p13 & s8 & p12 & yA & p10 & yA \\
\hline 15 & Chn & x44 & a14 & Chn & x34 & a13 & p14 & s9 & p13 & s8 & p10 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) Contents of this register are not valid during this cycle.

\section*{Chebyshev Routines for the SN74ACT8847 FPU}

\section*{Introduction}

Using the SN74ACT8847, very efficient routines can be developed for the implementation of transcendental functions. A high degree of accuracy can be achieved by taking advantage of the 'ACT8847's ability to perform calculations using double-precision floating point operands.

This application note describes how to use the 'ACT8847 to implement seven different transcendental functions. TIM (Texas Instruments Meta-Macro Assembler) assembly files have been written for all seven functions and these files are available upon request from Texas Instruments. The algorithm chosen to implement these functions is the Chebyshev expansion method [1]. Table 55 lists the functions that have been implemented, along with the number of cycles required, and time required to perform the calculations. Also listed in the table is the cycle count and time required to perform the same calculation using the Motorola MC68881 Floating Point Coprocessor and the Intel 80387 Numeric Processor Extension.

The Chebyshev expansion method was chosen rather than some of the better-known methods, such as the Taylor series and Newton-Raphson approximation, for a variety of reasons. The primary advantage of Chebyshev's method is that it provides a uniform convergence rate in the number of terms required to achieve the desired accuracy. Thus the range of the input value will have little effect on the accuracy of the result. Another advantage is that the number of terms required to calculate the approximation is relatively small. This provides for faster execution. Also, Chebyshev's method can be applied to any function which is continuous and of bounded variation. Lastly, tables are available which contain the constants necessary to implement Chebyshev's method.

In order that this application note be useful to the largest audience, only those instructions and features common to all 'ACT8847 versions have been used to implement the routines.

Contact Texas Instruments VLSI Logic applications group at (214) 997-3970 for a copy of the seven TIM assembly files.

Table 55. Cycle Count and Execution Speed for the Seven Chebyshev Functions
\begin{tabular}{|l|c|c|c|c|c|c|}
\hline \multirow{2}{*}{ FUNCTION } & \multicolumn{3}{|c|}{ CYCLE COUNT \(^{\dagger}\)} & \multicolumn{3}{c|}{ EXECUTION SPEED IN MICROSECONDS \(^{\ddagger}\)} \\
\cline { 2 - 7 } & 'ACT8847 & MC68881 & 80387 & 'ACT8847 & MC68881 & 80387 \\
\hline Sine & 51 & 416 & 122 to 771 & 1.53 & 25.0 & 7.32 to 46.3 \\
\hline Cosine & 51 & 416 & 123 to 772 & 1.53 & 25.0 & 7.38 to 46.3 \\
\hline Tangent & 84 & 498 & 191 to 497 & 2.52 & 29.9 & 11.5 to 29.8 \\
\hline ArcSine & 68 & 606 & Not Avail. & 2.04 & 36.4 & Not Avail. \\
\hline ArcCosine & 68 & 650 & Not Avail. & 2.04 & 39.0 & Not Avail. \\
\hline ArcTangent & 104 & 428 & 314 to 487 & 3.12 & 25.7 & 18.8 to 29.2 \\
\hline Exponentiation & 52 & 522 & Not Avail. & 1.56 & 31.3 & Not Avail. \\
\hline
\end{tabular}
\({ }^{\dagger}\) For MC68881 cycle count refer to 'MC68881 Floating Point Coprocessor User's Manual', Document No. MC68881UM/AD, Page 6-13. For 80387 cycle count refer to ' 80387 Programmer's Reference Manual', Document No. 231917-001, Page E-36.
\(\ddagger\) 'ACT8847 cycle speed is \(30 \mathrm{~ns}, 33 \mathrm{MHz}\)
MC68881 cycle speed is \(60 \mathrm{~ns}, 16.6 \mathrm{MHz}\)
80387 cycle speed is \(40 \mathrm{~ns}, 25 \mathrm{MHz}\)

\section*{Overview of Chebyshev's Expansion Method}

If \(f(x)\) is continuous and of bounded variation over the interval \(-1 \leq x \leq 1\), then \(f(x)\) may be approximated by the following equation, in which the prime on the summation indicates that the \(r=0\) term is halved:
\[
\begin{aligned}
f(x) & =1 / 2 a_{0}+a_{1} T_{1}(x)+a_{2} T_{2}(x)+\ldots \\
& =\sum_{r=0}^{\infty}{ }^{\prime} a_{r} T_{r}(x)
\end{aligned}
\]

Note that the range for \(x\) is between -1 and 1 . For most functions, this restriction requires that the input, \(x\), be range reduced before the calculation begins. Range reducing an argument means to scale the argument down to a certain range. In the case of Chebyshev approximations, the range is usually \(-1 \leq x \leq 1\), or \(0 \leq x \leq 1\).

In the equation for \(f(x)\) above, the constants represented by \(a_{r}\) are known as Chebyshev coefficients. The functions represented by \(T_{r}\) are known as Chebyshev polynomials and can be derived from the following relationship and values:
\[
\begin{aligned}
& T_{r+1}(x)-2 x T_{r}(x)+T_{r-1}(x)=0 \\
& T_{0}(x)=1 \\
& T_{1}(x)=x
\end{aligned}
\]
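The recurrence can be applied mechanically to generate the polynomial coefficients. The following Python sketch is for illustration only; the function name `chebyshev_polynomials` and the list-of-coefficients representation (lowest power first) are assumptions made for this example.

```python
def chebyshev_polynomials(count):
    """Return coefficient lists (lowest power first) for T_0 .. T_(count-1)."""
    polys = [[1], [0, 1]]                    # T_0(x) = 1, T_1(x) = x
    while len(polys) < count:
        prev, cur = polys[-2], polys[-1]
        # T_(r+1)(x) = 2x * T_r(x) - T_(r-1)(x)
        nxt = [0] + [2 * c for c in cur]     # multiply T_r by 2x
        for i, c in enumerate(prev):         # subtract T_(r-1)
            nxt[i] -= c
        polys.append(nxt)
    return polys[:count]

for r, poly in enumerate(chebyshev_polynomials(7)):
    print("T%d:" % r, poly)
```

For example, the entry for \(T_{2}\) is `[-1, 0, 2]`, which represents \(2 x^{2}-1\).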

To illustrate Chebyshev's expansion method, the procedure to approximate function \(f(x)\) using the first seven polynomials is now covered. Let
\[
\begin{aligned}
f(x)= & 1 / 2 a_{0}+ \\
& a_{1} T_{1}(x)+ \\
& a_{2} T_{2}(x)+ \\
& a_{3} T_{3}(x)+ \\
& a_{4} T_{4}(x)+ \\
& a_{5} T_{5}(x)+ \\
& a_{6} T_{6}(x)
\end{aligned}
\]

Substituting in the expressions for the polynomials,
\[
\begin{aligned}
f(x)= & 1 / 2 a_{0}+ \\
& a_{1}(x)+ \\
& a_{2}\left(2 x^{2}-1\right)+ \\
& a_{3}\left(4 x^{3}-3 x\right)+ \\
& a_{4}\left(8 x^{4}-8 x^{2}+1\right)+ \\
& a_{5}\left(16 x^{5}-20 x^{3}+5 x\right)+ \\
& a_{6}\left(32 x^{6}-48 x^{4}+18 x^{2}-1\right)
\end{aligned}
\]

Rearranging the expression, by grouping powers of \(x\),
\[
\begin{aligned}
f(x)= & x^{0}\left(1 / 2 a_{0}-a_{2}+a_{4}-a_{6}\right)+ \\
& x^{1}\left(a_{1}-3 a_{3}+5 a_{5}\right)+ \\
& x^{2}\left(2 a_{2}-8 a_{4}+18 a_{6}\right)+ \\
& x^{3}\left(4 a_{3}-20 a_{5}\right)+ \\
& x^{4}\left(8 a_{4}-48 a_{6}\right)+ \\
& x^{5}\left(16 a_{5}\right)+ \\
& x^{6}\left(32 a_{6}\right)
\end{aligned}
\]

Next make the following substitutions:
\[
\text { Let } \begin{aligned}
c_{0} & =1 / 2 a_{0}-a_{2}+a_{4}-a_{6} \\
c_{1} & =a_{1}-3 a_{3}+5 a_{5} \\
c_{2} & =2 a_{2}-8 a_{4}+18 a_{6} \\
c_{3} & =4 a_{3}-20 a_{5} \\
c_{4} & =8 a_{4}-48 a_{6} \\
c_{5} & =16 a_{5} \\
c_{6} & =32 a_{6}
\end{aligned}
\]

Substituting the c's into the last equation for \(f(x)\),
\[
\begin{aligned}
f(x)= & c_{0} x^{0}+c_{1} x^{1}+c_{2} x^{2}+c_{3} x^{3}+ \\
& c_{4} x^{4}+c_{5} x^{5}+c_{6} x^{6}
\end{aligned}
\]

Applying Horner's Rule yields,
\[
f(x)=\left(\left(\left(\left(\left(c_{6} x+c_{5}\right) x+c_{4}\right) x+c_{3}\right) x+c_{2}\right) x+c_{1}\right) x+c_{0}
\]

In the remainder of the paper, the above equation will be referred to as \(\mathrm{C}_{\text {series }}\). Therefore,
\[
C_{\text {series\_f }}=f(x)=\left(\left(\left(\left(\left(c_{6} x+c_{5}\right) x+c_{4}\right) x+c_{3}\right) x+c_{2}\right) x+c_{1}\right) x+c_{0}
\]

The last step prior to approximating \(f(x)\) is to calculate the \(c\)'s by substituting the values for the Chebyshev coefficients into the equations for \(c_{0}\) through \(c_{6}\).
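The conversion from Chebyshev coefficients to the \(c\)'s, followed by Horner evaluation, can be sketched in Python as shown below. This is an illustration only: the function names are hypothetical, and the sample coefficients in `a` are arbitrary values chosen for the check, not constants for any real function. The sketch compares the Horner form against a direct evaluation of the seven-term Chebyshev sum.

```python
def power_coefficients(a):
    """Map a_0..a_6 to c_0..c_6 using the substitutions given above."""
    a0, a1, a2, a3, a4, a5, a6 = a
    return [0.5 * a0 - a2 + a4 - a6,
            a1 - 3 * a3 + 5 * a5,
            2 * a2 - 8 * a4 + 18 * a6,
            4 * a3 - 20 * a5,
            8 * a4 - 48 * a6,
            16 * a5,
            32 * a6]

def horner(c, x):
    """Evaluate c_0 + c_1*x + ... by Horner's Rule, highest power first."""
    result = 0.0
    for coeff in reversed(c):
        result = result * x + coeff
    return result

def chebyshev_sum(a, x):
    """Direct evaluation of a_0/2 + a_1*T_1(x) + ... using the recurrence."""
    t_prev, t_cur = 1.0, x
    total = 0.5 * a[0] + a[1] * x
    for r in range(2, len(a)):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
        total += a[r] * t_cur
    return total

# Arbitrary illustrative coefficients (not taken from any table).
a = [1.0, 0.5, -0.25, 0.125, -0.0625, 0.03125, -0.015625]
c = power_coefficients(a)
for x in (-1.0, -0.3, 0.0, 0.7, 1.0):
    print(x, horner(c, x), chebyshev_sum(a, x))
```

The Horner form needs one multiplication and one addition per coefficient, which is the pattern used by the multiply and add steps of the core calculation.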

\section*{Format for the Remainder of the Application Note}

Each of the seven functions will be covered in a separate section. Each section will include the following information:
1. General steps required to perform the calculation including a description of any preprocessing and/or postprocessing
2. An algorithm for each of the above steps
3. What system intervention, if any, is required; this intervention may take the form of branching based on comparison status generated by the 'ACT8847, or storing and then later retrieving intermediate results
4. The number of 'ACT8847 cycles required to calculate \(f(x)\)
5. A listing of the \(c\)'s
6. Pseudocode table showing how the calculation is accomplished. The pseudocode tables list the contents of all the relevant 'ACT8847 registers and buses for each instruction.
7. Microcode table listing the instructions

\section*{References}
[1] C. W. Clenshaw, G. F. Miller, and M. Woodger, "Algorithms for Special Functions I," Numerische Mathematik, Vol 4, 1963, pages 403 through 419.
[2] C. W. Clenshaw, "Chebyshev Series for Mathematical Functions," Vol 5 of the Mathematical Tables of the National Physical Laboratory, Department of Scientific and Industrial Research, England, 1960.

\section*{Cosine Routine Using Chebyshev's Method}

All floating point inputs and outputs are double precision. The input is in radians.

\section*{Steps Required to Perform the Calculation}

STEP 1 - Preprocessing; range-reduce the input, \(X\), to a range of \([-1,1]\). Next square this range-reduced value, multiply it by 2.0, and finally subtract 1.0. X3 is the range-reduced input value; it must be stored externally. 'TRUNC' means to truncate.
\[
\begin{aligned}
& \mathrm{X1} \leftarrow \mathrm{X} *(2.0 / \mathrm{pi}) \\
& \mathrm{X2} \leftarrow(4(\operatorname{TRUNC}(0.25(\mathrm{X1}+2.0))))-\mathrm{X1}+1.0 \\
& \text {If } \mathrm{X2}>1.0 \\
& \quad \text {Then } \mathrm{X3} \leftarrow 2.0-\mathrm{X2} \\
& \quad \text {Else } \mathrm{X3} \leftarrow \mathrm{X2} \\
& \mathrm{X4} \leftarrow 2.0 *(\mathrm{X3} * \mathrm{X3})-1.0
\end{aligned}
\]

STEP 2 - Core Calculation; X4 in Step 1 will be referred to as '\(x\)' in the core calculation.
\[
\begin{aligned}
\mathrm{X5} & \leftarrow C_{\text {series\_cos }} \\
& \leftarrow\left(\left(\left(\left(\left(\left(\left(c_{8} * x+c_{7}\right) * x+c_{6}\right) * x+c_{5}\right) * x+c_{4}\right) * x+c_{3}\right) * x+c_{2}\right) * x+c_{1}\right) * x+c_{0}
\end{aligned}
\]

STEP 3 - Postprocessing; multiply the output of the core calculation by X3.
\[
\operatorname{Cosine}(X) \leftarrow \mathrm{X5} * \mathrm{X3}
\]

\section*{Algorithms for the Three Steps}

Step 1 perform the preprocessing:
\begin{tabular}{ll}
T1 \(\leftarrow\) X * (2.0/pi) & 2.0/pi entered as a constant \\
T2 \(\leftarrow\) T1 + 2.0 & CREG \(\leftarrow\) T1 \\
T3 \(\leftarrow\) 0.25 * T2 and T4 \(\leftarrow\) 1.0 - CREG & T3 and T4 result from a chained instruction \\
T5 \(\leftarrow\) INT(T3) & round controls set to truncate \\
T6 \(\leftarrow\) 4 * T5 & CREG \(\leftarrow\) T4 \\
T7 \(\leftarrow\) DOUBLE(T6) & convert from integer to double \\
T8 \(\leftarrow\) T7 + CREG & \\
CMP (1.0, T8) & CREG \(\leftarrow\) T8 \\
If (1.0 < T8) Then T9 \(\leftarrow\) 2.0 - CREG Else T9 \(\leftarrow\) CREG & T9 is X3 in Step 1, must be stored externally \\
 & CREG \(\leftarrow\) T9 \\
T10 \(\leftarrow\) CREG * CREG & \\
T11 \(\leftarrow\) T10 * 2.0 & \\
T12 \(\leftarrow\) T11 - 1.0 & T12 is X4 in Step 1, the input to the core routine \\
\end{tabular}

Step 2 perform the core calculation:
```
T13 ← c8*CREG
T14 ← T13 + c7
T15 ← T14*CREG
T16 ← T15 + c6
T17 ← T16*CREG
T18 ← T17 + c5
T19 ← T18*CREG
T20 ← T19 + c4
T21 ← T20*CREG
T22 ← T21 + c3
T23 ← T22*CREG
T24 ← T23 + c2
T25 ← T24*CREG
T26 ← T25 + c1
T27 ← T26*CREG
T28 ← T27 + c0
```

Step 3 perform the postprocessing:
Cosine \((X) \leftarrow T 28 * T 9\)

\section*{Required System Intervention}

As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results of this compare determine which one of two calculations is to be performed. The system of which the 'ACT8847 is a part must decide which of the two calculations is to be performed. In addition, the system must store X3 and then later furnish X3 as an input to the 'ACT8847.

\section*{Number of 'ACT8847 Cycles Required to Calculate Cosine(x)}

Calculation of Cosine(\(x\)) requires 46 cycles. In addition, it is assumed that five additional cycles are required due to the compare instruction and resulting system intervention. Therefore, the total number of cycles to perform the Cosine\((x)\) calculation is 51.

\section*{Listing of the Chebyshev Constants (c's)}

The constants are represented in IEEE double-precision floating point format.
\[
\begin{aligned}
& c_{8}=\text {3D19D46B7D4C8F32} \\
& c_{7}=\text {BD962909C5C01ED6} \\
& c_{6}=\text {3E0D53517735F927} \\
& c_{5}=\text {BE7CC930FD0ADA9D} \\
& c_{4}=\text {3EE3E0AF61F7677F} \\
& c_{3}=\text {BF41E5FDEF25C403} \\
& c_{2}=\text {3F92A9FB40C119ED} \\
& c_{1}=\text {BFD23B03366AA0C9} \\
& c_{0}=\text {3FF4464BCC8CBA1F}
\end{aligned}
\]
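The three steps and the constants listed above can be exercised in software before committing them to microcode. The following Python sketch is an illustration under stated assumptions, not the TIM assembly routine: the function names are hypothetical, the hex strings are read as big-endian IEEE double-precision values, TRUNC is modeled with `math.trunc`, and only non-negative inputs are exercised here.

```python
import math
import struct

# c0 .. c8 as IEEE double-precision hex strings, copied from the listing.
C_HEX = ["3FF4464BCC8CBA1F", "BFD23B03366AA0C9", "3F92A9FB40C119ED",
         "BF41E5FDEF25C403", "3EE3E0AF61F7677F", "BE7CC930FD0ADA9D",
         "3E0D53517735F927", "BD962909C5C01ED6", "3D19D46B7D4C8F32"]

def hex_to_double(text):
    """Decode a 16-digit hex string as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(text))[0]

C = [hex_to_double(h) for h in C_HEX]

def chebyshev_cosine(x):
    """Follow Steps 1 to 3 of the cosine routine for an input in radians."""
    # Step 1: preprocessing (range reduction).
    x1 = x * (2.0 / math.pi)
    x2 = 4 * math.trunc(0.25 * (x1 + 2.0)) - x1 + 1.0
    x3 = 2.0 - x2 if x2 > 1.0 else x2
    x4 = 2.0 * (x3 * x3) - 1.0
    # Step 2: core calculation by Horner's Rule, c8 first.
    x5 = 0.0
    for coeff in reversed(C):
        x5 = x5 * x4 + coeff
    # Step 3: postprocessing.
    return x5 * x3

for angle in (0.0, 1.0, math.pi / 2, math.pi, 6.0):
    print(angle, chebyshev_cosine(angle), math.cos(angle))
```

Comparing against `math.cos` in this way gives a quick check that the constants were transcribed correctly.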

\section*{Pseudocode Table for the Cosine(x) Calculation}

Table 56. Pseudocode for Chebyshev Cosine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\))
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|l|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 1 & X MSH & X LSH & & & 0 & RA2*RB2 & & & & & & & X is the input \\
\hline 2 & 2DIVPI MSH & 2DIVPI LSH & X & 2DIVPI & 0 & RA2*RB2 & & & & & & & 2DIVPI is a constant representing 2.0/pi \\
\hline 3 & 1.0 MSH & 1.0 LSH & X & 2DIVPI & 0 & PR4 + RB4 & RA2*RB2 & & & & & & Preload RA with 1.0 for use in cycles 5 and 11 \\
\hline 4 & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 & 0 & PR4 + RB4 & & & P1 & & & & \\
\hline 5 & 0.25 MSH & 0.25 LSH & 1.0 & 0.25 & 1 & \begin{tabular}{l}
SR5*RB5 \\
RA5-CR5
\end{tabular} & & & & P1 & S1 & & \\
\hline 6 & & & 1.0 & 0.25 & 0 & DP2I(PR7) & SR5*RB5 & RA5-CR5 & & & & & Double precision \(\rightarrow\) integer \\
\hline 7 & & & 1.0 & 0.25 & 0 & DP2I(PR7) & & & P2 & & S2 & & Cycles 6, 7 set RND1-0 \(=01\) \\
\hline 8 & & 4 & 1.0 & 4 & 0 & SR8*RB8 & & & & S2 & S3 & & \\
\hline 9 & & & 1.0 & 4 & 1 & I2DP(PR9) & & & P3 & & & & Integer \(\rightarrow\) double precision \\
\hline 10 & & & 1.0 & 4 & 1 & CR10+SR10 & & & & & S4 & & \\
\hline 11 & & & 1.0 & 4 & 1 & \begin{tabular}{l}
COMPARE \\
RA11,SR11
\end{tabular} & & & & & S5 & & \begin{tabular}{l}
If SR11 \(>\) RA11 then 13a \\
If SR11 \(\leq\) RA11 then 13b
\end{tabular} \\
\hline 12 & & & 1.0 & 4 & 0 & NOP & & & & S5 & & & Wait for system response \\
\hline 13a & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 & 1 & RB13-CR13 & & & & & & & Execute 13a or 13b \\
\hline 13b & & & 1.0 & 4 & 1 & PASS(CR13) & & & & & & & Pass contents of CREG \\
\hline 14 & & & 1.0 & 2.0 or 4 & 1 & CR14*CR14 & & & & & S6 & S6 & S6 is either RB13-CR13 or CR13 from PASS CR13, and must be stored externally for use in cycle 43 \\
\hline 15 & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 or 4 & 0 & RA16*PR16 & CR14*CR14 & & & S6 & & S6 & Output S6 in cycles 14 and 15 \\
\hline 16 & & & 2.0 & 2.0 or 4 & 0 & RA16*PR16 & & & P4 & & & & \\
\hline
\end{tabular}


Table 56. Pseudocode for Chebyshev Cosine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\)) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|l|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 17 & & & 2.0 & 2.0 or 4 & 0 & PR18+RB18 & RA16*PR16 & & & & & & \\
\hline 18 & -1.0 MSH & -1.0 LSH & 2.0 & -1.0 & 0 & PR18+RB18 & & & P5 & & & & \\
\hline 19 & \(c_{8}\) MSH & \(c_{8}\) LSH & 2.0 & \(c_{8}\) & 1 & SR19*RB19 & & & & & S7 & & Start core calculation \\
\hline 20 & & & 2.0 & \(c_{8}\) & 0 & PR21+RB21 & SR19*RB19 & & & S7 & & & S7 is input to core calc. \\
\hline 21 & \(c_{7}\) MSH & \(c_{7}\) LSH & 2.0 & \(c_{7}\) & 0 & PR21+RB21 & & & P6 & & & & \\
\hline 22 & & & 2.0 & \(c_{7}\) & 1 & SR22*CR22 & & & & & S8 & & \\
\hline 23 & & & 2.0 & \(c_{7}\) & 0 & PR24+RB24 & SR22*CR22 & & & & & & \\
\hline 24 & \(c_{6}\) MSH & \(c_{6}\) LSH & 2.0 & \(c_{6}\) & 0 & PR24+RB24 & & & P7 & & & & \\
\hline 25 & & & 2.0 & \(c_{6}\) & 1 & SR25*CR25 & & & & & S9 & & \\
\hline 26 & & & 2.0 & \(c_{6}\) & 0 & PR27+RB27 & SR25*CR25 & & & & & & \\
\hline 27 & \(c_{5}\) MSH & \(c_{5}\) LSH & 2.0 & \(c_{5}\) & 0 & PR27+RB27 & & & P8 & & & & \\
\hline 28 & & & 2.0 & \(c_{5}\) & 1 & SR28*CR28 & & & & & S10 & & \\
\hline 29 & & & 2.0 & \(c_{5}\) & 0 & PR30+RB30 & SR28*CR28 & & & & & & \\
\hline 30 & \(c_{4}\) MSH & \(c_{4}\) LSH & 2.0 & \(c_{4}\) & 0 & PR30+RB30 & & & P9 & & & & \\
\hline 31 & & & 2.0 & \(c_{4}\) & 1 & SR31*CR31 & & & & & S11 & & \\
\hline 32 & & & 2.0 & \(c_{4}\) & 0 & PR33+RB33 & SR31*CR31 & & & & & & \\
\hline 33 & \(c_{3}\) MSH & \(c_{3}\) LSH & 2.0 & \(c_{3}\) & 0 & PR33+RB33 & & & P10 & & & & \\
\hline 34 & & & 2.0 & \(c_{3}\) & 1 & SR34*CR34 & & & & & S12 & & \\
\hline 35 & & & 2.0 & \(c_{3}\) & 0 & PR36+RB36 & SR34*CR34 & & & & & & \\
\hline 36 & \(c_{2}\) MSH & \(c_{2}\) LSH & 2.0 & \(c_{2}\) & 0 & PR36+RB36 & & & P11 & & & & \\
\hline
\end{tabular}

Table 56. Pseudocode for Chebyshev Cosine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) ) (Concluded)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 37 & & & 2.0 & \(\mathrm{c}_{2}\) & 1 & SR37*CR37 & & & & & S13 & & \\
\hline 38 & & & 2.0 & \(\mathrm{c}_{2}\) & 0 & PR39 + RB39 & SR37 * CR37 & & & & & & \\
\hline 39 & \(\mathrm{c}_{1} \mathrm{MSH}\) & \(c_{1}\) LSH & 2.0 & \(\mathrm{c}_{1}\) & 0 & PR39 + RB39 & & & P12 & & & & \\
\hline 40 & & & 2.0 & \(\mathrm{c}_{1}\) & 1 & SR40*CR40 & & & & & S14 & & \\
\hline 41 & & & 2.0 & \(\mathrm{c}_{1}\) & 0 & PR42 + RB42 & SR40*CR40 & & & & & & \\
\hline 42 & \(c_{0}\) MSH & \(c_{0}\) LSH & 2.0 & \(c_{0}\) & 0 & PR42 + RB42 & & & P13 & & & & \\
\hline 43 & S6 MSH & S6 LSH & 2.0 & S6 & 1 & SR43*RB43 & & & & & S15 & & Begin postprocessing \\
\hline 44 & & & 2.0 & S6 & 0 & DUMMY & SR43*RB43 & & & & & & Instruction is double-precision RA + RB; allows time for the answer to propagate to the Y bus \\
\hline 45 & & & 2.0 & S6 & 0 & NOP & & & P14 & & & P14 & Output MSH of answer \\
\hline 46 & & & 2.0 & S6 & 0 & NOP & & & P14 & & & P14 & Output LSH of answer \\
\hline
\end{tabular}


\section*{Microcode Table for the Cosine(x) Calculation}

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode table, the value of \(X\) has been chosen to be \(\frac{1}{2}\) pi.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline P & D & D & P & E E & E & C & P & C & C & & S & \(\overline{\mathrm{R}}\) & \(\bar{H}\) & E & F & & 1 & & & F & S & B & & & & \(\overline{0} \overline{0}\) \\
\hline A & A & B & B & N N & N & L & 1 & L & 0 & & E & E & A & N & L & & N & & & A & R & & & & & E E E \\
\hline & & & & A B & B & K & P & K & N & & L & S & L & C & 0 & & S & & D & S & C & L & S & & & \(Y\) S C \\
\hline & & & & & & C & E & M & M & & 0 & E & T & & W & & T & & & & C & S & T & & Y & \\
\hline & & & & & & & S & \[
\begin{aligned}
& 0 \\
& 0
\end{aligned}
\] & O 1 & & P & T & & & C & & R & & & & & & & & & \\
\hline
\end{tabular}
\begin{tabular}{lll} 
F & 3FF921FB \\
F & 3FE45F30 \\
F & 3FF00000 \\
F & 40000000 \\
F & 3FD00000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 00000000 \\
F & 40000000 \\
F & 00000000 \\
F & 00000000 \\
F & BFF00000 \\
F & 3D19D46B \\
F & 00000000 \\
F & BD962909
\end{tabular}
\(\left.\begin{array}{llllllllllllllllllllllll}54442 D 18 & F & 0 & 0 & - & 2 & 0 & 3 & F F & 1 & 1 & 1 & 0 & 1 C 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1\end{array}\right)\)

\section*{Microcode Table for the Cosine(x) Calculation (Continued)}



\section*{Sine Routine Using Chebyshev's Method}

All floating point inputs and outputs are double precision. The input is in radians.

\section*{Steps Required to Perform the Calculation}

STEP 1 - Preprocessing; range reduce the input, \(X\), to a range of \([-1,1]\). Next square this range-reduced value, multiply it by 2.0, and finally subtract 1.0. X3 is the range-reduced input value; it must be stored externally. 'TRUNC' means to truncate.

```
X1 ← X * (2.0/pi)
X2 ← X1 - (4 * (TRUNC(0.25 * (X1 + 1.0))))
If X2 > 1.0
    Then X3 ← 2.0 - X2
    Else X3 ← X2
X4 ← 2.0 * (X3 * X3) - 1.0
```

STEP 2 - Core calculation; X4 in Step 1 will be referred to as '\(x\)' in the core calculation.

```
X5 ← C_series_sin
   ← (((((((c8 * x + c7) * x + c6) * x + c5) * x +
     c4) * x + c3) * x + c2) * x + c1) * x + c0
```

STEP 3 - Postprocessing; multiply the output of the core calculation by X3.

Sine\((X) \leftarrow X5 * X3\)
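The range reduction in Step 1 can be checked in software before committing it to microcode. The following is a minimal Python sketch, not part of the original routine; the function name `reduce_sine_argument` is hypothetical. It assumes TRUNC truncates toward zero, and it only exercises non-negative inputs, because with truncation toward zero these formulas do not bring inputs with X1 < -1 into \([-1,1]\).

```python
import math

def reduce_sine_argument(x):
    """Step 1 of the sine routine: return (X3, X4) for an input X in radians."""
    x1 = x * (2.0 / math.pi)
    # TRUNC is modeled with math.trunc, which truncates toward zero.
    x2 = x1 - 4.0 * math.trunc(0.25 * (x1 + 1.0))
    x3 = 2.0 - x2 if x2 > 1.0 else x2
    x4 = 2.0 * (x3 * x3) - 1.0
    return x3, x4

for x in (0.3, 1.0, 2.5, 4.0, 6.0):
    x3, x4 = reduce_sine_argument(x)
    # After reduction, sin(X) should equal sin((pi/2) * X3).
    residual = math.sin(x) - math.sin(0.5 * math.pi * x3)
    print(x, x3, x4, residual)
```

For each of these inputs X3 and X4 both land in \([-1,1]\), and the printed residual is at the level of double-precision rounding.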

\section*{Algorithms for the Three Steps}

Step 1 perform the preprocessing:
\begin{tabular}{|c|c|}
\hline T1 \(\leftarrow\) X * (2.0/pi) & 2.0/pi entered as a constant \\
\hline T2 \(\leftarrow\) T1 + 1.0 & \\
\hline T3 \(\leftarrow\) 0.25 * T2 & CREG \(\leftarrow\) T1 \\
\hline T4 \(\leftarrow\) INT(T3) & round controls set to truncate \\
\hline T5 \(\leftarrow\) 4 * T4 & \\
\hline T6 \(\leftarrow\) DOUBLE(T5) & convert from integer to double \\
\hline \multicolumn{2}{|l|}{T7 \(\leftarrow\) CREG - T6} \\
\hline CMP(1.0, T7) & compare 1.0 to T7 \\
\hline If (T7 > 1.0) & CREG \(\leftarrow\) T7 \\
\hline Then T8 \(\leftarrow\) 2.0 - CREG & T8 is X3 in Step 1; it must be stored externally \\
\hline Else T8 \(\leftarrow\) CREG & \\
\hline & CREG \(\leftarrow\) T8 \\
\hline \multicolumn{2}{|l|}{T9 \(\leftarrow\) CREG * CREG} \\
\hline \multicolumn{2}{|l|}{T10 \(\leftarrow\) T9 * 2.0} \\
\hline T11 \(\leftarrow\) T10 - 1.0 & T11 is X4 in Step 1 above, the input to the core routine \\
\hline & T11 = '\(x\)' from Step 2 above \\
\hline
\end{tabular}

Step 2 perform the core calculation:
\begin{tabular}{|c|c|}
\hline T12 \(\leftarrow\) \(c_{8}\) * CREG & CREG \(\leftarrow\) T11 \\
\hline T13 \(\leftarrow\) T12 + \(c_{7}\) & \\
\hline T14 \(\leftarrow\) T13 * CREG & \\
\hline T15 \(\leftarrow\) T14 + \(c_{6}\) & \\
\hline T16 \(\leftarrow\) T15 * CREG & \\
\hline T17 \(\leftarrow\) T16 + \(c_{5}\) & \\
\hline T18 \(\leftarrow\) T17 * CREG & \\
\hline T19 \(\leftarrow\) T18 + \(c_{4}\) & \\
\hline T20 \(\leftarrow\) T19 * CREG & \\
\hline T21 \(\leftarrow\) T20 + \(c_{3}\) & \\
\hline T22 \(\leftarrow\) T21 * CREG & \\
\hline T23 \(\leftarrow\) T22 + \(c_{2}\) & \\
\hline T24 \(\leftarrow\) T23 * CREG & \\
\hline T25 \(\leftarrow\) T24 + \(c_{1}\) & \\
\hline T26 \(\leftarrow\) T25 * CREG & \\
\hline T27 \(\leftarrow\) T26 + \(c_{0}\) & \\
\hline
\end{tabular}

Step 3 perform the postprocessing:
\[
\text { Sine }(X) \leftarrow T 27 * T 8
\]
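The alternating multiply and add steps of the core calculation are Horner's rule. As an illustration only, the Python sketch below records every intermediate value in the order the algorithm produces them; `horner_trace` is a hypothetical helper name, and the coefficients in the example are toy values, not the Chebyshev constants.

```python
def horner_trace(coeffs_high_to_low, x):
    """Evaluate a polynomial by Horner's rule, recording each intermediate value."""
    trace = []
    acc = coeffs_high_to_low[0]
    for c in coeffs_high_to_low[1:]:
        product = acc * x   # multiply step (T12, T14, ... in the sine routine)
        trace.append(product)
        acc = product + c   # add step (T13, T15, ..., T27 in the sine routine)
        trace.append(acc)
    return acc, trace

# Toy example: 2x^2 + 3x + 4 evaluated at x = 5.
value, trace = horner_trace([2.0, 3.0, 4.0], 5.0)
print(value)  # 69.0
print(trace)  # [10.0, 13.0, 65.0, 69.0]
```

With the nine sine coefficients the trace has 16 entries, which correspond to T12 through T27.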

\section*{Required System Intervention}

As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results of this compare determine which one of two calculations is to be performed. The system of which the 'ACT8847 is a part must decide which of the two calculations is to be performed. In addition, the system must store X3 and then later furnish X3 as an input to the 'ACT8847.

\section*{Number of 'ACT8847 Cycles Required to Calculate Sine(x)}

Calculation of Sine(x) requires 46 cycles. In addition, it is assumed that five additional cycles are required due to the compare instruction and resulting system intervention. Therefore, the total number of cycles to perform the Sine \((x)\) calculation is 51 .
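The 46-cycle figure can be reconciled with the cycle numbers in the pseudocode table (Table 57). The breakdown below is a reading of that table rather than a statement from the manual: preprocessing occupies cycles 1 to 18, each of the eight multiply-add pairs of the core calculation occupies three cycles (cycles 19 to 42), and postprocessing occupies cycles 43 to 46. The variable names are illustrative.

```python
PREPROCESS_CYCLES = 18     # cycles 1-18 (13a and 13b share one cycle slot)
CORE_MULTIPLY_ADDS = 8     # one multiply-add pair per constant c7 ... c0
CYCLES_PER_PAIR = 3        # e.g. cycles 19-21 produce c8*x + c7
POSTPROCESS_CYCLES = 4     # cycles 43-46
INTERVENTION_CYCLES = 5    # assumed allowance for the compare and system response

routine_cycles = (PREPROCESS_CYCLES
                  + CORE_MULTIPLY_ADDS * CYCLES_PER_PAIR
                  + POSTPROCESS_CYCLES)
total_cycles = routine_cycles + INTERVENTION_CYCLES

print(routine_cycles)  # 46
print(total_cycles)    # 51
```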

\section*{Listing of the Chebyshev Constants (c's)}

The constants are represented in IEEE double-precision floating point format.
\[
\begin{aligned}
& c_{8}=\text{3D19D46B7D4C8F32} \\
& c_{7}=\text{BD962909C5C01ED6} \\
& c_{6}=\text{3E0D53517735F927} \\
& c_{5}=\text{BE7CC930FD0ADA9D} \\
& c_{4}=\text{3EE3E0AF61F7677F} \\
& c_{3}=\text{BF41E5FDEF25C403} \\
& c_{2}=\text{3F92A9FB40C119ED} \\
& c_{1}=\text{BFD23B03366AA0C9} \\
& c_{0}=\text{3FF4464BCC8CBA1F}
\end{aligned}
\]
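As a cross-check, the three steps and the constants above can be modeled in software. The Python sketch below is an illustration under stated assumptions, not the 'ACT8847 microcode: the hex strings are transcribed from the listing (reading any letter O as the digit 0), TRUNC is taken to truncate toward zero, and the names `hex_to_double` and `chebyshev_sine` are hypothetical. Only non-negative inputs are exercised.

```python
import math
import struct

# c8 ... c0, transcribed from the listing of Chebyshev constants.
SINE_CONSTANTS_HEX = [
    "3D19D46B7D4C8F32", "BD962909C5C01ED6", "3E0D53517735F927",
    "BE7CC930FD0ADA9D", "3EE3E0AF61F7677F", "BF41E5FDEF25C403",
    "3F92A9FB40C119ED", "BFD23B03366AA0C9", "3FF4464BCC8CBA1F",
]

def hex_to_double(hex_string):
    """Interpret 16 hex digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_string))[0]

SINE_COEFFS = [hex_to_double(h) for h in SINE_CONSTANTS_HEX]

def chebyshev_sine(x):
    # Step 1: preprocessing (range reduction).
    x1 = x * (2.0 / math.pi)
    x2 = x1 - 4.0 * math.trunc(0.25 * (x1 + 1.0))
    x3 = 2.0 - x2 if x2 > 1.0 else x2
    x4 = 2.0 * (x3 * x3) - 1.0
    # Step 2: core calculation by Horner's rule, starting from c8.
    acc = SINE_COEFFS[0]
    for c in SINE_COEFFS[1:]:
        acc = acc * x4 + c
    # Step 3: postprocessing.
    return acc * x3

for x in (0.0, 0.3, 1.0, math.pi / 2, 2.5, 4.0, 6.0):
    print(x, chebyshev_sine(x), math.sin(x))
```

Comparing the two printed columns gives a quick indication of whether the constants were entered correctly.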


\section*{Pseudocode Table for the Sine(x) Calculation}

Table 57. Pseudocode for Chebyshev Sine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\))
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 1 & X MSH & X LSH & & & 0 & RA2*RB2 & & & & & & & X is the input \\
\hline 2 & 2DIVPI MSH & 2DIVPI LSH & X & 2DIVPI & 0 & RA2*RB2 & & & & & & & 2DIVPI is a constant representing 2.0/pi \\
\hline 3 & & & X & 2DIVPI & 0 & PR4 + RB4 & RA2*RB2 & & & & & & \\
\hline 4 & 1.0 MSH & 1.0 LSH & X & 1.0 & 0 & PR4 + RB4 & & & P1 & & & & \\
\hline 5 & 0.25 MSH & 0.25 LSH & X & 0.25 & 1 & SR5*RB5 & & & & P1 & S1 & & \\
\hline 6 & 1.0 MSH & 1.0 LSH & X & 0.25 & 0 & DP2I(PR7) & SR5*RB5 & & & & & & Double precision \(\rightarrow\) integer \\
\hline 7 & & & 1.0 & 0.25 & 0 & DP2I(PR7) & & & P2 & & & & Cycles 6, 7 set RND1-0 \(=01\) \\
\hline 8 & & 4 & 1.0 & 4 & 0 & SR8*RB8 & & & & & S2 & & \\
\hline 9 & & & 1.0 & 4 & 1 & I2DP(PR9) & & & P3 & & & & Integer \(\rightarrow\) double precision \\
\hline 10 & & & 1.0 & 4 & 1 & CR10 - SR10 & & & & & S3 & & \\
\hline 11 & & & 1.0 & 4 & 1 & COMPARE RA11, SR11 & & & & & S4 & & If SR11 \(>\) RA11 then 13a; if SR11 \(\leq\) RA11 then 13b \\
\hline 12 & & & 1.0 & 4 & 0 & NOP & & & & S4 & & & Wait for system response \\
\hline 13a & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 & 1 & RB13 - CR13 & & & & & & & Execute 13a or 13b \\
\hline 13b & & & 1.0 & 4 & 1 & PASS(CR13) & & & & & & & Pass contents of CREG \\
\hline 14 & & & 1.0 & 2.0 or 4 & 1 & CR14*CR14 & & & & & S5 & S5 & S5 is either RB13 - CR13 or CR13 from PASS CR13, and must be stored externally for use in cycle 43 \\
\hline 15 & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 or 4 & 0 & RA16*PR16 & CR14*CR14 & & & S5 & & S5 & Output S5 in cycles 14 and 15 \\
\hline 16 & & & 2.0 & 2.0 or 4 & 0 & RA16*PR16 & & & P4 & & & & \\
\hline 17 & & & 2.0 & 2.0 or 4 & 0 & PR18 + RB18 & RA16*PR16 & & & & & & \\
\hline 18 & -1.0 MSH & -1.0 LSH & 2.0 & -1.0 & 0 & PR18 + RB18 & & & P5 & & & & \\
\hline
\end{tabular}

Table 57. Pseudocode for Chebyshev Sine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) ) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 19 & \(c_{8}\) MSH & \(c_{8}\) LSH & 2.0 & \(c_{8}\) & 1 & SR19*RB19 & & & & & S6 & & Start core calculation \\
\hline 20 & & & 2.0 & \(c_{8}\) & 0 & PR21 + RB21 & SR19*RB19 & & & S6 & & & S6 is input to core calc. \\
\hline 21 & \(c_{7}\) MSH & \(c_{7}\) LSH & 2.0 & \(c_{7}\) & 0 & PR21 + RB21 & & & P6 & & & & \\
\hline 22 & & & 2.0 & \(c_{7}\) & 1 & SR22*CR22 & & & & & S7 & & \\
\hline 23 & & & 2.0 & \(c_{7}\) & 0 & PR24 + RB24 & SR22*CR22 & & & & & & \\
\hline 24 & \(c_{6}\) MSH & \(c_{6}\) LSH & 2.0 & \(c_{6}\) & 0 & PR24 + RB24 & & & P7 & & & & \\
\hline 25 & & & 2.0 & \(c_{6}\) & 1 & SR25*CR25 & & & & & S8 & & \\
\hline 26 & & & 2.0 & \(c_{6}\) & 0 & PR27 + RB27 & SR25*CR25 & & & & & & \\
\hline 27 & \(c_{5}\) MSH & \(c_{5}\) LSH & 2.0 & \(c_{5}\) & 0 & PR27 + RB27 & & & P8 & & & & \\
\hline 28 & & & 2.0 & \(c_{5}\) & 1 & SR28*CR28 & & & & & S9 & & \\
\hline 29 & & & 2.0 & \(c_{5}\) & 0 & PR30 + RB30 & SR28*CR28 & & & & & & \\
\hline 30 & \(c_{4}\) MSH & \(c_{4}\) LSH & 2.0 & \(c_{4}\) & 0 & PR30 + RB30 & & & P9 & & & & \\
\hline 31 & & & 2.0 & \(c_{4}\) & 1 & SR31*CR31 & & & & & S10 & & \\
\hline 32 & & & 2.0 & \(c_{4}\) & 0 & PR33 + RB33 & SR31*CR31 & & & & & & \\
\hline 33 & \(c_{3}\) MSH & \(c_{3}\) LSH & 2.0 & \(c_{3}\) & 0 & PR33 + RB33 & & & P10 & & & & \\
\hline 34 & & & 2.0 & \(c_{3}\) & 1 & SR34*CR34 & & & & & S11 & & \\
\hline 35 & & & 2.0 & \(c_{3}\) & 0 & PR36 + RB36 & SR34*CR34 & & & & & & \\
\hline 36 & \(c_{2}\) MSH & \(c_{2}\) LSH & 2.0 & \(c_{2}\) & 0 & PR36 + RB36 & & & P11 & & & & \\
\hline 37 & & & 2.0 & \(c_{2}\) & 1 & SR37*CR37 & & & & & S12 & & \\
\hline 38 & & & 2.0 & \(c_{2}\) & 0 & PR39 + RB39 & SR37*CR37 & & & & & & \\
\hline 39 & \(c_{1}\) MSH & \(c_{1}\) LSH & 2.0 & \(c_{1}\) & 0 & PR39 + RB39 & & & P12 & & & & \\
\hline 40 & & & 2.0 & \(c_{1}\) & 1 & SR40*CR40 & & & & & S13 & & \\
\hline
\end{tabular}


Table 57. Pseudocode for Chebyshev Sine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\)) (Concluded)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 41 & & & 2.0 & \(c_{1}\) & 0 & PR42 + RB42 & SR40*CR40 & & & & & & \\
\hline 42 & \(c_{0}\) MSH & \(c_{0}\) LSH & 2.0 & \(c_{0}\) & 0 & PR42 + RB42 & & & P13 & & & & \\
\hline 43 & S5 MSH & S5 LSH & 2.0 & S5 & 1 & SR43*RB43 & & & & & S14 & & Begin postprocessing \\
\hline 44 & & & 2.0 & S5 & 0 & DUMMY & SR43*RB43 & & & & & & Instruction is double-precision RA + RB; allows time for the answer to propagate to the Y bus \\
\hline 45 & & & 2.0 & S5 & 0 & NOP & & & P14 & & & P14 & Output MSH of answer \\
\hline 46 & & & 2.0 & S5 & 0 & NOP & & & P14 & & & P14 & Output LSH of answer \\
\hline
\end{tabular}

\section*{Microcode Table for the Sine(x) Calculation}

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode table, the value of \(X\) has been chosen to be \(1 / 2 \mathrm{pi}\).
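The hex conventions used in the microcode table can be reproduced with a few lines of Python. This sketch is illustrative; the helper names `double_to_hex_words` and `hex_field` are hypothetical. It shows how a double-precision operand splits into its MSH and LSH bus words, and how a field whose width is not a multiple of 4 bits is right justified and zero filled.

```python
import math
import struct

def double_to_hex_words(value):
    """Return the (MSH, LSH) 32-bit halves of an IEEE double as hex strings."""
    raw = struct.pack(">d", value).hex().upper()
    return raw[:8], raw[8:]

def hex_field(value, width_bits):
    """Format a field as hex, right justified and zero filled to whole digits."""
    digits = (width_bits + 3) // 4
    return format(value, "0{}X".format(digits))

print(double_to_hex_words(math.pi / 2))  # ('3FF921FB', '54442D18')
print(double_to_hex_words(1.0))          # ('3FF00000', '00000000')
print(hex_field(0b010, 3))               # '2'  (PIPES2-0 = 010)
```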



Microcode Table for the Sine(x) Calculation (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & D & D & & & E C & & C & & S & \(\bar{R}\) & \(\bar{H}\) & & F & & & & & & & & & & & \\
\hline A & A & B & B & N & N L & 1 & L & 0 & E & E & A & N & L & N & N & A & R & Y & E & E & \[
E
\] & & E & \\
\hline & & & & A & B K & P & K & N & L & S & L & C & 0 & S & D & S & C & T & L & S & & & S & C \\
\hline & & & & & C & E & M & F & 0 & E & T & & W & T & & & C & E & S & T & & & & \\
\hline & & & & & & S & 0 & & P & T & & & C & R & & & & P & T & & & & & \\
\hline & & & & & & & D & G & & & & & & & & & & & & & & & & \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3E0D5351 & 7735F927 & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & BE7CC930 & FD0ADA9D & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3EE3E0AF & 61F7677F & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline  & 00000000 & 00000000 & F & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & BF41E5FD & EF25C403 & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3F92A9FB & 40C119ED & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & － & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & BFD23B03 & 366AA0C9 & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline F & 3FF4464B & CC8CBA1F & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline F & 3FF00000 & 00000000 & F & 0 & 1 & 2 & 1 & 3 & BF & & 1 & & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & & \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline & 00000000 & 00000000 & & & 0 & & & & FF & & & & & 300 & & & & & & & & & 0 & \\
\hline
\end{tabular}

\section*{Tangent Routine Using Chebyshev's Method}

All floating point inputs and outputs are double precision. The input is in radians.

\section*{Steps Required to Perform the Calculation}

STEP 1 - Preprocessing; range reduce the input, \(X\), to a range of \([-1,1]\). Next square this range-reduced value, multiply it by 2.0, and finally subtract 1.0. X3 is the range-reduced input value; it must be stored externally. 'TRUNC' means to truncate. If X2 \(>1.0\), then in the postprocessing part of the routine, the answer is the reciprocal of \(X5 * X3\).

```
X1 ← X * (4.0/pi)
X2 ← X1 - (4 * (TRUNC(0.25 * (X1 + 1.0))))
If X2 > 1.0
    Then X3 ← 2.0 - X2
    Else X3 ← X2
X4 ← 2.0 * (X3 * X3) - 1.0
```

STEP 2 - Core Calculation; \(X4\) in Step 1 will be referred to as '\(x\)' in the core calculation.
\[
\begin{aligned}
X5 \leftarrow C_{\text{series\_tan}} \leftarrow {} & (((((((((((((c_{14} * x + c_{13}) * x + c_{12}) * x + c_{11}) * x + c_{10}) * x + c_{9}) * x + c_{8}) * x \\
& + c_{7}) * x + c_{6}) * x + c_{5}) * x + c_{4}) * x + c_{3}) * x + c_{2}) * x + c_{1}) * x + c_{0}
\end{aligned}
\]

STEP 3 - Postprocessing; multiply the output of the core calculation by X3. If \(X2>1.0\), the reciprocal of \(X5 * X3\) is the answer; if \(X2 \leq 1.0\), \(X5 * X3\) is the answer.

Tangent \((X) \leftarrow X 5 * X 3\) (or reciprocal of \(X 5 * X 3\) )
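The reciprocal in Step 3 follows from the identity tan(pi/2 - a) = 1/tan(a): when X2 > 1.0, the reduced value X3 = 2.0 - X2 represents the complementary angle. The Python sketch below illustrates this; it is not the manual's code. The function names are hypothetical, `math.tan` stands in for the product X5 * X3, TRUNC is assumed to truncate toward zero, and only non-negative inputs are exercised.

```python
import math

def reduce_tangent_argument(x):
    """Step 1 of the tangent routine: return (X3, X4, take_reciprocal)."""
    x1 = x * (4.0 / math.pi)
    x2 = x1 - 4.0 * math.trunc(0.25 * (x1 + 1.0))
    take_reciprocal = x2 > 1.0
    x3 = 2.0 - x2 if take_reciprocal else x2
    x4 = 2.0 * (x3 * x3) - 1.0
    return x3, x4, take_reciprocal

def tangent_from_reduced(x3, take_reciprocal):
    """Step 3, with math.tan standing in for the core result X5 * X3."""
    t = math.tan(0.25 * math.pi * x3)
    return 1.0 / t if take_reciprocal else t

for x in (0.3, 1.0, 2.0, 4.0):
    x3, x4, flag = reduce_tangent_argument(x)
    print(x, x3, flag, tangent_from_reduced(x3, flag), math.tan(x))
```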

\section*{Algorithms for the Three Steps}

Step 1 perform the preprocessing:
\begin{tabular}{|c|c|}
\hline T1 \(\leftarrow\) X * (4.0/pi) & 4.0/pi entered as a constant \\
\hline T2 \(\leftarrow\) T1 + 1.0 & \\
\hline T3 \(\leftarrow\) 0.25 * T2 & CREG \(\leftarrow\) T1 \\
\hline T4 \(\leftarrow\) INT(T3) & round controls set to truncate \\
\hline T5 \(\leftarrow\) 4 * T4 & \\
\hline T6 \(\leftarrow\) DOUBLE(T5) & convert from integer to double \\
\hline T7 \(\leftarrow\) CREG - T6 & \\
\hline \multicolumn{2}{|l|}{CMP(1.0, T7)} \\
\hline If (T7 > 1.0) & CREG \(\leftarrow\) T7 \\
\hline Then T8 \(\leftarrow\) 2.0 - CREG & T8 is X3 in Step 1; it must be stored externally \\
\hline Else T8 \(\leftarrow\) CREG & \\
\hline & CREG \(\leftarrow\) T8 \\
\hline T9 \(\leftarrow\) CREG * CREG & \\
\hline T10 \(\leftarrow\) T9 * 2.0 & \\
\hline T11 \(\leftarrow\) T10 - 1.0 & T11 is X4 in Step 1, the input to the core routine \\
\hline
\end{tabular}

Step 2 perform the core calculation:
\begin{tabular}{|c|c|}
\hline T12 \(\leftarrow\) \(c_{14}\) * CREG & CREG \(\leftarrow\) T11 \\
\hline T13 \(\leftarrow\) T12 + \(c_{13}\) & \\
\hline T14 \(\leftarrow\) T13 * CREG & \\
\hline T15 \(\leftarrow\) T14 + \(c_{12}\) & \\
\hline T16 \(\leftarrow\) T15 * CREG & \\
\hline T17 \(\leftarrow\) T16 + \(c_{11}\) & \\
\hline T18 \(\leftarrow\) T17 * CREG & \\
\hline T19 \(\leftarrow\) T18 + \(c_{10}\) & \\
\hline T20 \(\leftarrow\) T19 * CREG & \\
\hline T21 \(\leftarrow\) T20 + \(c_{9}\) & \\
\hline T22 \(\leftarrow\) T21 * CREG & \\
\hline T23 \(\leftarrow\) T22 + \(c_{8}\) & \\
\hline T24 \(\leftarrow\) T23 * CREG & \\
\hline T25 \(\leftarrow\) T24 + \(c_{7}\) & \\
\hline T26 \(\leftarrow\) T25 * CREG & \\
\hline T27 \(\leftarrow\) T26 + \(c_{6}\) & \\
\hline T28 \(\leftarrow\) T27 * CREG & \\
\hline T29 \(\leftarrow\) T28 + \(c_{5}\) & \\
\hline T30 \(\leftarrow\) T29 * CREG & \\
\hline T31 \(\leftarrow\) T30 + \(c_{4}\) & \\
\hline T32 \(\leftarrow\) T31 * CREG & \\
\hline T33 \(\leftarrow\) T32 + \(c_{3}\) & \\
\hline T34 \(\leftarrow\) T33 * CREG & \\
\hline T35 \(\leftarrow\) T34 + \(c_{2}\) & \\
\hline T36 \(\leftarrow\) T35 * CREG & \\
\hline T37 \(\leftarrow\) T36 + \(c_{1}\) & \\
\hline T38 \(\leftarrow\) T37 * CREG & \\
\hline T39 \(\leftarrow\) T38 + \(c_{0}\) & \\
\hline
\end{tabular}

Step 3 perform the postprocessing:

```
T40 ← T39 * T8
If X2 (in Step 1) > 1.0
    Then Tangent(X) ← 1.0/T40
    Else Tangent(X) ← T40
```

\section*{Required System Intervention}

As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The results of this compare determine which one of two calculations is to be performed. The system, in which the 'ACT8847 is a part, must make the decision as to which of the two calculations is to be performed. In addition, the system must store X3 and then later furnish X3 as an input to the 'ACT8847. Finally, the system will have to determine if it is necessary to take the reciprocal of the final product (T40 in the Algorithm for Step 3) to yield the answer. If it is necessary to take the reciprocal, then the system will be required to direct the variable T40 from the 'ACT8847's output bus to the input buses. This is because operands for division instructions must be provided by the RA and RB registers; feedback is not an option.

\section*{Number of 'ACT8847 Cycles Required to Calculate Tangent(x)}

Calculation of Tangent(x) requires 79 cycles. In addition, it is assumed that five additional cycles are required for system intervention due to the compare instruction. Therefore, the total number of cycles required to perform the Tangent(x) calculation is 84 .

\section*{Listing of the Chebyshev Constants (c's)}

The constants are represented in IEEE double-precision floating point format.
\[
\begin{aligned}
& c_{14}=\text{3D747D842210CC35} \\
& c_{13}=\text{3DA1D66636043991} \\
& c_{12}=\text{3DCCD078F52B3A73} \\
& c_{11}=\text{3DF938F9CDDFF864} \\
& c_{10}=\text{3E2620430E99B5B7} \\
& c_{9}=\text{3E535C2C953CE515} \\
& c_{8}=\text{3E80F07AFC099D7F} \\
& c_{7}=\text{3EADA4D789EB45C4} \\
& c_{6}=\text{3ED9F03D4C51A771} \\
& c_{5}=\text{3F06B236DE4D014C} \\
& c_{4}=\text{3F33DBFB01B3F415} \\
& c_{3}=\text{3F6160DE701F3A53} \\
& c_{2}=\text{3F8E70A18736FC10} \\
& c_{1}=\text{3FBAEA2653199611} \\
& c_{0}=\text{3FEC14B2675B10BA}
\end{aligned}
\]
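The tangent routine can be modeled in software in the same way, which gives a quick check that the fifteen constants were entered correctly. The Python sketch below is an illustration under stated assumptions, not the 'ACT8847 microcode: the hex strings are transcribed from the listing above, TRUNC is taken to truncate toward zero, the names `hex_to_double` and `chebyshev_tangent` are hypothetical, and only non-negative inputs away from the poles of the tangent are exercised.

```python
import math
import struct

# c14 ... c0, transcribed from the listing of Chebyshev constants.
TANGENT_CONSTANTS_HEX = [
    "3D747D842210CC35", "3DA1D66636043991", "3DCCD078F52B3A73",
    "3DF938F9CDDFF864", "3E2620430E99B5B7", "3E535C2C953CE515",
    "3E80F07AFC099D7F", "3EADA4D789EB45C4", "3ED9F03D4C51A771",
    "3F06B236DE4D014C", "3F33DBFB01B3F415", "3F6160DE701F3A53",
    "3F8E70A18736FC10", "3FBAEA2653199611", "3FEC14B2675B10BA",
]

def hex_to_double(hex_string):
    """Interpret 16 hex digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_string))[0]

TANGENT_COEFFS = [hex_to_double(h) for h in TANGENT_CONSTANTS_HEX]

def chebyshev_tangent(x):
    # Step 1: preprocessing (range reduction).
    x1 = x * (4.0 / math.pi)
    x2 = x1 - 4.0 * math.trunc(0.25 * (x1 + 1.0))
    take_reciprocal = x2 > 1.0
    x3 = 2.0 - x2 if take_reciprocal else x2
    x4 = 2.0 * (x3 * x3) - 1.0
    # Step 2: core calculation by Horner's rule, starting from c14.
    acc = TANGENT_COEFFS[0]
    for c in TANGENT_COEFFS[1:]:
        acc = acc * x4 + c
    # Step 3: postprocessing; the divide is what the system must route
    # back through the input buses on the real part.
    t40 = acc * x3
    return 1.0 / t40 if take_reciprocal else t40

for x in (0.3, 1.0, 2.0, 4.0):
    print(x, chebyshev_tangent(x), math.tan(x))
```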


\section*{Pseudocode Table for the Tangent(x) Calculation}

Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\))
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 1 & X MSH & X LSH & & & 0 & RA2*RB2 & & & & & & & \(X\) is the input \\
\hline 2 & 4DIVPI MSH & 4DIVPI LSH & X & 4DIVPI & 0 & RA2*RB2 & & & & & & & 4DIVPI is a constant representing 4.0/pi \\
\hline 3 & & & X & 4DIVPI & 0 & PR4 + RB4 & RA2*RB2 & & & & & & \\
\hline 4 & 1.0 MSH & 1.0 LSH & X & 1.0 & 0 & PR4 + RB4 & & & P1 & & & & \\
\hline 5 & 0.25 MSH & 0.25 LSH & X & 0.25 & 1 & SR5*RB5 & & & & P1 & S1 & & \\
\hline 6 & 1.0 MSH & 1.0 LSH & X & 0.25 & 0 & DP2I(PR7) & SR5*RB5 & & & & & & Double precision \(\rightarrow\) integer \\
\hline 7 & & & 1.0 & 0.25 & 0 & DP2I(PR7) & & & P2 & & & & Cycles 6, 7 set RND1-0 \(=01\) \\
\hline 8 & & 4 & 1.0 & 4 & 0 & SR8*RB8 & & & & & S2 & & \\
\hline 9 & & & 1.0 & 4 & 1 & I2DP(PR9) & & & P3 & & & & Integer \(\rightarrow\) double precision \\
\hline 10 & & & 1.0 & 4 & 1 & CR10 - SR10 & & & & & S3 & & \\
\hline 11 & & & 1.0 & 4 & 1 & COMPARE RA11, SR11 & & & & & S4 & & If SR11 \(>\) RA11 then 13a; if SR11 \(\leq\) RA11 then 13b \\
\hline 12 & & & 1.0 & 4 & 0 & NOP & & & & S4 & & & Wait for system response \\
\hline 13a & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 & 1 & RB13 - CR13 & & & & & & & Execute 13a or 13b \\
\hline 13b & & & 1.0 & 4 & 1 & PASS(CR13) & & & & & & & Pass contents of CREG \\
\hline 14 & & & 1.0 & 2.0 or 4 & 1 & CR14*CR14 & & & & & S5 & S5 & S5 is either RB13 - CR13 or CR13 from PASS CR13, and must be stored externally for use in cycle 61 \\
\hline 15 & 2.0 MSH & 2.0 LSH & 1.0 & 2.0 or 4 & 0 & RA16*PR16 & CR14*CR14 & & & S5 & & S5 & Output S5 in cycles 14 and 15 \\
\hline 16 & & & 2.0 & 2.0 or 4 & 0 & RA16*PR16 & & & P4 & & & & \\
\hline 17 & & & 2.0 & 2.0 or 4 & 0 & PR18 + RB18 & RA16*PR16 & & & & & & \\
\hline 18 & -1.0 MSH & -1.0 LSH & 2.0 & -1.0 & 0 & PR18 + RB18 & & & P5 & & & & \\
\hline
\end{tabular}

Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\)) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 19 & \(c_{14}\) MSH & \(c_{14}\) LSH & 2.0 & \(c_{14}\) & 1 & SR19*RB19 & & & & & S6 & & Start core calculation \\
\hline 20 & & & 2.0 & \(c_{14}\) & 0 & PR21 + RB21 & SR19*RB19 & & & S6 & & & S6 is input to core calc. \\
\hline 21 & \(c_{13}\) MSH & \(c_{13}\) LSH & 2.0 & \(c_{13}\) & 0 & PR21 + RB21 & & & P6 & & & & \\
\hline 22 & & & 2.0 & \(c_{13}\) & 1 & SR22*CR22 & & & & & S7 & & \\
\hline 23 & & & 2.0 & \(c_{13}\) & 0 & PR24 + RB24 & SR22*CR22 & & & & & & \\
\hline 24 & \(c_{12}\) MSH & \(c_{12}\) LSH & 2.0 & \(c_{12}\) & 0 & PR24 + RB24 & & & P7 & & & & \\
\hline 25 & & & 2.0 & \(c_{12}\) & 1 & SR25*CR25 & & & & & S8 & & \\
\hline 26 & & & 2.0 & \(c_{12}\) & 0 & PR27 + RB27 & SR25*CR25 & & & & & & \\
\hline 27 & \(c_{11}\) MSH & \(c_{11}\) LSH & 2.0 & \(c_{11}\) & 0 & PR27 + RB27 & & & P8 & & & & \\
\hline 28 & & & 2.0 & \(c_{11}\) & 1 & SR28*CR28 & & & & & S9 & & \\
\hline 29 & & & 2.0 & \(c_{11}\) & 0 & PR30 + RB30 & SR28*CR28 & & & & & & \\
\hline 30 & \(c_{10}\) MSH & \(c_{10}\) LSH & 2.0 & \(c_{10}\) & 0 & PR30 + RB30 & & & P9 & & & & \\
\hline 31 & & & 2.0 & \(c_{10}\) & 1 & SR31*CR31 & & & & & S10 & & \\
\hline 32 & & & 2.0 & \(c_{10}\) & 0 & PR33 + RB33 & SR31*CR31 & & & & & & \\
\hline 33 & \(c_{9}\) MSH & \(c_{9}\) LSH & 2.0 & \(c_{9}\) & 0 & PR33 + RB33 & & & P10 & & & & \\
\hline 34 & & & 2.0 & \(c_{9}\) & 1 & SR34*CR34 & & & & & S11 & & \\
\hline 35 & & & 2.0 & \(c_{9}\) & 0 & PR36 + RB36 & SR34*CR34 & & & & & & \\
\hline 36 & \(c_{8}\) MSH & \(c_{8}\) LSH & 2.0 & \(c_{8}\) & 0 & PR36 + RB36 & & & P11 & & & & \\
\hline 37 & & & 2.0 & \(c_{8}\) & 1 & SR37*CR37 & & & & & S12 & & \\
\hline 38 & & & 2.0 & \(c_{8}\) & 0 & PR39 + RB39 & SR37*CR37 & & & & & & \\
\hline 39 & \(c_{7}\) MSH & \(c_{7}\) LSH & 2.0 & \(c_{7}\) & 0 & PR39 + RB39 & & & P12 & & & & \\
\hline 40 & & & 2.0 & \(c_{7}\) & 1 & SR40*CR40 & & & & & S13 & & \\
\hline 41 & & & 2.0 & \(c_{7}\) & 0 & PR42 + RB42 & SR40*CR40 & & & & & & \\
\hline 42 & \(c_{6}\) MSH & \(c_{6}\) LSH & 2.0 & \(c_{6}\) & 0 & PR42 + RB42 & & & P13 & & & & \\
\hline
\end{tabular}


Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 = 010, RND1-0 = 0) (Concluded)
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 43 & & & 2.0 & \(\mathrm{c}_{6}\) & 1 & SR43*CR43 & & & & & & & \\
\hline
\end{tabular}

Table 58. Pseudocode for Chebyshev Tangent Routine (PIPES2-0 = 010, RND1-0 = 0) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 63 & & & 2.0 & S5 & 0 & NOP & & & P20 & & & P20 & Output MSH. If cycle 13b was executed, P20 is the answer; if cycle 13a was executed, the answer is \(1.0 / \mathrm{P20}\), which is calculated next \\
\hline 64 & 1.0 MSH & 1.0 LSH & 2.0 & S5 & 0 & DIV & & & & & & P20 & Output LSH \\
\hline 65 & P20 MSH & P20 LSH & 1.0 & P20 & 0 & DIV & & & & & & & Operands for division must come from RA and RB; feedback is not an option \\
\hline 66 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 67 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 68 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 69 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 70 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 71 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 72 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 73 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 74 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 75 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 76 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 77 & & & 1.0 & P20 & 0 & NOP & & & & & & & Wait for Division result \\
\hline 78 & & & 1.0 & P20 & 0 & NOP & & & P21 & & & P21 & Output MSH of answer \\
\hline 79 & & & 1.0 & P20 & 0 & NOP & & & P21 & & & P21 & Output LSH of answer \\
\hline
\end{tabular}


\section*{Microcode Table for the Tangent(x) Calculation}

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode table, the value of \(X\) has been chosen to be pi/3.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline P & D & D & P & E E C & & & & S & \(\overline{\mathrm{R}}\) & & & F & & & & & & & & & & & \\
\hline A & A & B & B & N N L & 1 & L & 0 & E & E & A & N & L & \(N\) & N & ，A & R & R & & & & & & \\
\hline & & & & A B K & P & K & N & L & S & L & C & 0 & S & D & S & C & C & & & & & & C \\
\hline & & & & C & E & M & F & 0 & E & T & & W & T & & & & C & & & & & & \\
\hline & & & & & S & \[
\begin{aligned}
& \mathrm{O} \\
& \mathrm{D}
\end{aligned}
\] & & P & T & & & C & R & & & & & P & T & & & & \\
\hline F & 3FF0C152 & 382D7365 & F & 00 & 2 & 0 & 3 & FF & 1 & 1 & ， & 0 & 1C0 & 0 & 0 & 0 & 0 & 0 & 3 & & 1 & 0 & 0 \\
\hline F & 3FF45F30 & 6DC9C883 & F & 11 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 00 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 3FF00000 & 00000000 & F & 01 & 2 & 0 & 3 & FB & & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 3FD00000 & 00000000 & F & 015 & 2 & 1 & 3 & BF & 1 & 1 & 0 & 0 & 1C0 & 0 & 0 & － 1 & 10 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 3FF00000 & 00000000 & F & 00 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 1A3 & 1 & 0 & & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 10 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 1A3 & 1 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 00000000 & 00000004 & F & 01 & 2 & 0 & 1 & BF & 1 & 1 & 1 & 0 & 240 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 00 & 2 & 1 & 3 & FB & 1 & 1 & 1 & 0 & 1A2 & 0 & & & 0 & 0 & 3 & 3 & 1 & 0 & \\
\hline F & 00000000 & 00000000 & F & 00 & 2 & 1 & 3 & F6 & 1 & 1 & 1 & 0 & 181 & 0 & 0 & 0 & 0 & － & 3 & 3 & 1 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 00 & 2 & 1 & 3 & FE & 1 & 1 & 1 & 0 & 182 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 0 5 & 2 & 0 & 3 & FF & 1 & 1 & 0 & 0 & 300 & 0 & & & 0 & 0 & 3 & 3 & 1 & 0 & \\
\hline F & 40000000 & 00000000 & F & 01 & 2 & 1 & 3 & F7 & & 1 & 1 & 0 & 183 & 0 & 0 & 0 & 0 & & 3 & 3 & & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 00 & 2 & 1 & 3 & 5F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 40000000 & 00000000 & F & 005 & 2 & 0 & 3 & EF & 1 & 1 & 0 & 0 & 1C0 & 0 & & & 0 & 0 & 3 & 3 & 1 & 0 & \\
\hline F & 00000000 & 00000000 & F & 10 & 2 & 0 & 3 & EF & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 00 & 2 & 0 & 3 & FB & 1 & & 1 & 0 & 180 & 0 & 0 & & 0 & 0 & 3 & 3 & & 0 & \\
\hline F & BFF00000 & 00000000 & F & 01 & 2 & 0 & 3 & FB & 1 & 1 & & 0 & 180 & 0 & & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline F & 3D747D84 & 2210CC35 & F & 01 & 2 & 1 & 3 & BF & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline & 00000000 & 00000000 & F & 005 & 2 & 0 & 3 & FB & 1 & 1 & 0 & 0 & 180 & 0 & & & & & & & & & \\
\hline & 3DA1D666 & 36043991 & F & 01 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 0 & 3 & & & 0 & \\
\hline
\end{tabular}
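The DA and DB columns hold the most significant and least significant 32-bit halves of each double-precision operand. The sketch below is a host-side illustration, not part of the 'ACT8847 microcode; the helper name `words_to_double` is hypothetical. It joins the two hex words from a table row and decodes them as an IEEE double, which makes it easy to check that the first row holds the chosen input pi/3. The second row decodes to a value numerically close to 4/pi; its role is not stated in this section.

```python
import math
import struct


def words_to_double(msh_hex: str, lsh_hex: str) -> float:
    """Join two 32-bit hex words (MSH first) into one IEEE 754 double."""
    raw = bytes.fromhex(msh_hex + lsh_hex)  # 8 bytes in big-endian order
    return struct.unpack(">d", raw)[0]


# DA/DB values copied from the first two rows of the microcode table.
x_input = words_to_double("3FF0C152", "382D7365")
second = words_to_double("3FF45F30", "6DC9C883")

print(x_input, math.pi / 3)
print(second, 4 / math.pi)
```

The same helper applies to any other DA/DB pair in the table, for example `words_to_double("3FF00000", "00000000")` for the constant 1.0.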

\section*{Microcode Table for the Tangent(x) Calculation (Continued)}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & PA & DA & DB & PB & ENA & ENB & CLKC & PIPES & CLKMOD & CONFIG & SELOP & \(\overline{\mathrm{RESET}}\) & \(\overline{\mathrm{HALT}}\) & ENC & FLOWC & INSTR & RND & FAST & SRCC & BYTEP & SELST & TEST & LY & & \(\overline{\mathrm{OES}}\) \(\overline{\mathrm{OEC}}\) \\
\hline & F & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 3DCCD078 & F52B3A73 & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & & & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline & F & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 3DF938F9 & CDDFF864 & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 3E262043 & 0E99B5B7 & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 3E535C2C & 953CE515 & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & 0 \\
\hline & & 3E80F07A & FC099D7F & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & , & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & 0 \\
\hline & F & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & 0 \\
\hline & & 3EADA4D7 & 89EB45C4 & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & & & 0 \\
\hline & F & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & , & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & 0 & 0 \\
\hline & F & 3ED9F03D & 4C51A771 & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & & 0 \\
\hline & & 00000000 & 00000000 & F & 0 & 0 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & 0 \\
\hline & & 3F06B236 & DE4D014C & F & 0 & 1 & & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 10 & & \\
\hline
\end{tabular}

Microcode Table for the Tangent(x) Calculation (Continued)


\section*{Microcode Table for the Tangent(x) Calculation (Concluded)}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline P & D & D & & & E C & P & & & S & \(\overline{\mathrm{R}}\) & \(\bar{H}\) & E & & & & & S & B & S & & & & & \\
\hline A & A & B & B & N & N L & 1 & L & 0 & E & E & A & N & L & N & N & & R & Y & \[
E
\] &  & & & E & \\
\hline & & & & A & B K & P & K & N & L & S & L & C & 0 & S & D & & C & T & &  & & & S & \\
\hline & & & & & C & E & M & F & 0 & E & T & & w & T & & & C & E & S & T & & & & \\
\hline & & & & & & S & \[
\begin{aligned}
& \mathrm{O} \\
& \mathrm{D}
\end{aligned}
\] & \[
\begin{aligned}
& \mathrm{I} \\
& \mathrm{G}
\end{aligned}
\] & P & T & & & C & R & & & & P & T & & & & & \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & & 0 & 0 & 3 & 3 & & 0 & 0 & 0 \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & , & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & & 0 & 0 & 0 \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 0 & 0 & 0 & \\
\hline
\end{tabular}

\section*{ArcSine \& ArcCosine Routine Using Chebyshev's Method}

All floating point inputs and outputs are double precision. The output is in radians.

\section*{Steps Required to Perform the Calculation}

STEP 1 - Preprocessing; range reduction is not needed, because an input, \(X\), outside the range of \([-1,1]\) indicates an error. This routine requires that \(X^{2}\) be less than or equal to \(1/2\). The first operation to be performed is to square \(X\), then multiply it by 4.0, and finally subtract 1.0.
\[
X1 \leftarrow X * X * 4.0 - 1.0
\]
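
The requirement on \(X^{2}\) can be read as the condition that keeps the core input inside \([-1,1]\). A short check, using only the definition of X1 above:

\[
0 \leq X^{2} \leq \tfrac{1}{2} \;\Longrightarrow\; -1 \leq 4 X^{2}-1 \leq 1
\]

For example, \(X=0.5\) gives \(X1 = 4(0.25)-1 = 0\), the midpoint of that interval.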

STEP 2 - Core Calculation; X1 in Step 1 will be referred to as '\(x\)' in the core calculation.
\[
\begin{aligned}
\mathrm{X2} \leftarrow{} & \mathrm{C}_{\text{series\_asin\&acos}} \\
\leftarrow{} & (((((((((((((((((c_{18} * x+c_{17}) * x+c_{16}) * x+c_{15}) * x+c_{14}) * x+c_{13}) * x \\
& +c_{12}) * x+c_{11}) * x+c_{10}) * x+c_{9}) * x+c_{8}) * x+c_{7}) * x+c_{6}) * x \\
& +c_{5}) * x+c_{4}) * x+c_{3}) * x+c_{2}) * x+c_{1}) * x+c_{0}
\end{aligned}
\]

STEP 3 - Postprocessing; multiply the output of the core calculation by SQRT(2.0), then multiply this product by \(X\), the original input. This yields ArcSine(X). To calculate ArcCosine(X), the following identity is used:
\[
\operatorname{ArcCosine}(X)=\mathrm{pi} / 2-\operatorname{ArcSine}(X)
\]
\[
X 3 \leftarrow X 2 * \operatorname{SQRT}(2.0)
\]

ArcSine \((X) \leftarrow X 3 * X\)
\(\operatorname{ArcCosine}(X) \leftarrow \mathrm{pi} / 2-\operatorname{ArcSine}(X)\)

\section*{Algorithms for the Three Steps}

Step 1 perform the preprocessing:

T1 \(\leftarrow \mathrm{X} * \mathrm{X}\)
T2 \(\leftarrow 4.0 * \mathrm{T1}\)
T3 \(\leftarrow \mathrm{T2} - 1.0\)

T3 is X1 in Step 1, the input to the core routine

Step 2 perform the core calculation:
\[
\begin{aligned}
& \mathrm{T4} \leftarrow \mathrm{c}_{18} * \mathrm{CREG} \\
& \mathrm{T5} \leftarrow \mathrm{T4}+\mathrm{c}_{17} \\
& \mathrm{T6} \leftarrow \mathrm{T5} * \mathrm{CREG} \\
& \mathrm{T7} \leftarrow \mathrm{T6}+\mathrm{c}_{16} \\
& \mathrm{T8} \leftarrow \mathrm{T7} * \mathrm{CREG} \\
& \mathrm{T9} \leftarrow \mathrm{T8}+\mathrm{c}_{15} \\
& \mathrm{T10} \leftarrow \mathrm{T9} * \mathrm{CREG} \\
& \mathrm{T11} \leftarrow \mathrm{T10}+\mathrm{c}_{14} \\
& \mathrm{T12} \leftarrow \mathrm{T11} * \mathrm{CREG} \\
& \mathrm{T13} \leftarrow \mathrm{T12}+\mathrm{c}_{13} \\
& \mathrm{T14} \leftarrow \mathrm{T13} * \mathrm{CREG} \\
& \mathrm{T15} \leftarrow \mathrm{T14}+\mathrm{c}_{12} \\
& \mathrm{T16} \leftarrow \mathrm{T15} * \mathrm{CREG} \\
& \mathrm{T17} \leftarrow \mathrm{T16}+\mathrm{c}_{11} \\
& \mathrm{T18} \leftarrow \mathrm{T17} * \mathrm{CREG} \\
& \mathrm{T19} \leftarrow \mathrm{T18}+\mathrm{c}_{10} \\
& \mathrm{T20} \leftarrow \mathrm{T19} * \mathrm{CREG} \\
& \mathrm{T21} \leftarrow \mathrm{T20}+\mathrm{c}_{9} \\
& \mathrm{T22} \leftarrow \mathrm{T21} * \mathrm{CREG} \\
& \mathrm{T23} \leftarrow \mathrm{T22}+\mathrm{c}_{8} \\
& \mathrm{T24} \leftarrow \mathrm{T23} * \mathrm{CREG} \\
& \mathrm{T25} \leftarrow \mathrm{T24}+\mathrm{c}_{7} \\
& \mathrm{T26} \leftarrow \mathrm{T25} * \mathrm{CREG} \\
& \mathrm{T27} \leftarrow \mathrm{T26}+\mathrm{c}_{6} \\
& \mathrm{T28} \leftarrow \mathrm{T27} * \mathrm{CREG} \\
& \mathrm{T29} \leftarrow \mathrm{T28}+\mathrm{c}_{5} \\
& \mathrm{T30} \leftarrow \mathrm{T29} * \mathrm{CREG} \\
& \mathrm{T31} \leftarrow \mathrm{T30}+\mathrm{c}_{4} \\
& \mathrm{T32} \leftarrow \mathrm{T31} * \mathrm{CREG} \\
& \mathrm{T33} \leftarrow \mathrm{T32}+\mathrm{c}_{3} \\
& \mathrm{T34} \leftarrow \mathrm{T33} * \mathrm{CREG} \\
& \mathrm{T35} \leftarrow \mathrm{T34}+\mathrm{c}_{2} \\
& \mathrm{T36} \leftarrow \mathrm{T35} * \mathrm{CREG} \\
& \mathrm{T37} \leftarrow \mathrm{T36}+\mathrm{c}_{1} \\
& \mathrm{T38} \leftarrow \mathrm{T37} * \mathrm{CREG} \\
& \mathrm{T39} \leftarrow \mathrm{T38}+\mathrm{c}_{0}
\end{aligned}
\]

Step 3 perform the postprocessing:
T40 \(\leftarrow \mathrm{X} *\) T39
ArcSine (X) \(\leftarrow\) T40*SQRT(2.0)
SQRT(2.0) entered as a constant

\section*{Required System Intervention}

There is no system intervention required to calculate \(\operatorname{ArcSine}(X)\) and \(\operatorname{ArcCosine}(X)\).

\section*{Number of 'ACT8847 Cycles Required to Calculate ArcSine(x) and ArcCosine(x)}

The total number of cycles required to perform the \(\operatorname{ArcSine}(x)\) and \(\operatorname{ArcCosine}(x)\) calculation is 68.
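
Reading Table 59, the 68 cycles appear to break down into 6 preprocessing cycles (1 to 6), 18 multiply-add steps of 3 cycles each for the core calculation (7 to 60), and 8 cycles of postprocessing and output (61 to 68). This tally is an interpretation of the table, not a statement from the manual:

\[
6+18 \times 3+8=68
\]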

\section*{Listing of the Chebyshev Constants (c's)}

The constants are represented in IEEE double-precision floating point format.
\[
\begin{aligned}
& \mathrm{c}_{18}=\text{3DA4A49F8CCD9E73} \\
& \mathrm{c}_{17}=\text{3DC05DFE52AAD200} \\
& \mathrm{c}_{16}=\text{3DCCF31E26F94C8D} \\
& \mathrm{c}_{15}=\text{3DE86CDA3C8CAEB0} \\
& \mathrm{c}_{14}=\text{3E0768D9F4E950EA} \\
& \mathrm{c}_{13}=\text{3E2383A37598FC80} \\
& \mathrm{c}_{12}=\text{3E403E4B2F65F0DE} \\
& \mathrm{c}_{11}=\text{3E5BAFC8245ABDF8} \\
& \mathrm{c}_{10}=\text{3E77E3333AFF1AB4} \\
& \mathrm{c}_{9}=\text{3E94E3A4D4220C9C} \\
& \mathrm{c}_{8}=\text{3EB296DD4C084ACB} \\
& \mathrm{c}_{7}=\text{3ED0E913F5F9D496} \\
& \mathrm{c}_{6}=\text{3EEFA74E896F8FA8} \\
& \mathrm{c}_{5}=\text{3F0EC76B7832DBB6} \\
& \mathrm{c}_{4}=\text{3F2F978698C8B2E4} \\
& \mathrm{c}_{3}=\text{3F519B1087542073} \\
& \mathrm{c}_{2}=\text{3F7696895FFC05A0} \\
& \mathrm{c}_{1}=\text{3FA375CA61D2988C} \\
& \mathrm{c}_{0}=\text{3FE7B20423D1D930}
\end{aligned}
\]
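
The three steps and the constant listing can be exercised on a host computer. The sketch below is a software model only, not the 'ACT8847 microcode: it decodes the hex constants as IEEE doubles, applies the preprocessing, evaluates the series in the nested form shown in Step 2, and applies the postprocessing. The function names and the `ValueError` for inputs with \(X^{2} > 1/2\) are assumptions made for illustration.

```python
import math
import struct

# Constants c18 .. c0 from the listing above (IEEE 754 double, hex).
C_HEX = [
    "3DA4A49F8CCD9E73", "3DC05DFE52AAD200", "3DCCF31E26F94C8D",
    "3DE86CDA3C8CAEB0", "3E0768D9F4E950EA", "3E2383A37598FC80",
    "3E403E4B2F65F0DE", "3E5BAFC8245ABDF8", "3E77E3333AFF1AB4",
    "3E94E3A4D4220C9C", "3EB296DD4C084ACB", "3ED0E913F5F9D496",
    "3EEFA74E896F8FA8", "3F0EC76B7832DBB6", "3F2F978698C8B2E4",
    "3F519B1087542073", "3F7696895FFC05A0", "3FA375CA61D2988C",
    "3FE7B20423D1D930",
]


def hex_to_double(h: str) -> float:
    """Decode 16 hex digits as a big-endian IEEE 754 double."""
    return struct.unpack(">d", bytes.fromhex(h))[0]


COEFFS = [hex_to_double(h) for h in C_HEX]  # c18 first, c0 last


def arcsine(x_in: float) -> float:
    """Model of Steps 1 to 3; valid only for x_in * x_in <= 1/2."""
    if x_in * x_in > 0.5:
        raise ValueError("routine requires X*X <= 1/2")  # illustrative guard
    x = x_in * x_in * 4.0 - 1.0          # Step 1: X1
    acc = COEFFS[0]                      # Step 2: nested multiply-add
    for c in COEFFS[1:]:
        acc = acc * x + c
    return acc * math.sqrt(2.0) * x_in   # Step 3: X3 * X


def arccosine(x_in: float) -> float:
    """ArcCosine(X) = pi/2 - ArcSine(X)."""
    return math.pi / 2 - arcsine(x_in)


for value in (0.25, 0.5, -0.6):
    print(value, arcsine(value), math.asin(value))
```

Comparing the printed pairs against `math.asin` is a quick way to catch a mistyped constant.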

\section*{Pseudocode Table for the ArcSine(x) and ArcCosine(x) Calculation}

Table 59. Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) )
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 1 & X MSH & X LSH & & & 0 & RA2*RB2 & & & & & & & X is the input \\
\hline 2 & X MSH & X LSH & X & X & 0 & RA2*RB2 & & & & & & & \\
\hline 3 & 4.0 MSH & 4.0 LSH & X & X & 0 & RA4*PR4 & RA2*RB2 & & & & & & \\
\hline 4 & & & 4.0 & X & 0 & RA4*PR4 & & & P1 & & & & \\
\hline 5 & & & 4.0 & X & 0 & PR6 + RB6 & RA4*PR4 & & & & & & \\
\hline 6 & -1.0 MSH & -1.0 LSH & 4.0 & -1.0 & 0 & PR6 + RB6 & & & P2 & & & & \\
\hline 7 & \(\mathrm{c}_{18} \mathrm{MSH}\) & \(\mathrm{c}_{18}\) LSH & 4.0 & \(\mathrm{c}_{18}\) & 1 & SR7*RB7 & & & & & S1 & & Start core calculation \\
\hline 8 & & & 4.0 & \(\mathrm{c}_{18}\) & 0 & PR9 + RB9 & SR7*RB7 & & & S1 & & & S1 is input to core calc. \\
\hline 9 & \(\mathrm{c}_{17} \mathrm{MSH}\) & \(\mathrm{c}_{17} \mathrm{LSH}\) & 4.0 & \(\mathrm{c}_{17}\) & 0 & PR9 + RB9 & & & P3 & & & & \\
\hline 10 & & & 4.0 & \(\mathrm{c}_{17}\) & 1 & SR10*CR10 & & & & & S2 & & \\
\hline 11 & & & 4.0 & \(\mathrm{c}_{17}\) & 0 & PR12 + RB12 & SR10*CR10 & & & & & & \\
\hline 12 & \(\mathrm{c}_{16} \mathrm{MSH}\) & \(\mathrm{c}_{16} \mathrm{LSH}\) & 4.0 & \(\mathrm{c}_{16}\) & 0 & PR12 + RB12 & & & P4 & & & & \\
\hline 13 & & & 4.0 & \(\mathrm{c}_{16}\) & 1 & SR13*CR13 & & & & & S3 & & \\
\hline 14 & & & 4.0 & \(\mathrm{c}_{16}\) & 0 & PR15 + RB15 & SR13*CR13 & & & & & & \\
\hline 15 & \(\mathrm{c}_{15} \mathrm{MSH}\) & \(\mathrm{c}_{15}\) LSH & 4.0 & \(\mathrm{c}_{15}\) & 0 & PR15 + RB15 & & & P5 & & & & \\
\hline 16 & & & 4.0 & \(\mathrm{c}_{15}\) & 1 & SR16*CR16 & & & & & S4 & & \\
\hline 17 & & & 4.0 & \(\mathrm{c}_{15}\) & 0 & PR18 + RB18 & SR16*CR16 & & & & & & \\
\hline 18 & \(\mathrm{c}_{14} \mathrm{MSH}\) & \(\mathrm{c}_{14}\) LSH & 4.0 & \(\mathrm{c}_{14}\) & 0 & PR18 + RB18 & & & P6 & & & & \\
\hline 19 & & & 4.0 & \(\mathrm{c}_{14}\) & 1 & SR19*CR19 & & & & & S5 & & \\
\hline 20 & & & 4.0 & \(\mathrm{c}_{14}\) & 0 & PR21 + RB21 & SR19*CR19 & & & & & & \\
\hline 21 & \(\mathrm{c}_{13} \mathrm{MSH}\) & \(\mathrm{c}_{13} \mathrm{LSH}\) & 4.0 & \(\mathrm{c}_{13}\) & 0 & PR21 + RB21 & & & P7 & & & & \\
\hline 22 & & & 4.0 & \(\mathrm{c}_{13}\) & 1 & SR22*CR22 & & & & & S6 & & \\
\hline 23 & & & 4.0 & \(\mathrm{c}_{13}\) & 0 & PR24 + RB24 & SR22*CR22 & & & & & & \\
\hline
\end{tabular}


Table 59. Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-0 = 010, RND1-0 = 00) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 24 & \(\mathrm{c}_{12} \mathrm{MSH}\) & \(\mathrm{c}_{12}\) LSH & 4.0 & \(\mathrm{c}_{12}\) & 0 & PR24 + RB24 & & & P8 & & & & \\
\hline 25 & & & 4.0 & \(\mathrm{c}_{12}\) & 1 & SR25*CR25 & & & & & S7 & & \\
\hline 26 & & & 4.0 & \(\mathrm{c}_{12}\) & 0 & PR27 + RB27 & SR25*CR25 & & & & & & \\
\hline 27 & \(\mathrm{c}_{11} \mathrm{MSH}\) & \(\mathrm{c}_{11}\) LSH & 4.0 & \(\mathrm{c}_{11}\) & 0 & PR27 + RB27 & & & P9 & & & & \\
\hline 28 & & & 4.0 & \(\mathrm{c}_{11}\) & 1 & SR28*CR28 & & & & & S8 & & \\
\hline 29 & & & 4.0 & \(\mathrm{c}_{11}\) & 0 & PR30 + RB30 & SR28*CR28 & & & & & & \\
\hline 30 & \(\mathrm{c}_{10} \mathrm{MSH}\) & \(\mathrm{c}_{10}\) LSH & 4.0 & \(\mathrm{c}_{10}\) & 0 & PR30 + RB30 & & & P10 & & & & \\
\hline 31 & & & 4.0 & \(\mathrm{c}_{10}\) & 1 & SR31*CR31 & & & & & S9 & & \\
\hline 32 & & & 4.0 & \(\mathrm{c}_{10}\) & 0 & PR33 + RB33 & SR31*CR31 & & & & & & \\
\hline 33 & \(\mathrm{c}_{9} \mathrm{MSH}\) & \(\mathrm{c}_{9}\) LSH & 4.0 & \(\mathrm{c}_{9}\) & 0 & PR33 + RB33 & & & P11 & & & & \\
\hline 34 & & & 4.0 & \(\mathrm{c}_{9}\) & 1 & SR34*CR34 & & & & & S10 & & \\
\hline 35 & & & 4.0 & \(\mathrm{c}_{9}\) & 0 & PR36 + RB36 & SR34*CR34 & & & & & & \\
\hline 36 & \(\mathrm{c}_{8} \mathrm{MSH}\) & \(\mathrm{c}_{8}\) LSH & 4.0 & \(\mathrm{c}_{8}\) & 0 & PR36 + RB36 & & & P12 & & & & \\
\hline 37 & & & 4.0 & \(\mathrm{c}_{8}\) & 1 & SR37*CR37 & & & & & S11 & & \\
\hline 38 & & & 4.0 & \(\mathrm{c}_{8}\) & 0 & PR39 + RB39 & SR37*CR37 & & & & & & \\
\hline 39 & \(\mathrm{c}_{7} \mathrm{MSH}\) & \(\mathrm{c}_{7}\) LSH & 4.0 & \(\mathrm{c}_{7}\) & 0 & PR39 + RB39 & & & P13 & & & & \\
\hline 40 & & & 4.0 & \(\mathrm{c}_{7}\) & 1 & SR40*CR40 & & & & & S12 & & \\
\hline 41 & & & 4.0 & \(\mathrm{c}_{7}\) & 0 & PR42 + RB42 & SR40*CR40 & & & & & & \\
\hline 42 & \(\mathrm{c}_{6} \mathrm{MSH}\) & \(\mathrm{c}_{6}\) LSH & 4.0 & \(\mathrm{c}_{6}\) & 0 & PR42 + RB42 & & & P14 & & & & \\
\hline 43 & & & 4.0 & \(\mathrm{c}_{6}\) & 1 & SR43*CR43 & & & & & S13 & & \\
\hline 44 & & & 4.0 & \(\mathrm{c}_{6}\) & 0 & PR45 + RB45 & SR43*CR43 & & & & & & \\
\hline 45 & \(\mathrm{c}_{5} \mathrm{MSH}\) & \(\mathrm{c}_{5}\) LSH & 4.0 & \(\mathrm{c}_{5}\) & 0 & PR45 + RB45 & & & P15 & & & & \\
\hline 46 & & & 4.0 & \(\mathrm{c}_{5}\) & 1 & SR46*CR46 & & & & & S14 & & \\
\hline 47 & & & 4.0 & \(\mathrm{c}_{5}\) & 0 & PR48 + RB48 & SR46*CR46 & & & & & & \\
\hline 48 & \(\mathrm{c}_{4} \mathrm{MSH}\) & \(\mathrm{c}_{4}\) LSH & 4.0 & \(\mathrm{c}_{4}\) & 0 & PR48 + RB48 & & & P16 & & & & \\
\hline
\end{tabular}

Table 59. Pseudocode for Chebyshev ArcSine and ArcCosine Routine (PIPES2-0 =010, RND1-0 =00) (Concluded)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 49 & & & 4.0 & \(\mathrm{c}_{4}\) & 1 & SR49*CR49 & & & & & S15 & & \\
\hline 50 & & & 4.0 & \(\mathrm{c}_{4}\) & 0 & PR51 + RB51 & SR49*CR49 & & & & & & \\
\hline 51 & \(\mathrm{c}_{3} \mathrm{MSH}\) & \(\mathrm{c}_{3}\) LSH & 4.0 & \(\mathrm{c}_{3}\) & 0 & PR51 + RB51 & & & P17 & & & & \\
\hline 52 & & & 4.0 & \(\mathrm{c}_{3}\) & 1 & SR52*CR52 & & & & & S16 & & \\
\hline 53 & & & 4.0 & \(\mathrm{c}_{3}\) & 0 & PR54 + RB54 & SR52*CR52 & & & & & & \\
\hline 54 & \(\mathrm{c}_{2} \mathrm{MSH}\) & \(\mathrm{c}_{2}\) LSH & 4.0 & \(\mathrm{c}_{2}\) & 0 & PR54 + RB54 & & & P18 & & & & \\
\hline 55 & & & 4.0 & \(\mathrm{c}_{2}\) & 1 & SR55*CR55 & & & & & S17 & & \\
\hline 56 & & & 4.0 & \(\mathrm{c}_{2}\) & 0 & PR57 + RB57 & SR55*CR55 & & & & & & \\
\hline 57 & \(\mathrm{c}_{1} \mathrm{MSH}\) & \(\mathrm{c}_{1}\) LSH & 4.0 & \(\mathrm{c}_{1}\) & 0 & PR57 + RB57 & & & P19 & & & & \\
\hline 58 & & & 4.0 & \(\mathrm{c}_{1}\) & 1 & SR58*CR58 & & & & & S18 & & \\
\hline 59 & & & 4.0 & \(\mathrm{c}_{1}\) & 0 & PR60 + RB60 & SR58*CR58 & & & & & & \\
\hline 60 & \(\mathrm{c}_{0} \mathrm{MSH}\) & \(\mathrm{c}_{0}\) LSH & 4.0 & \(\mathrm{c}_{0}\) & 0 & PR60 + RB60 & & & P20 & & & & \\
\hline 61 & X MSH & X LSH & 4.0 & X & 1 & SR61*RB61 & & & & & S19 & & Begin postprocessing \\
\hline 62 & SQRT(2) MSH & SQRT(2) LSH & 4.0 & X & 0 & RA63*PR63 & SR61*RB61 & & & & & & SQRT(2) is the real value of square root of 2.0 \\
\hline 63 & & & SQRT(2) & X & 0 & RA63*PR63 & & & P21 & & & & \\
\hline 64 & & & SQRT(2) & X & 0 & DUMMY & RA63*PR63 & & & & & & Instruction is double-precision RA + RB, prevents ArcCosine from overwriting ArcSine result \\
\hline 66 & pi/2 MSH & pi/2 LSH & SQRT(2) & pi/2 & 1 & RB66-PR66 & & & P22 & & & P22 & Output LSH of ArcSine \\
\hline 67 & & & SQRT(2) & pi/2 & 0 & NOP & & & & & S20 & S20 & Output MSH of ArcCosine \\
\hline 68 & & & SQRT(2) & pi/2 & 0 & NOP & & & & & S20 & S20 & Output LSH of ArcCosine \\
\hline
\end{tabular}


\section*{Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation}

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode table, the value of \(X\) has been chosen to be 1/SQRT(2.0).


Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation (Continued)


Microcode Table for the ArcSine(x) and ArcCosine(x) Calculation (Concluded)


\section*{ArcTangent Routine Using Chebyshev's Method}

All floating point inputs and outputs are double precision. The output is in radians.

\section*{Steps Required to Perform the Calculation}

STEP 1 - Preprocessing; If the magnitude of the input, \(X\), is greater than 1.0, then the reciprocal must be taken. If the magnitude of \(X\) is not greater than 1.0, then pass \(X\). Let this number (either \(X\) or \(1.0 / X\) ) be referred to as X 1 . Next multiply X 1 times 2.0 , then multiply this resulting number by X 1 . Finally, subtract 1.0 from this last product.

If \(|X|>1.0\)
Then \(\mathrm{X} 1 \leftarrow 1.0 / \mathrm{X}\)
Else \(\mathrm{X} 1 \leftarrow \mathrm{X}\)
\(X 2 \leftarrow X 1 * 2.0 * X 1-1.0\)
STEP 2 - Core Calculation; X 2 in Step 1 will be referred to as ' \(x\) ' in the core calculation.
\[
\begin{aligned}
\mathrm{X3} \leftarrow{} & \mathrm{C}_{\text{series\_atan}} \\
\leftarrow{} & ((((((((((((((((((c_{19} * x+c_{18}) * x+c_{17}) * x+c_{16}) * x+c_{15}) * x+c_{14}) * x \\
& +c_{13}) * x+c_{12}) * x+c_{11}) * x+c_{10}) * x+c_{9}) * x+c_{8}) * x+c_{7}) * x \\
& +c_{6}) * x+c_{5}) * x+c_{4}) * x+c_{3}) * x+c_{2}) * x+c_{1}) * x+c_{0}
\end{aligned}
\]

STEP 3 - Postprocessing; multiply the output of the core calculation by X1. Let this number be referred to as \(X4\). The next computation will yield the answer. If \(X\) was greater than 1.0, then subtract \(X4\) from pi/2. If \(X\) was less than -1.0, then subtract \(X4\) from \(-\mathrm{pi}/2\). If neither of the two conditions above is true, then \(X4\) is the answer.

\(X4 \leftarrow X3 * X1\)

If \(X>1.0\)
\[
\text { Then } \operatorname{ArcTangent}(\mathrm{X}) \leftarrow \mathrm{pi} / 2-\mathrm{X} 4
\]

Else If \(X<-1.0\)
Then \(\operatorname{ArcTangent}(X) \leftarrow-\) pi/2 \(-X 4\)
Else ArcTangent \((X) \leftarrow X 4\)
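
The range reduction and postprocessing can be checked on a host computer independently of the series itself. In the sketch below, `core_series` is a stand-in, not the Chebyshev series: the constants c19 to c0 are not listed in this part of the manual, so the stand-in simply returns atan(X1)/X1 computed from the core input x = 2*X1*X1 - 1, which is the quantity the three steps imply the core must produce. All function names are hypothetical.

```python
import math


def core_series(x: float) -> float:
    """Stand-in for Step 2 (NOT the Chebyshev series).

    Recovers |X1| from x = 2*X1*X1 - 1 and returns atan(X1)/X1.
    """
    x1 = math.sqrt(max(0.0, (x + 1.0) / 2.0))
    return 1.0 if x1 == 0.0 else math.atan(x1) / x1


def arctangent(x_in: float) -> float:
    """Model of Steps 1 to 3 of the ArcTangent routine."""
    # Step 1: take the reciprocal when |X| > 1.0, then form the core input.
    x1 = 1.0 / x_in if abs(x_in) > 1.0 else x_in
    x2 = x1 * 2.0 * x1 - 1.0
    # Step 2: core calculation (stand-in).
    x3 = core_series(x2)
    # Step 3: multiply by X1, then undo the reciprocal if one was taken.
    x4 = x3 * x1
    if x_in > 1.0:
        return math.pi / 2 - x4
    if x_in < -1.0:
        return -math.pi / 2 - x4
    return x4


for value in (-10.0, -0.3, 0.5, 3.0):
    print(value, arctangent(value), math.atan(value))
```

Replacing `core_series` with a nested multiply-add over the routine's constants would turn this into a full model of the calculation.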

\section*{Algorithms for the Three Steps}

Step 1 perform the preprocessing:
\[
\begin{aligned}
& \text{If } |\mathrm{X}|>1.0 \\
& \quad \text{Then } \mathrm{T1} \leftarrow 1.0 / \mathrm{X} \\
& \quad \text{Else } \mathrm{T1} \leftarrow \mathrm{X} && \text{T1 is X1 in Step 1, must be stored externally} \\
& \mathrm{T2} \leftarrow \mathrm{T1} * 2.0 && \mathrm{CREG} \leftarrow \mathrm{T1} \\
& \mathrm{T3} \leftarrow \mathrm{T2} * \mathrm{CREG} \\
& \mathrm{T4} \leftarrow \mathrm{T3}-1.0 && \mathrm{CREG} \leftarrow \mathrm{T4}
\end{aligned}
\]

Step 2 perform the core calculation:
\[
\begin{aligned}
& \mathrm{T5} \leftarrow \mathrm{c}_{19} * \mathrm{CREG} \\
& \mathrm{T6} \leftarrow \mathrm{T5}+\mathrm{c}_{18} \\
& \mathrm{T7} \leftarrow \mathrm{T6} * \mathrm{CREG} \\
& \mathrm{T8} \leftarrow \mathrm{T7}+\mathrm{c}_{17} \\
& \mathrm{T9} \leftarrow \mathrm{T8} * \mathrm{CREG} \\
& \mathrm{T10} \leftarrow \mathrm{T9}+\mathrm{c}_{16} \\
& \mathrm{T11} \leftarrow \mathrm{T10} * \mathrm{CREG} \\
& \mathrm{T12} \leftarrow \mathrm{T11}+\mathrm{c}_{15} \\
& \mathrm{T13} \leftarrow \mathrm{T12} * \mathrm{CREG} \\
& \mathrm{T14} \leftarrow \mathrm{T13}+\mathrm{c}_{14} \\
& \mathrm{T15} \leftarrow \mathrm{T14} * \mathrm{CREG} \\
& \mathrm{T16} \leftarrow \mathrm{T15}+\mathrm{c}_{13} \\
& \mathrm{T17} \leftarrow \mathrm{T16} * \mathrm{CREG} \\
& \mathrm{T18} \leftarrow \mathrm{T17}+\mathrm{c}_{12} \\
& \mathrm{T19} \leftarrow \mathrm{T18} * \mathrm{CREG} \\
& \mathrm{T20} \leftarrow \mathrm{T19}+\mathrm{c}_{11} \\
& \mathrm{T21} \leftarrow \mathrm{T20} * \mathrm{CREG} \\
& \mathrm{T22} \leftarrow \mathrm{T21}+\mathrm{c}_{10} \\
& \mathrm{T23} \leftarrow \mathrm{T22} * \mathrm{CREG} \\
& \mathrm{T24} \leftarrow \mathrm{T23}+\mathrm{c}_{9} \\
& \mathrm{T25} \leftarrow \mathrm{T24} * \mathrm{CREG} \\
& \mathrm{T26} \leftarrow \mathrm{T25}+\mathrm{c}_{8} \\
& \mathrm{T27} \leftarrow \mathrm{T26} * \mathrm{CREG} \\
& \mathrm{T28} \leftarrow \mathrm{T27}+\mathrm{c}_{7} \\
& \mathrm{T29} \leftarrow \mathrm{T28} * \mathrm{CREG} \\
& \mathrm{T30} \leftarrow \mathrm{T29}+\mathrm{c}_{6} \\
& \mathrm{T31} \leftarrow \mathrm{T30} * \mathrm{CREG} \\
& \mathrm{T32} \leftarrow \mathrm{T31}+\mathrm{c}_{5} \\
& \mathrm{T33} \leftarrow \mathrm{T32} * \mathrm{CREG} \\
& \mathrm{T34} \leftarrow \mathrm{T33}+\mathrm{c}_{4} \\
& \mathrm{T35} \leftarrow \mathrm{T34} * \mathrm{CREG} \\
& \mathrm{T36} \leftarrow \mathrm{T35}+\mathrm{c}_{3} \\
& \mathrm{T37} \leftarrow \mathrm{T36} * \mathrm{CREG} \\
& \mathrm{T38} \leftarrow \mathrm{T37}+\mathrm{c}_{2} \\
& \mathrm{T39} \leftarrow \mathrm{T38} * \mathrm{CREG} \\
& \mathrm{T40} \leftarrow \mathrm{T39}+\mathrm{c}_{1} \\
& \mathrm{T41} \leftarrow \mathrm{T40} * \mathrm{CREG} \\
& \mathrm{T42} \leftarrow \mathrm{T41}+\mathrm{c}_{0}
\end{aligned}
\]

Step 3 perform the postprocessing:
\[
\begin{aligned}
& \mathrm{T43} \leftarrow \mathrm{T42} * \mathrm{T1} & & \text{CREG} \leftarrow \mathrm{T43} \\
& \text{If } \mathrm{X} > 1.0 & & \\
& \quad \text{Then } \operatorname{ArcTangent}(\mathrm{X}) \leftarrow \mathrm{pi} / 2 - \text{CREG} & & \\
& \quad \text{Return} & & \\
& \text{If } \mathrm{X} < -1.0 & & \\
& \quad \text{Then } \operatorname{ArcTangent}(\mathrm{X}) \leftarrow -\mathrm{pi} / 2 - \text{CREG} & & \\
& \quad \text{Return} & & \\
& \operatorname{ArcTangent}(\mathrm{X}) \leftarrow \text{CREG} & &
\end{aligned}
\]

\section*{Required System Intervention}

As seen in the algorithm for Step 1, the 'ACT8847 performs a compare. The result of this compare determines what kind of preprocessing is to be performed. In Step 3, there are two more compare operations. The system must therefore perform additional decision making. In addition, the system must store T1, and later (in the postprocessing) provide this value to the 'ACT8847.

\section*{Number of 'ACT8847 Cycles Required to Calculate ArcTangent(x)}

Calculation of ArcTangent( \(x\) ) requires at most 89 cycles (including the divide instruction). In addition, it is assumed that 15 additional cycles are required due to the compare instructions, and resulting system intervention. Therefore, the total number of cycles to perform the \(\operatorname{ArcTangent}(x)\) calculation is 104 .

\section*{Listing of the Chebyshev Constants (c's)}

The constants are represented in IEEE double-precision floating point format.
\[
\begin{aligned}
c_{19} & = \text{BDC4D6CC6308553F} \\
c_{18} & = \text{3DDFFD56FCFD2315} \\
c_{17} & = \text{BDE880782D99D071} \\
c_{16} & = \text{3E0409670CB71218} \\
c_{15} & = \text{BE237C8239249B77} \\
c_{14} & = \text{3E3F1358EC1D6AC0} \\
c_{13} & = \text{BE587CD25F4AFBED} \\
c_{12} & = \text{3E73D2388B0B8A86} \\
c_{11} & = \text{BE9028E921CA6A94} \\
c_{10} & = \text{3EAA814997A38D4E} \\
c_{9} & = \text{BEC5EDAD9A21FE5F} \\
c_{8} & = \text{3EE256E57BA07FAE} \\
c_{7} & = \text{BEFF171F48FDF707} \\
c_{6} & = \text{3F1ACFA9F95CA0DF} \\
c_{5} & = \text{BF37A8464221D994} \\
c_{4} & = \text{3F558DF7A83283C9} \\
c_{3} & = \text{BF749B3E2E433683} \\
c_{2} & = \text{3F955A300BFB8078} \\
c_{1} & = \text{BFBA1494C19FADD4} \\
c_{0} & = \text{3FEBDA7A85BD40CB}
\end{aligned}
\]
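
The listing can be checked numerically on a host machine. The following is a minimal Python sketch, not the 'ACT8847 microcode: it decodes the hex constants with the standard `struct` module and follows the T1 through T43 algorithm of this section in ordinary double arithmetic. The names `hex_to_double` and `chebyshev_atan` are illustrative only, and host rounding may differ from the device's rounding.

```python
import math
import struct

# Chebyshev constants c19 .. c0 from the listing (IEEE double precision, hex).
ATAN_HEX = [
    "BDC4D6CC6308553F", "3DDFFD56FCFD2315", "BDE880782D99D071", "3E0409670CB71218",
    "BE237C8239249B77", "3E3F1358EC1D6AC0", "BE587CD25F4AFBED", "3E73D2388B0B8A86",
    "BE9028E921CA6A94", "3EAA814997A38D4E", "BEC5EDAD9A21FE5F", "3EE256E57BA07FAE",
    "BEFF171F48FDF707", "3F1ACFA9F95CA0DF", "BF37A8464221D994", "3F558DF7A83283C9",
    "BF749B3E2E433683", "3F955A300BFB8078", "BFBA1494C19FADD4", "3FEBDA7A85BD40CB",
]


def hex_to_double(hex_digits):
    """Interpret 16 hex digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_digits))[0]


ATAN_COEFFS = [hex_to_double(h) for h in ATAN_HEX]  # ordered c19 first, c0 last


def chebyshev_atan(x):
    # Step 1: preprocessing (T1 is kept for the postprocessing step).
    t1 = 1.0 / x if abs(x) > 1.0 else x
    creg = t1 * 2.0 * t1 - 1.0          # T4, the core input 'x'
    # Step 2: Horner evaluation, c19 down to c0 (T5 .. T42).
    acc = ATAN_COEFFS[0]
    for c in ATAN_COEFFS[1:]:
        acc = acc * creg + c
    # Step 3: postprocessing.
    t43 = acc * t1
    if x > 1.0:
        return math.pi / 2 - t43
    if x < -1.0:
        return -math.pi / 2 - t43
    return t43
```

For example, `chebyshev_atan(math.sqrt(3.0))` can be compared with `math.pi / 3`, the case used for the microcode table.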

\section*{Pseudocode Table for the ArcTangent(x) Calculation}

Table 60. Pseudocode for Chebyshev ArcTangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) )
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & \begin{tabular}{l}
DB \\
BUS
\end{tabular} & RA REG & \[
\begin{gathered}
\text { RB } \\
\text { REG }
\end{gathered}
\] & \[
\begin{aligned}
& \text { CLK } \\
& \text { MODE }
\end{aligned}
\] & INSTR & \begin{tabular}{l}
MUL \\
PIPE
\end{tabular} & ALU PIPE & \[
\begin{gathered}
\mathbf{P} \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\text { C } \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\mathbf{S} \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\mathbf{Y} \\
\text { BUS }
\end{gathered}
\] & COMMENT \\
\hline 1 & 1.0 MSH & 1.0 LSH & 0 & & & \[
\begin{array}{r}
\text { COMPARE } \\
\text { RA2, |RB2| } \\
\hline
\end{array}
\] & & & & & & & \begin{tabular}{l}
\(X\) is the input \\
Compare 1.0 and \(A B S(X)\)
\end{tabular} \\
\hline 2 & X MSH & X LSH & 1.0 & X & 0 & \[
\begin{gathered}
\mathrm{RA} 2 * \mathrm{RB} 2 \\
\mathrm{RA} 2,|\mathrm{RB} 2|
\end{gathered}
\] & & & & & & & If \(A B S(X)\) is greater than 1.0 perform 1.0/X, otherwise go to cycle 16b \\
\hline 3 & & & 1.0 & \(X\) & 0 & NOP & & & & & & & Wait for system response \\
\hline 4 & & & 1.0 & X & 1 & DIV & & & & & & & Divide: 1.0/X \\
\hline 5 & & & 1.0 & X & 0 & NOP & & & & & & & Wait for Division result \\
\hline 6 & & & 1.0 & \(x\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 7 & & & 1.0 & \(X\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 8 & & & 1.0 & X & 0 & NOP & & & & & & & Wait for Division result \\
\hline 9 & & & 1.0 & \(x\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 10 & & & 1.0 & \(x\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 11 & & & 1.0 & X & 0 & NOP & & & & & & & Wait for Division result \\
\hline 12 & & & 1.0 & X & 0 & NOP & & & & & & & Wait for Division result \\
\hline 13 & & & 1.0 & \(x\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 14 & & & 1.0 & \(x\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 15 & & & 1.0 & \(x\) & 0 & NOP & & & & & & & Wait for Division result \\
\hline 16a & 2.0 MSH & 2.0 LSH & 1.0 & X & 0 & RA17*PR17 & & & P1 & & & P1 & If the reciprocal of \(X\) was \\
\hline 17a & & & 2.0 & X & 0 & RA17*PR17 & & & & & & P1 & performed, then execute \\
\hline 18a & & & 2.0 & X & 0 & CR19*PR19 & RA17*PR17 & & & P1 & & & cycles 16a through 19a \\
\hline 19a & & & 2.0 & X & 0 & CR19*PR19 & & & P2a & & & & In cycles 16a and 17a output P1 and store it for use in cycle 79 \\
\hline 16b & 2.0 MSH & 2.0 LSH & 1.0 & X & 0 & RA17*RB17 & & & & & & & If the reciprocal of \(X\) was \\
\hline
\end{tabular}


Table 60. Pseudocode for Chebyshev ArcTangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) ) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 17b & & & 2.0 & X & 0 & RA17*RB17 & & & & & & & not performed, then execute \\
\hline 18b & X MSH & X LSH & X & X & 0 & RA19*PR19 & RA17*RB17 & & & & & & cycle 16b through 19b \\
\hline 19b & & & X & X & 0 & RA19*PR19 & & & P2b & & & & \\
\hline 20 & & & 2.0 or X & X & 0 & PR21+RB21 & CR19*PR19 or RA19*PR19 & & & & & & The RA register is not used again until cycle 81, so rather than indicating \\
\hline 21 & -1.0 MSH & -1.0 LSH & 2.0 or X & -1.0 & 0 & PR21+RB21 & & & P3a or P3b & & & & the contents of RA as '2.0 or X', \\
\hline 22 & \(c_{19}\) MSH & \(c_{19}\) LSH & 2 or X & \(c_{19}\) & 1 & SR22*RB22 & & & & & S1 & & use the term '2 or X'. Start the core calculation \\
\hline 23 & & & 2 or X & \(c_{19}\) & 0 & PR24+RB24 & SR22*RB22 & & & S1 & & & \\
\hline 24 & \(c_{18}\) MSH & \(c_{18}\) LSH & 2 or X & \(c_{18}\) & 0 & PR24+RB24 & & & P4 & & & & \\
\hline 25 & & & 2 or X & \(c_{18}\) & 1 & SR25*CR25 & & & & & S2 & & \\
\hline 26 & & & 2 or X & \(c_{18}\) & 0 & PR27+RB27 & SR25*CR25 & & & & & & \\
\hline 27 & \(c_{17}\) MSH & \(c_{17}\) LSH & 2 or X & \(c_{17}\) & 0 & PR27+RB27 & & & P5 & & & & \\
\hline 28 & & & 2 or X & \(c_{17}\) & 1 & SR28*CR28 & & & & & S3 & & \\
\hline 29 & & & 2 or X & \(c_{17}\) & 0 & PR30+RB30 & SR28*CR28 & & & & & & \\
\hline 30 & \(c_{16}\) MSH & \(c_{16}\) LSH & 2 or X & \(c_{16}\) & 0 & PR30+RB30 & & & P6 & & & & \\
\hline 31 & & & 2 or X & \(c_{16}\) & 1 & SR31*CR31 & & & & & S4 & & \\
\hline 32 & & & 2 or X & \(c_{16}\) & 0 & PR33+RB33 & SR31*CR31 & & & & & & \\
\hline 33 & \(c_{15}\) MSH & \(c_{15}\) LSH & 2 or X & \(c_{15}\) & 0 & PR33+RB33 & & & P7 & & & & \\
\hline 34 & & & 2 or X & \(c_{15}\) & 1 & SR34*CR34 & & & & & S5 & & \\
\hline 35 & & & 2 or X & \(c_{15}\) & 0 & PR36+RB36 & SR34*CR34 & & & & & & \\
\hline 36 & \(c_{14}\) MSH & \(c_{14}\) LSH & 2 or X & \(c_{14}\) & 0 & PR36+RB36 & & & P8 & & & & \\
\hline
\end{tabular}

Table 60. Pseudocode for Chebyshev ArcTangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) ) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline & CLK & \[
\begin{gathered}
\text { DA } \\
\text { BUS } \\
\hline
\end{gathered}
\] & \[
\begin{aligned}
& \text { DB } \\
& \text { BUS }
\end{aligned}
\] & \[
\begin{array}{r}
\text { RA } \\
\text { REG } \\
\hline
\end{array}
\] & \[
\begin{array}{|c|}
\hline \text { RB } \\
\text { REG } \\
\hline
\end{array}
\] & \[
\begin{gathered}
\text { CLK } \\
\text { MODE }
\end{gathered}
\] & INSTR & MUL PIPE & \[
\begin{aligned}
& \text { ALU } \\
& \text { PIPE }
\end{aligned}
\] & \[
\begin{gathered}
\mathrm{P} \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\mathrm{C} \\
\mathrm{REG}
\end{gathered}
\] & \[
\begin{gathered}
\mathbf{S} \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\mathbf{Y} \\
\text { BUS }
\end{gathered}
\] & COMMENT \\
\hline & 37 & & & 2 or X & c14 & 1 & SR37*CR37 & & & & & S6 & & \\
\hline & 38 & & & 2 or X & \(\mathrm{c}_{14}\) & 0 & PR39 + RB39 & SR37*CR37 & & & & & & \\
\hline & 39 & \(\mathrm{c}_{13} \mathrm{MSH}\) & \(\mathrm{c}_{13}\) LSH & 2 or X & \(\mathrm{c}_{13}\) & 0 & PR39 + RB39 & & & P9 & & & & \\
\hline & 40 & & & 2 or X & \(\mathrm{c}_{13}\) & 1 & SR40*CR40 & & & & & S7 & & \\
\hline & 41 & & & 2 or X & \(\mathrm{c}_{13}\) & 0 & PR42 + RB42 & SR40*CR40 & & & & & & \\
\hline & 42 & \(\mathrm{c}_{12} \mathrm{MSH}\) & \(\mathrm{c}_{12}\) LSH & 2 or X & \(\mathrm{c}_{12}\) & 0 & PR42 + RB42 & & & P10 & & & & \\
\hline & 43 & & & 2 or X & \(\mathrm{c}_{12}\) & 1 & SR43*CR43 & & & & & S8 & & \\
\hline & 44 & & & 2 or X & \(\mathrm{c}_{12}\) & 0 & PR45 + RB45 & SR43*CR43 & & & & & & \\
\hline & 45 & \(\mathrm{c}_{11} \mathrm{MSH}\) & \(\mathrm{c}_{11}\) LSH & 2 or X & \(\mathrm{c}_{11}\) & 0 & PR45 + RB45 & & & P11 & & & & \\
\hline & 46 & & & 2 or X & \(\mathrm{c}_{11}\) & 1 & SR46*CR46 & & & & & S9 & & \\
\hline & 47 & & & 2 or X & \(\mathrm{c}_{11}\) & 0 & PR48 + RB48 & SR46*CR46 & & & & & & \\
\hline & 48 & \(\mathrm{c}_{10} \mathrm{MSH}\) & \(\mathrm{c}_{10}\) LSH & 2 or X & \(\mathrm{c}_{10}\) & 0 & PR48 + RB48 & & & P12 & & & & \\
\hline & 49 & & & 2 or X & \(\mathrm{c}_{10}\) & 1 & SR49*CR49 & & & & & S10 & & \\
\hline & 50 & & & 2 or X & \(\mathrm{c}_{10}\) & 0 & PR51 + RB51 & SR49*CR49 & & & & & & \\
\hline & 51 & \(\mathrm{c}_{9}\) MSH & \(\mathrm{c}_{9}\) LSH & 2 or X & \(\mathrm{c}_{9}\) & 0 & PR51 + RB51 & & & P13 & & & & \\
\hline & 52 & & & 2 or X & \(\mathrm{c}_{9}\) & 1 & SR52*CR52 & & & & & S11 & & \\
\hline & 53 & & & 2 or X & \(\mathrm{c}_{9}\) & 0 & PR54 + RB54 & SR52*CR52 & & & & & & \\
\hline & 54 & \(\mathrm{c}_{8}\) MSH & \(\mathrm{c}_{8}\) LSH & 2 or X & \(\mathrm{c}_{8}\) & 0 & PR54 + RB54 & & & P14 & & & & \\
\hline & 55 & & & 2 or X & \(\mathrm{c}_{8}\) & 1 & SR55*CR55 & & & & & S12 & & \\
\hline & 56 & & & 2 or X & \(\mathrm{c}_{8}\) & 0 & PR57 + RB57 & SR55*CR55 & & & & & & \\
\hline & 57 & \(\mathrm{c}_{7}\) MSH & \(\mathrm{c}_{7}\) LSH & 2 or X & \(\mathrm{c}_{7}\) & 0 & PR57 + RB57 & & & P15 & & & & \\
\hline & 58 & & & 2 or X & \(\mathrm{c}_{7}\) & 1 & SR58*CR58 & & & & & S13 & & \\
\hline & 59 & & & 2 or X & \(\mathrm{c}_{7}\) & 0 & PR60 + RB60 & SR58*CR58 & & & & & & \\
\hline & 60 & \(\mathrm{c}_{6}\) MSH & \(\mathrm{c}_{6}\) LSH & 2 or X & \(\mathrm{c}_{6}\) & 0 & PR60 + RB60 & & & P16 & & & & \\
\hline
\end{tabular}


Table 60. Pseudocode for Chebyshev ArcTangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) ) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 61 & & & 2 or X & \(c_{6}\) & 1 & SR61*CR61 & & & & & S14 & & \\
\hline 62 & & & 2 or X & \(c_{6}\) & 0 & PR63+RB63 & SR61*CR61 & & & & & & \\
\hline 63 & \(c_{5}\) MSH & \(c_{5}\) LSH & 2 or X & \(c_{5}\) & 0 & PR63+RB63 & & & P17 & & & & \\
\hline 64 & & & 2 or X & \(c_{5}\) & 1 & SR64*CR64 & & & & & S15 & & \\
\hline 65 & & & 2 or X & \(c_{5}\) & 0 & PR66+RB66 & SR64*CR64 & & & & & & \\
\hline 66 & \(c_{4}\) MSH & \(c_{4}\) LSH & 2 or X & \(c_{4}\) & 0 & PR66+RB66 & & & P18 & & & & \\
\hline 67 & & & 2 or X & \(c_{4}\) & 1 & SR67*CR67 & & & & & S16 & & \\
\hline 68 & & & 2 or X & \(c_{4}\) & 0 & PR69+RB69 & SR67*CR67 & & & & & & \\
\hline 69 & \(c_{3}\) MSH & \(c_{3}\) LSH & 2 or X & \(c_{3}\) & 0 & PR69+RB69 & & & P19 & & & & \\
\hline 70 & & & 2 or X & \(c_{3}\) & 1 & SR70*CR70 & & & & & S17 & & \\
\hline 71 & & & 2 or X & \(c_{3}\) & 0 & PR72+RB72 & SR70*CR70 & & & & & & \\
\hline 72 & \(c_{2}\) MSH & \(c_{2}\) LSH & 2 or X & \(c_{2}\) & 0 & PR72+RB72 & & & P20 & & & & \\
\hline 73 & & & 2 or X & \(c_{2}\) & 1 & SR73*CR73 & & & & & S18 & & \\
\hline 74 & & & 2 or X & \(c_{2}\) & 0 & PR75+RB75 & SR73*CR73 & & & & & & \\
\hline 75 & \(c_{1}\) MSH & \(c_{1}\) LSH & 2 or X & \(c_{1}\) & 0 & PR75+RB75 & & & P21 & & & & \\
\hline 76 & & & 2 or X & \(c_{1}\) & 1 & SR76*CR76 & & & & & S19 & & \\
\hline 77 & & & 2 or X & \(c_{1}\) & 0 & PR78+RB78 & SR76*CR76 & & & & & & \\
\hline 78 & \(c_{0}\) MSH & \(c_{0}\) LSH & 2 or X & \(c_{0}\) & 0 & PR78+RB78 & & & P22 & & & & \\
\hline 79 & T1 MSH & T1 LSH & 2 or X & T1 & 1 & SR79*RB79 & & & & & S20 & & T1 is either P1 or X, depending on what action was called for at cycle 2. Begin the postprocessing \\
\hline 80 & X MSH & X LSH & 2 or X & T1 & 0 & COMPARE X, 1.0 & & & & & & & If \(X>1.0\) then execute 83 through 86, otherwise skip to 83b. In either case execute 80 through 82 \\
\hline
\end{tabular}

Table 60. Pseudocode for Chebyshev ArcTangent Routine (PIPES2-0 \(=010\), RND1-0 \(=00\) ) (Concluded)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & \begin{tabular}{l}
DA \\
BUS
\end{tabular} & \[
\begin{gathered}
\text { DB } \\
\text { BUS }
\end{gathered}
\] & RA REG & \[
\begin{gathered}
\text { RB } \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\text { CLK } \\
\text { MODE }
\end{gathered}
\] & INSTR & \begin{tabular}{l}
MUL \\
PIPE
\end{tabular} & ALU PIPE & \begin{tabular}{l}
P \\
REG
\end{tabular} & \[
\begin{gathered}
\text { C } \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\mathbf{S} \\
\text { REG }
\end{gathered}
\] & \[
\begin{gathered}
\mathbf{Y} \\
\text { BUS }
\end{gathered}
\] & COMMENT \\
\hline 81 & 1.0 MSH & 1.0 LSH & X & 1.0 & 0 & \[
\begin{gathered}
\text { COMPARE } \\
\times, 1.0 \\
\hline
\end{gathered}
\] & & & P23 & & & & \\
\hline 82 & & & X & 1.0 & 0 & NOP & & & & P23 & & & Wait for system response \\
\hline 83 & & & X & 1.0 & 0 & RB84 - CR84 & & & & & & & Execute if \(\mathrm{X}>1.0\) \\
\hline 84 & pi/2 MSH & pi/2 LSH & X & pi/2 & 0 & RB84 - CR84 & & & & & & & \\
\hline 85 & & & X & pi/2 & 0 & NOP & & & & & S21a & S21a & Output MSH of answer \\
\hline 86 & & & X & pi/2 & 0 & NOP & & & & & S21a & S21a & Output LSH of answer The calculation is done \\
\hline 83b & - 1.0 MSH & -1.0 LSH & X & 1.0 & 0 & \[
\begin{aligned}
& \text { COMPARE } \\
& -1.0, \mathrm{X}
\end{aligned}
\] & & & & & & & \begin{tabular}{l}
Execute if \(\mathrm{X} \leq 1.0\). \\
If \(-1.0>X\) then execute 86b through 89b, otherwise skip to 86c. In either case execute 83b thru 85b
\end{tabular} \\
\hline 84b & X MSH & X LSH & -1.0 & X & 0 & \[
\begin{aligned}
& \text { COMPARE } \\
& -1.0, \mathrm{x} \\
& \hline
\end{aligned}
\] & & & & P23 & & & \\
\hline 85b & & & \(-1.0\) & X & 0 & NOP & & & & P23 & & & Wait for system response \\
\hline 86b & & & -1.0 & X & 0 & RB87-CR87 & & & & & & & Execute if -1.0>X \\
\hline 87b & \[
\begin{aligned}
& -\mathrm{pi} / 2 \\
& \mathrm{MSH}
\end{aligned}
\] & \[
\begin{gathered}
-\mathrm{pi} / 2 \\
\text { LSH }
\end{gathered}
\] & - 1.0 & -pi/2 & 0 & RB87 - CR87 & & & & & & & \\
\hline 88b & & & -1.0 & -pi/2 & 0 & NOP & & & & & S21b & S21b & Output MSH of answer \\
\hline 89b & & & -1.0 & -pi/2 & 0 & NOP & & & & & S21b & S21b & Output LSH of answer. The calculation is done. \\
\hline 86c & & & -1.0 & X & 1 & PASS(CR86) & & & & & & & Execute if X is within the range \([-1,1]\). Pass CREG \\
\hline 87c & & & -1.0 & X & 0 & NOP & & & & & S21c & S21c & Output MSH of answer \\
\hline 88c & & & -1.0 & X & 0 & NOP & & & & & S21c & S21c & Output LSH of answer \\
\hline
\end{tabular}


\section*{Microcode Table for the ArcTangent(x) Calculation}

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode table, the value of \(X\) has been chosen to be SQRT(3.0).




Microcode Table for the ArcTangent(x) Calculation (Concluded)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline PA & DA & DB & PB & ENA & ENB CLKC & PIPES & CLKMOD & CONFIG & SELOP & \(\overline{\text{RESET}}\) & \(\overline{\text{HALT}}\) & ENC & FLOWC & INSTR & RND & FAST & SRCC & BYTEP & SELS & TEST & SELY & OEY & OES & OEC \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3F955A30 & 0BFB8078 & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 100 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & BFBA1494 & C19FADD4 & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 1 & 3 & 9F & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3FEBDA7A & 85BD40CB & F & 0 & 1 & 2 & 0 & 3 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3FE279A7 & 4590331C & F & 0 & 1 & 2 & 1 & 3 & BF & 1 & 1 & 1 & 0 & 1C0 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3FFBB67A & E8584CAB & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 182 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 3FF00000 & 00000000 & F & 1 & 1 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 182 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline \multirow[t]{2}{*}{F} & 00000000 & 00000000 & F & 0 & 0 5 & 2 & 0 & 3 & FF & 1 & 1 & 0 & 0 & 300 & 0 & 0 & 1 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & F7 & 1 & 1 & 1 & 0 & 183 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline & 3FF921FB & 54442D18 & F & 0 & 1 & 2 & 0 & 3 & F7 & 1 & 1 & 1 & 0 & 183 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 1 & 0 & 0 & 0 \\
\hline F & 00000000 & 00000000 & F & 0 & 0 & 2 & 0 & 3 & FF & 1 & 1 & 1 & 0 & 300 & 0 & 0 & 0 & 0 & 3 & 3 & 0 & 0 & 0 & \\
\hline
\end{tabular}

\section*{Exponential Routine Using Chebyshev's Method}

All floating point inputs and outputs are double precision.

\section*{Steps Required to Perform the Calculation}

STEP 1 - Preprocessing; first multiply the input, \(X\), by \(\log _{2} e\) (yielding X1). Next, convert this product to an integer, using truncate mode (yielding X2). Form the variable EX by adding 1024 to X2. EX is used in the postprocessing part of the routine. Subtract 1023 from EX to find the variable N (N is actually X2 incremented by 1). Convert N to a floating point number (yielding X3). Subtract X1 from X3, multiply this difference by 2.0, and then finally subtract 1.0. This last computation is the input to the core routine.
\[
\begin{aligned}
\mathrm{X1} & \leftarrow \mathrm{X} * \log _{2} e \\
\mathrm{X2} & \leftarrow \operatorname{TRUNC}(\mathrm{X1}) \\
\mathrm{EX} & \leftarrow 1024+\mathrm{X2} \\
\mathrm{N} & \leftarrow \mathrm{EX}-1023 \\
\mathrm{X3} & \leftarrow \operatorname{DOUBLE}(\mathrm{N}) \\
\mathrm{X4} & \leftarrow 2.0 *(\mathrm{X3}-\mathrm{X1})-1.0
\end{aligned}
\]
STEP 2 - Core Calculation; X4 in Step 1 will be referred to as '\(x\)' in the core calculation.
\[
\mathrm{X5} \leftarrow \mathrm{C}_{\text{series\_exp}} \leftarrow ((((((((((c_{11} * x+c_{10}) * x+c_{9}) * x+c_{8}) * x+c_{7}) * x+c_{6}) * x+c_{5}) * x+c_{4}) * x+c_{3}) * x+c_{2}) * x+c_{1}) * x+c_{0}
\]

STEP 3 - Postprocessing; multiply the output of the core calculation by \(2^{N}\). To generate \(2^{N}\), perform the following: shift the variable EX (which was calculated in Step 1) left logical by 20 positions (bits). The resulting bit pattern will be the double precision floating point representation of \(2^{N}\). However, the 'ACT8847 will not at this point recognize the bit pattern as a floating point number, so this number must be output from the \(Y\) bus and then input (declaring the input to be a double precision floating point number) on the input bus. Now the 'ACT8847 will process \(2^{N}\) as a double float, and so the core output, \(X5\), can be multiplied by \(2^{N}\) to produce the final result. 'SLL' means to shift left logical.
\[
\begin{aligned}
\mathrm{X6} & \leftarrow \mathrm{EX} \text{ SLL by 20 bits} \\
\text{Y bus} & \leftarrow \mathrm{X6} \\
\text{DA bus} & \leftarrow \text{Y bus} \\
\operatorname{Exp}(X) & \leftarrow \mathrm{X5} * \mathrm{X6}
\end{aligned}
\]
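
The shift in Step 3 works because EX is N plus the IEEE double-precision exponent bias of 1023, and the exponent field starts at bit 20 of the most significant 32-bit half. A minimal Python sketch of that bit manipulation follows; the function name `two_to_the_n_from_ex` is hypothetical, and placing EX in the most significant half with a zero least significant half is an assumption made for illustration.

```python
import struct


def two_to_the_n_from_ex(ex):
    """Reinterpret EX shifted left by 20 bits as the MSH of an IEEE double.

    EX = N + 1023, so the resulting double equals 2**N.
    """
    if not 0 < ex < 2047:
        raise ValueError("EX must be a normal biased exponent (1..2046)")
    msh = (ex << 20) & 0xFFFFFFFF   # EX SLL by 20 bits: sign = 0, fraction = 0
    lsh = 0                         # low 32 fraction bits are all zero
    return struct.unpack(">d", struct.pack(">II", msh, lsh))[0]
```

With EX = 1033 (N = 10) the function returns a double equal to 2 to the power 10.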

\section*{Algorithms for the Three Steps}

Step 1 perform the preprocessing:
\[
\begin{aligned}
& \mathrm{T1} \leftarrow \mathrm{X} * \log _{2} e \\
& \mathrm{T2} \leftarrow \operatorname{INT}(\mathrm{T1}) \\
& \mathrm{T3} \leftarrow 1024+\mathrm{T2} \\
& \mathrm{T4} \leftarrow \mathrm{T3}-1023 \\
& \mathrm{T5} \leftarrow 1 * \mathrm{T4} \\
& \mathrm{T6} \leftarrow \operatorname{DOUBLE}(\mathrm{T5}) \\
& \mathrm{T7} \leftarrow \mathrm{T6}-\text{CREG} \\
& \mathrm{T8} \leftarrow 2.0 * \mathrm{T7} \\
& \mathrm{T9} \leftarrow \mathrm{T8}-1.0
\end{aligned}
\]

Step 2 perform the core calculation:
\[
\begin{aligned}
& \mathrm{T10} \leftarrow c_{11} * \text{CREG} & & \text{CREG} \leftarrow \mathrm{T9} \\
& \mathrm{T11} \leftarrow \mathrm{T10}+c_{10} & & \\
& \mathrm{T12} \leftarrow \mathrm{T11} * \text{CREG} & & \\
& \mathrm{T13} \leftarrow \mathrm{T12}+c_{9} & & \\
& \mathrm{T14} \leftarrow \mathrm{T13} * \text{CREG} & & \\
& \mathrm{T15} \leftarrow \mathrm{T14}+c_{8} & & \\
& \mathrm{T16} \leftarrow \mathrm{T15} * \text{CREG} & & \\
& \mathrm{T17} \leftarrow \mathrm{T16}+c_{7} & & \\
& \mathrm{T18} \leftarrow \mathrm{T17} * \text{CREG} & & \\
& \mathrm{T19} \leftarrow \mathrm{T18}+c_{6} & & \\
& \mathrm{T20} \leftarrow \mathrm{T19} * \text{CREG} & & \\
& \mathrm{T21} \leftarrow \mathrm{T20}+c_{5} & & \\
& \mathrm{T22} \leftarrow \mathrm{T21} * \text{CREG} & & \\
& \mathrm{T23} \leftarrow \mathrm{T22}+c_{4} & & \\
& \mathrm{T24} \leftarrow \mathrm{T23} * \text{CREG} & & \\
& \mathrm{T25} \leftarrow \mathrm{T24}+c_{3} & & \\
& \mathrm{T26} \leftarrow \mathrm{T25} * \text{CREG} & & \\
& \mathrm{T27} \leftarrow \mathrm{T26}+c_{2} & & \\
& \mathrm{T28} \leftarrow \mathrm{T27} * \text{CREG} & & \\
& \mathrm{T29} \leftarrow \mathrm{T28}+c_{1} & & \\
& \mathrm{T30} \leftarrow \mathrm{T29} * \text{CREG} & & \\
& \mathrm{T31} \leftarrow \mathrm{T30}+c_{0} & &
\end{aligned}
\]

Step 3 perform the postprocessing:
\[
\begin{aligned}
& \mathrm{T32} \leftarrow \mathrm{T3} \text{ SLL by 20 bits} \\
& \text{Y bus} \leftarrow \mathrm{T32} \\
& \text{DA bus} \leftarrow \text{Y bus } (=\mathrm{T32}) \\
& \operatorname{Exp}(X) \leftarrow \mathrm{T32} * \text{CREG}
\end{aligned}
\]

Notes on Step 3: T3 is shifted 20 bits left to form T32. T32 is output and then input again; two cycles are required to input both halves of T32. CREG \(\leftarrow\) T31, so the final multiply uses the result of the core calculation.

\section*{Required System Intervention}

The system is required to store the variable EX, and then later provide this variable. In addition, the system is required to route the variable T32 (in Step 3) from the Y bus to the DA bus.

\section*{Number of 'ACT8847 Cycles Required to Calculate Exp(x)}

Calculation of \(\operatorname{Exp}(x)\) requires 52 cycles. Since there are no decisions which the system is required to perform, the total number of cycles to perform the \(\operatorname{Exp}(x)\) calculation is 52.

\section*{Listing of the Chebyshev Constants (c's)}

The constants are represented in IEEE double-precision floating point format.
\[
\begin{aligned}
c_{11} & = \text{BD45A7FC05D3B501} \\
c_{10} & = \text{3D957BFD2DBF487C} \\
c_{9} & = \text{BDE351B821AC16D5} \\
c_{8} & = \text{3E2F5B0E17440879} \\
c_{7} & = \text{BE769E51EE631E87} \\
c_{6} & = \text{3EBC8D7530548DD5} \\
c_{5} & = \text{BEFEE4FD234A4926} \\
c_{4} & = \text{3F3BDB696E8987AC} \\
c_{3} & = \text{BF741839EB88156E} \\
c_{2} & = \text{3FA5BE298ADF0369} \\
c_{1} & = \text{BFCF5E46537AB906} \\
c_{0} & = \text{3FE6A09E667F3BCC}
\end{aligned}
\]
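
As a host-side check of the three steps, the following minimal Python sketch decodes the constants with the standard `struct` module and evaluates the routine in ordinary double arithmetic. It is not the 'ACT8847 microcode. The names `hex_to_double` and `chebyshev_exp` are illustrative, `math.log2(math.e)` stands in for the \(\log _{2} e\) operand, and the sketch is exercised only for non-negative X, where truncation keeps the core input within (-1, 1]; that restriction is an assumption of this sketch, not a statement from the manual.

```python
import math
import struct

# Chebyshev constants c11 .. c0 from the listing (IEEE double precision, hex).
EXP_HEX = [
    "BD45A7FC05D3B501", "3D957BFD2DBF487C", "BDE351B821AC16D5", "3E2F5B0E17440879",
    "BE769E51EE631E87", "3EBC8D7530548DD5", "BEFEE4FD234A4926", "3F3BDB696E8987AC",
    "BF741839EB88156E", "3FA5BE298ADF0369", "BFCF5E46537AB906", "3FE6A09E667F3BCC",
]


def hex_to_double(hex_digits):
    """Interpret 16 hex digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_digits))[0]


EXP_COEFFS = [hex_to_double(h) for h in EXP_HEX]  # ordered c11 first, c0 last


def chebyshev_exp(x):
    # Step 1: preprocessing.
    x1 = x * math.log2(math.e)
    x2 = math.trunc(x1)                 # truncate mode
    ex = 1024 + x2                      # biased exponent of 2**N
    n = ex - 1023
    x4 = 2.0 * (float(n) - x1) - 1.0    # core input 'x'
    # Step 2: Horner evaluation, c11 down to c0.
    x5 = EXP_COEFFS[0]
    for c in EXP_COEFFS[1:]:
        x5 = x5 * x4 + c
    # Step 3: EX SLL 20 in the 32-bit MSH is bit 52 of the 64-bit pattern.
    x6 = struct.unpack(">d", struct.pack(">Q", ex << 52))[0]
    return x5 * x6
```

For example, `chebyshev_exp(6.25)` can be compared with `math.exp(6.25)`, the input value used for the microcode table.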

\section*{Pseudocode Table for the \(\operatorname{Exp}(x)\) Calculation}

Table 61. Pseudocode for Chebyshev Exponential Routine (PIPES2-0 \(=010\), RND1-0)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & \begin{tabular}{c} 
DA \\
BUS
\end{tabular} & \begin{tabular}{c} 
DB \\
BUS
\end{tabular} & \begin{tabular}{c} 
RA \\
REG
\end{tabular} & \begin{tabular}{c} 
RB \\
REG
\end{tabular} & \begin{tabular}{c} 
CLK \\
MODE
\end{tabular} & INSTR & \begin{tabular}{c} 
MUL \\
PIPE
\end{tabular} & \begin{tabular}{c} 
ALU \\
PIPE
\end{tabular} & \begin{tabular}{c} 
P \\
REG
\end{tabular} & \begin{tabular}{c} 
C \\
REG
\end{tabular} & \begin{tabular}{c} 
S \\
REG
\end{tabular} & \begin{tabular}{c} 
Y \\
BUS
\end{tabular} & COMMENT
\end{tabular}


Table 61. Pseudocode for Chebyshev Exponential Routine (PIPES2-0 \(=010\), RND1-0) (Continued)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & DA BUS & DB BUS & RA REG & RB REG & CLK MODE & INSTR & MUL PIPE & ALU PIPE & P REG & C REG & S REG & Y BUS & COMMENT \\
\hline 21 & \(c_{8}\) MSH & \(c_{8}\) LSH & -1023 & \(c_{8}\) & 0 & PR21+RB21 & & & P6 & & & & \\
\hline 22 & & & -1023 & \(c_{8}\) & 1 & SR22*CR22 & & & & & S9 & & \\
\hline 23 & & & -1023 & \(c_{8}\) & 0 & PR24+RB24 & SR22*CR22 & & & & & & \\
\hline 24 & \(c_{7}\) MSH & \(c_{7}\) LSH & -1023 & \(c_{7}\) & 0 & PR24+RB24 & & & P7 & & & & \\
\hline 25 & & & -1023 & \(c_{7}\) & 1 & SR25*CR25 & & & & & S10 & & \\
\hline 26 & & & -1023 & \(c_{7}\) & 0 & PR27+RB27 & SR25*CR25 & & & & & & \\
\hline 27 & \(c_{6}\) MSH & \(c_{6}\) LSH & -1023 & \(c_{6}\) & 0 & PR27+RB27 & & & P8 & & & & \\
\hline 28 & & & -1023 & \(c_{6}\) & 1 & SR28*CR28 & & & & & S11 & & \\
\hline 29 & & & -1023 & \(c_{6}\) & 0 & PR30+RB30 & SR28*CR28 & & & & & & \\
\hline 30 & \(c_{5}\) MSH & \(c_{5}\) LSH & -1023 & \(c_{5}\) & 0 & PR30+RB30 & & & P9 & & & & \\
\hline 31 & & & -1023 & \(c_{5}\) & 1 & SR31*CR31 & & & & & S12 & & \\
\hline 32 & & & -1023 & \(c_{5}\) & 0 & PR33+RB33 & SR31*CR31 & & & & & & \\
\hline 33 & \(c_{4}\) MSH & \(c_{4}\) LSH & -1023 & \(c_{4}\) & 0 & PR33+RB33 & & & P10 & & & & \\
\hline 34 & & & -1023 & \(c_{4}\) & 1 & SR34*CR34 & & & & & S13 & & \\
\hline 35 & & & -1023 & \(c_{4}\) & 0 & PR36+RB36 & SR34*CR34 & & & & & & \\
\hline 36 & \(c_{3}\) MSH & \(c_{3}\) LSH & -1023 & \(c_{3}\) & 0 & PR36+RB36 & & & P11 & & & & \\
\hline 37 & & & -1023 & \(c_{3}\) & 1 & SR37*CR37 & & & & & S14 & & \\
\hline 38 & & & -1023 & \(c_{3}\) & 0 & PR39+RB39 & SR37*CR37 & & & & & & \\
\hline 39 & \(c_{2}\) MSH & \(c_{2}\) LSH & -1023 & \(c_{2}\) & 0 & PR39+RB39 & & & P12 & & & & \\
\hline 40 & & & -1023 & \(c_{2}\) & 1 & SR40*CR40 & & & & & S15 & & \\
\hline 41 & & & -1023 & \(c_{2}\) & 0 & PR42+RB42 & SR40*CR40 & & & & & & \\
\hline 42 & \(c_{1}\) MSH & \(c_{1}\) LSH & -1023 & \(c_{1}\) & 0 & PR42+RB42 & & & P13 & & & & \\
\hline 43 & & & -1023 & \(c_{1}\) & 1 & SR43*CR43 & & & & & S16 & & \\
\hline
\end{tabular}

Table 61. Pseudocode for Chebyshev Exponential Routine (PIPES2-0 =010, RND1-0) (Concluded)
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CLK & \begin{tabular}{c} 
DA \\
BUS
\end{tabular} & \begin{tabular}{c} 
DB \\
BUS
\end{tabular} & \begin{tabular}{c} 
RA \\
REG
\end{tabular} & \begin{tabular}{c} 
RB \\
REG
\end{tabular} & \begin{tabular}{c} 
CLK \\
MODE
\end{tabular} & INSTR & \begin{tabular}{c} 
MUL \\
PIPE
\end{tabular} & \begin{tabular}{c} 
ALU \\
PIPE
\end{tabular} & \begin{tabular}{c} 
P \\
REG
\end{tabular} & \begin{tabular}{c} 
C \\
REG
\end{tabular} & \begin{tabular}{c} 
S \\
REG
\end{tabular} & \begin{tabular}{c} 
Y \\
BUS
\end{tabular} & COMMENT
\end{tabular}

\section*{Microcode Table for the \(\operatorname{Exp}(x)\) Calculation}

All numbers are in hex. Any field with a length that is not a multiple of 4 is right justified and zero filled. For the microcode table, the value of \(X\) has been chosen to be 6.25.

\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline F 40190000 & 00000 & F & 00 & & & & & & & & & & & & & & & & & & \\
\hline 3FF71547 & 652B82FE & F & 11 & 1 & 2 & & FF & 1 & 1 & 1 & 0 & 1 CO & 0 & & & 0 & 3 & & & & 0 \\
\hline 00000000 & 00000000 & F & 00 & & 2 & & FB & 1 & & & 0 & 1A3 & & & & 0 & 3 & & & & \\
\hline 000 & 00000000 & F & 00 & 0 & 2 & & FB & 1 & 1 & & 0 & 1A & 1 & & & 0 & 3 & & & & \\
\hline 00000400 & 00000000 & F & 0 & 0 」 & 2 & 0 & FE & 1 & 1 & 0 & 0 & 200 & 0 & & & 0 & 3 & & 0 & O & \\
\hline 01 & 00000000 & F & 10 & 0 & 2 & & FE & 1 & 1 & & 0 & 200 & 0 & & & 0 & 3 & & & & \\
\hline 00 & 00000001 & F & 01 & & 2 & & BF & 1 & 1 & & 0 & 40 & 0 & & & 0 & 3 & & & & \\
\hline F 00000000 & 00000000 & F & 00 & & 2 & & FB & 1 & 1 & & 0 & 1 A & 0 & & 0 & 0 & 3 & & 0 & 0 & \\
\hline F 00000000 & 00000000 & F & 00 & 0 & 2 & 1 & F6 & 1 & 1 & & 0 & 18 & 0 & & & 0 & 3 & & 0 & & \\
\hline 000000 & 00000000 & F & 01 & ， & 2 & & BF & 1 & 1 & & 0 & 1 C & 0 & 0 & & 0 & 3 & & 0 & 0 & \\
\hline 0 & 0000 & & & & 2 & & FB & 1 & 1 & & 0 & 180 & 0 & & & 0 & 3 & & 0 & & \\
\hline F BFF00000 & 000 & F & 0 & & 2 & 0 & FB & 1 & 1 & & 0 & 18 & 0 & & & 0 & 3 & & 0 & O & \\
\hline BD45A7FC & 05D3 & F & 01 & 1 & 2 & & BF & 1 & 1 & 1 & 0 & 1 C & 0 & 0 & 0 & 0 & 3 & & 0 & 0 & \\
\hline 00 & 00000000 & F & & 0 － & 2 & & FB & 1 & 1 & 0 & 0 & 18 & 0 & & & 0 & 3 & & 0 & & \\
\hline 3D9 & 2DBF487C & F & 0 & 1 － & 2 & 0 & & 1 & 1 & & 0 & 80 & 0 & & & 03 & & & & & \\
\hline 00000000 & 0000 & F & 00 & 0 & 2 & 13 & 9F & 1 & 1 & & 0 & C0 & 0 & & 0 & 0 & 3 & & 0 & 0 & \\
\hline 00000000 & 00000000 & F & 00 & 0 & 2 & 0 & FB & 1 & 1 & & 0 & 180 & 0 & & 0 & 0 & 3 & & 0 & 0 & \\
\hline BDE351B8 & 21AC16D & F & 01 & 1 & 2 & 03 & FB & 1 & 1 & & 0 & 180 & 0 & & 0 & 0 & 3 & & 0 & & \\
\hline 0000000 & 00000000 & F & 00 & 0 & 2 & 13 & 9F & 1 & 1 & & 0 & 1 CO & 0 & & & O & 3 & & & 0 & \\
\hline F 00000000 & 00000000 & F & 00 & & 2 & 0 & FB & 1 & 1 & 1 & 0 & 180 & 0 & 0 & 0 & 03 & 3 & & 0 & 0 & \\
\hline 3E2F5B0E & 17440879 & & 01 & 1 & 2 & & FB & 1 & 1 & 1 & & & 0 & & 0 & 03 & 3 & & & & \\
\hline
\end{tabular}


\section*{High-Speed Vector Math and 3-D Graphics}

\section*{Introduction}

Texas Instruments SN74ACT8837 and SN74ACT8847 floating point units (FPU) are designed to execute high-speed, high-accuracy mathematical computations. The devices are especially suited for matrix manipulations such as those used in graphics or digital signal processing. These FPUs multiply and add data elements by executing sequences of microprogrammed calculations to form new matrices. Each device may be configured for either single- or double-precision operation. Single-precision operation is assumed throughout this report.

The 'ACT8847 is a functional superset of the 'ACT8837 and operates at higher clock rates (up to 33 MHz) than the 16-MHz 'ACT8837. Unlike the 'ACT8837, the 'ACT8847 can perform integer and logical operations and has built-in, hardwired algorithms for division and square root operations.

This application report outlines the timing, data flow, and programming for several common data vector calculations and matrix transformations. Further, it illustrates some of the programming "tricks" that yield the fastest operation. Throughout, this document compares the timing schemes for programs in which all registers, including the ALU and multiplier internal pipeline registers, are enabled ("pipelined" mode) with those for equivalent programs in which the internal pipeline registers are disabled ("unpiped" mode). Equations are provided to help the programmer select the more efficient mode, and performance figures are included for both devices, with times given for 15-MHz and 30-MHz operation.

This report begins by covering simple vector arithmetic operations, which are categorized as "computational" or "compare" functions for convenience. This document then compares these operations as they are used in graphics applications to perform three-dimensional coordinate transformations, perspective viewing, and clipping.

\section*{SN74ACT8837 and SN74ACT8847 Floating Point Units}

Both the 'ACT8837 and 'ACT8847 floating point units (FPU) combine a multiplier and an arithmetic-logic unit (ALU) in a single microprogrammable VLSI device. These devices are implemented in TI's advanced one-micron CMOS technology and are fully compatible with the IEEE standard for binary floating point arithmetic, STD 754-1985, for either single- or double-precision operation.

Instruction inputs can select independent ALU operation, independent multiplier operation, or simultaneous ALU/multiplier operation. Each FPU can handle three types of data input formats. The ALU accepts data operands in integer format or IEEE floating point format. In the 'ACT8837, integers are converted to normalized floating point numbers with biased exponents prior to further processing. A third type of operand, denormalized numbers, can also be processed after the ALU has converted them to "wrapped" numbers, which are explained in detail in the SN74ACT8800 Family Data Manual. The 'ACT8837 multiplier operates only on normalized floating point numbers or wrapped numbers. The 'ACT8847 multiplier also operates on integer operands.

Data enters the 'ACT8837 or 'ACT8847 through two 32-bit data buses, DA and DB (see Figures 74 and 75), which can be configured to operate as a single 64-bit data bus for double-precision operations. Data can be latched in a 64-bit temporary register or loaded directly into the input registers, RA and RB, which pass data to the multiplier and ALU.

A clock-mode control allows the temporary register to be clocked on the rising or falling edge of the clock to support double-precision ALU operations at the same rate as single-precision operations. Using the temporary register, double-precision numbers on a single 32-bit input bus can be loaded in one clock cycle.

The input registers RA and RB are the first of three levels of internal data registers. Additionally, the ALU and multiplier each have an internal pipeline register and an output register. The ALU's output register is denoted by "S" (sum), and the multiplier's output register is denoted by "P" (product). Any or all of these internal registers may be bypassed.

A 64-bit constant register (C) with a separate clock is provided for temporary storage of a multiplier result, ALU result, or constant for feedback to the multiplier and ALU. An instruction register and a status register are also included.

Four multiplexers select the multiplier and ALU operands from the input, C, S, or P registers. Results are output on the 32-bit Y bus; a Y output multiplexer selects the most or least significant half of the result for output.

In addition to add, subtract, and multiply functions, the 'ACT8837 can be programmed to perform floating point division using a Newton-Raphson algorithm. Absolute value conversions, floating point-to-integer and integer-to-floating point conversions, and a compare instruction are also available.

The 'ACT8847 FPU is fully compatible with IEEE Standard 754-1985 for addition, subtraction, multiplication, division, square root, and comparison. The 'ACT8847 FPU also performs integer arithmetic, logical operations, and logical shifts. Additionally, absolute value conversions and floating point-to-integer and integer-to-floating point conversions are available.


Figure 74. SN74ACT8837 Floating Point Unit


Figure 75. SN74ACT8847 Floating Point Unit

For both the 'ACT8837 and 'ACT8847, the ALU and multiplier can operate in parallel to perform sums of products and products of sums. Detailed information regarding the instruction inputs for the various 'ACT8837 and 'ACT8847 configurations and operations is given in the SN74ACT8800 Family Data Manual.

\section*{Mathematical Processing Applications}

TI's SN74ACT8837 and SN74ACT8847 high-speed floating point units (FPU) are designed to perform high-accuracy, computationally-intensive mathematical operations. In particular, these FPUs can meet the computational demands of high-end graphics workstations and advanced signal processing. Both applications involve repetitive computations on arrays of data typically expressed as vector arithmetic operations.

For example, the calculation of the sum of products, or multiply-accumulate function, is frequently used in both signal and graphics processing. In general form, the sum of products equation is:
\[
S=\sum_{i=1}^{n} k_{i} x_{i} \text {, for coefficients } k_{i} \text { and data } x_{i} \text {. }
\]

This sum of products is the central function involved in multiplying matrices. Such matrices might represent a system of linear differential equations or the geometrical transformation of a graphic object. Specifically, an \(n \times n\) matrix \(A\) multiplied by an \(n \times m\) matrix \(B\) yields an \(n \times m\) matrix \(C\) whose elements \(c_{i j}\) are given by:
\[
c_{i j}=\sum_{k=1}^{n} a_{i k} \times b_{k j} \text { for } i=1, \ldots, n \text { and } j=1, \ldots, m
\]

The 'ACT8837 and 'ACT8847 are designed to handle this kind of parallel multiplication and addition efficiently.
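As a rough illustration of how the matrix product reduces to repeated sums of products, the following Python sketch computes each \(c_{i j}\) as a multiply-accumulate over \(k\). The names `sum_of_products` and `mat_mul` are illustrative only and do not correspond to any TI tool or instruction; the code models the arithmetic, not the timing of the FPU.

```python
def sum_of_products(k, x):
    """Return S = sum(k[i] * x[i]), accumulated one product at a time."""
    acc = 0.0
    for ki, xi in zip(k, x):
        acc += ki * xi  # one multiply feeding one add (multiply-accumulate)
    return acc


def mat_mul(a, b):
    """Multiply an n x n matrix a by an n x m matrix b (lists of rows)."""
    n, m = len(a), len(b[0])
    # Extract the columns of b so each c[i][j] is a plain sum of products.
    cols = [[b[k][j] for k in range(n)] for j in range(m)]
    return [[sum_of_products(a[i], cols[j]) for j in range(m)] for i in range(n)]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]
C = mat_mul(A, B)  # 2 x 3 result, one sum of products per element
```

Each element of `C` costs \(n\) multiplies and \(n\) adds, which is why a device that overlaps the two operations suits this workload.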

\section*{Graphics Applications}

The basic principle of graphics processing is that any object can be reduced to a combination of points, lines, and polygons and then defined as a collection of points in three-dimensional space. Because points, planes, transformation matrices and other common data structures are vectors, most of the computations involved in graphics processing are vector operations.

Computations for a 3-D graphics display are highly involved due to the complexity introduced by the z-axis. Viewing an object from a particular perspective involves transforming the object's world coordinates, or its coordinates in the model space, into viewing, or eyepoint, coordinates. A series of translations and rotations map the viewing system axes onto the world coordinate axes. Each individual point must be translated, rotated and, if necessary, scaled in a proper order. Once the coordinate transformation is complete, the coordinates are clipped to a viewing volume. Clipping algorithms employ arithmetic operations to determine whether an object, or part of an object, is inside or outside a pyramidal volume. Hidden surface routines may then be employed to delete surfaces that fall behind a "nearer" surface from the viewer's perspective.

Matrix arithmetic is required for scaling, rotating, translating, or shearing an object, as well as for the final process of projecting its visible parts to a two-dimensional frame buffer. Any sequence of these transformations can be represented as a single matrix formed by concatenating the matrices for the individual operations. The generalized \(4 \times 4\) matrix for transforming a three-dimensional object is shown below, partitioned into four component matrices, each of which produces a specific effect on the image. The \(3 \times 3\) matrix produces linear transformation in the form of scaling, shearing, and rotation. The \(1 \times 3\) row matrix produces translation, while the \(3 \times 1\) column matrix produces perspective transformation with multiple vanishing points. The final single-element \(1 \times 1\) matrix produces overall scaling.
\[
T=\left[\begin{array}{c|c}
3 \times 3 & 3 \times 1 \\
\hline 1 \times 3 & 1 \times 1
\end{array}\right]
\]

Overall operation of the matrix \(T\) on the position vectors of a graphics object produces a combination of shearing, rotation, reflection, translation, perspective, and overall scaling.
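A minimal sketch of applying such a matrix to one point is given here. It assumes the row-vector convention \([x\ y\ z\ 1] \times T\), which is inferred from the \(1 \times 3\) row being the translation part; the function name `transform_point` and the sample matrix are hypothetical.

```python
def transform_point(point, t):
    """Apply a 4 x 4 matrix t to (x, y, z) as the row vector [x, y, z, 1] * t."""
    row = [point[0], point[1], point[2], 1.0]
    # Each output component is a four-term sum of products.
    out = [sum(row[k] * t[k][j] for k in range(4)) for j in range(4)]
    h = out[3]  # homogeneous coordinate from the 3 x 1 and 1 x 1 partitions
    return [out[0] / h, out[1] / h, out[2] / h]


# Upper-left 3 x 3: uniform scaling by 2. Bottom row: translation (1, 0, -3).
# Right column is zero (no perspective); lower-right element is 1.
T = [
    [2.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [1.0, 0.0, -3.0, 1.0],
]
moved = transform_point((1.0, 1.0, 1.0), T)
```

Because concatenated transformations collapse into one matrix, the per-point cost stays at four sums of products regardless of how many individual operations were combined.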

\section*{Vector Arithmetic}

Programs that require repetitive computations on multiple sets of operands lend themselves to vector-processing algorithms, in which the operands are viewed as succeeding elements of long "data vectors." The next two sections outline the programming for commonly-used vector operations. Most of these examples conclude with a comparison of program timing for pipelined (internal pipeline registers enabled) and unpiped (internal pipeline registers disabled) operation. For convenience, the operations are labeled "computational," which includes simple and compounded adds, multiplies, and divides, or "compare," which can be used to select maximum or minimum values from succeeding pairs of numbers or from a list.

\section*{Computational Operations on Data Vectors}

This section covers the following vector operations: vector add, vector multiply, vector divide, sum of products (also called inner, scalar, or dot product), and product of sums. Since matrix multiplication is composed of a sequence of sum of products operations, these two functions are discussed in the same section. In some cases, a whole class of operations is covered under one heading. For example, the vector add operation includes sums and differences of \(A_{i}, B_{i},\left|A_{i}\right|\), and \(\left|B_{i}\right|\) in all combinations.

\section*{Vector Add}

The vector add operation adds corresponding components of data vectors to obtain the components of the output vector. Hence, for input vectors \(A\) and \(B\) and output vector \(Y\), each with N components,
\[
Y_{i}=A_{i}+B_{i}, \quad 1 \leq i \leq N .
\]

The 'ACT8837 and 'ACT8847 perform this calculation in unchained, independent ALU mode.

Table 62 shows the contents of the data registers at successive clock cycles for \(N=6\) with the FPU operating in pipelined mode. Since the data travels by way of the internal pipeline register, two cycles pass before the first sum appears in the \(S\) register. The contents of the internal pipeline register are not given in the flow.

Table 62. Data Flow for Pipelined Single-Precision Vector Add, \(\mathbf{N}=\mathbf{6}\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline\(R A\) & \(A 1\) & \(A 2\) & \(A 3\) & \(A 4\) & \(A 5\) & \(A 6\) & & & \\
\hline RB & B 1 & B 2 & B 3 & B 4 & B 5 & B 6 & & & \\
\hline S & & & \(\mathrm{~A} 1+\mathrm{B} 1\) & \(\mathrm{~A} 2+\mathrm{B} 2\) & \(\mathrm{~A} 3+\mathrm{B} 3\) & \(\mathrm{~A} 4+\mathrm{B} 4\) & \(\mathrm{~A} 5+\mathrm{B} 5\) & \(\mathrm{~A} 6+\mathrm{B} 6\) & \\
\hline P & & & & & & & & & \\
\hline C & & & & & & & & & \\
\hline Y & & & Y 1 & Y 2 & Y 3 & Y 4 & Y 5 & Y 6 & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
\end{tabular}

Data transfers and operations for each clock cycle are summarized in the program listing in Table 63. Detailed information on the instruction inputs required to perform each operation is included in sections 5 and 7. Note that the selection of the output source (in this case, the \(S\) register), which is determined by the I6 instruction bit, is programmed along with the ALU or multiplier operation that generates the output.

Table 63. Program Listing for Pipelined Single-Precision Vector Add, \(\mathbf{N}=6\)
\begin{tabular}{|ccc|c|c|}
\hline \multicolumn{3}{|c|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{S}\) & ADD(RA,RB) & \\
2. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{S}\) & ADD(RA,RB) & \\
3. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{S}\) & ADD(RA,RB) & \\
. & & & & \\
. & & & & \\
6. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{S}\) & ADD(RA,RB) & \\
\hline
\end{tabular}

Timing and programming are similar for other independent ALU operations involving two operands, such as \((A-B)\), \((B-A)\), and compare \((A, B)\). However, when the compare function is used, two status bits must be generated before numeric values can be output (see "Compare Operations on Data Vectors").

Because the vector add program closely parallels that for vector multiplication, pipelined and unpiped modes for both vector add and multiply are compared in the next section.

\section*{Vector Multiply}

The vector multiply operation multiplies corresponding elements of data vectors to obtain the components of the output vector. Hence, for input vectors \(A\) and \(B\) and output vector \(Y\), each with \(N\) components,
\[
Y_{i}=A_{i} \times B_{i}, \quad 1 \leq i \leq N .
\]

The 'ACT8837 and 'ACT8847 perform this calculation in unchained, independent multiplier mode.

\section*{Pipelined Mode}

Table 64 shows the contents of the data registers at successive clock cycles for \(N=6\) with the FPU operating in pipelined mode. The product may be replaced by a variety of other independent multiplier operations, such as \(-(A \times B)\), \(A \times|B|\), \(-(A \times|B|)\), \(|A| \times|B|\), and \(-(|A| \times|B|)\). Data transfers and operations for each clock cycle are summarized in the program listing in Table 65.

Table 64. Data Flow for Pipelined Single-Precision Vector Multiply, \(\mathbf{N}=\mathbf{6}\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline RA & A 1 & A 2 & A 3 & A 4 & A 5 & A 6 & & & \\
\hline RB & B 1 & B 2 & B 3 & B 4 & B 5 & B 6 & & & \\
\hline S & & & & & & & & & \\
\hline P & & & \(\mathrm{A} 1 \times \mathrm{B} 1\) & \(\mathrm{~A} 2 \times \mathrm{B} 2\) & \(\mathrm{~A} 3 \times \mathrm{B} 3\) & \(\mathrm{~A} 4 \times \mathrm{B} 4\) & \(\mathrm{~A} 5 \times \mathrm{B} 5\) & \(\mathrm{~A} 6 \times \mathrm{B} 6\) & \\
\hline C & & & & & & & & & \\
\hline Y & & & Y 1 & Y 2 & Y 3 & Y 4 & Y 5 & Y 6 & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
\end{tabular}

Table 65. Program Listing for Pipelined Single-Precision Vector Multiply, N \(=\mathbf{6}\)
\begin{tabular}{|ccc|c|c|}
\hline \multicolumn{3}{|c|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{P}\) & & MULT(RA,RB) \\
2. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{P}\) & & MULT(RA,RB) \\
3. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{P}\) & & MULT(RA,RB) \\
. & & & & \\
. & & & & \\
6. & LOAD RA, RB; & \(\mathrm{Y} \leftarrow \mathrm{P}\) & & MULT(RA,RB) \\
\hline
\end{tabular}

\section*{Unpiped Mode}

Table 66 shows the contents of the data registers at successive clock cycles during a vector multiply operation for \(N=6\) with the FPU operating in unpiped mode. The vector add operation progresses similarly. Since there is no "single-clocked storage" in the internal pipeline register, each product or sum is performed in one cycle.

Table 66. Data Flow for Unpiped Single-Precision Vector Multiply, \(\mathbf{N}=\mathbf{6}\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline RA & A 1 & A 2 & A 3 & A 4 & A 5 & A 6 & & & \\
\hline RB & B 1 & B 2 & B 3 & B 4 & B 5 & B 6 & & & \\
\hline S & & & & & & & & & \\
\hline P & & \(\mathrm{A} 1 \times \mathrm{B} 1\) & \(\mathrm{~A} 2 \times \mathrm{B} 2\) & \(\mathrm{~A} 3 \times \mathrm{B} 3\) & \(\mathrm{~A} 4 \times \mathrm{B} 4\) & \(\mathrm{~A} 5 \times \mathrm{B} 5\) & \(\mathrm{~A} 6 \times \mathrm{B} 6\) & & \\
\hline C & & & & & & & & & \\
\hline Y & & Y 1 & Y 2 & Y 3 & Y 4 & Y 5 & Y 6 & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
\end{tabular}

\section*{Comparison of Pipelined and Unpiped Modes}

For both vector add and vector multiply operations carried out in pipelined mode, results are output to the \(Y\) bus on clocks \(3, \ldots, N+2\). In unpiped mode, results are output to the Y bus on clocks \(2, \ldots, N+1\), thereby saving a cycle. Unfortunately, it is necessary to operate at a lower clock rate in unpiped mode than in pipelined mode. The following equation can be used to determine which of the two modes provides the faster performance in a particular application. Pipelined operation is faster if:
\[
(N+2) / F_{p}<(N+1) / F_{u}
\]
where \(F_{p}\) and \(F_{u}\) are the clock rates in pipelined and unpiped modes, respectively. As of publication, pipelined mode provides faster performance for input vectors with \(\mathrm{N}>2\).
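The comparison can be sketched in a few lines of Python. The cycle counts \(N+2\) and \(N+1\) come from the text above; the clock rates used in the example (30 MHz pipelined, 23 MHz unpiped) are assumptions chosen only for illustration and are not data sheet values, and the function names are hypothetical.

```python
def vector_op_time(n, clock_hz, pipelined):
    """Seconds until the last of n results reaches the Y bus."""
    cycles = n + 2 if pipelined else n + 1
    return cycles / clock_hz


def faster_mode(n, f_p, f_u):
    """Return the mode that finishes a length-n vector add or multiply first."""
    t_p = vector_op_time(n, f_p, pipelined=True)
    t_u = vector_op_time(n, f_u, pipelined=False)
    return "pipelined" if t_p < t_u else "unpiped"


F_P = 30e6  # assumed pipelined clock rate, Hz
F_U = 23e6  # assumed unpiped clock rate, Hz
choices = {n: faster_mode(n, F_P, F_U) for n in (1, 2, 3, 6)}
```

With these assumed rates the crossover falls between \(N=2\) and \(N=3\); other clock rates move the crossover, so the inequality should be re-evaluated for the actual device speeds.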

\section*{Sum of Products}

The sum of products operation multiplies corresponding elements of data vectors and adds the resulting products. The operation is also referred to as the inner product, scalar product, or dot product of two vectors, since these are the names for the function as it is used in vector algebra. For input vectors \(A\) and \(B\), each with \(N\) components, the sum of products operation yields a single output \(Y\) defined as follows:
\[
Y=\sum_{i=1}^{N}\left(A_{i} \times B_{i}\right)
\]

The 'ACT8837 and 'ACT8847 perform this calculation in chained mode so that concurrent operation of the ALU and multiplier is possible.

\section*{Pipelined Mode}

Table 67 shows the contents of the data registers at successive clock cycles for \(N=8\) with the FPU operating in pipelined mode.

Table 67. Data Flow for Pipelined Single-Precision Sum of Products, \(\mathbf{N}=8\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & A1 & A2 & A3 & A4 & A5 & A6 & A7 & A8 & & & & & & \\
\hline RB & B1 & B2 & B3 & B4 & B5 & B6 & B7 & B8 & & & & & & \\
\hline S & & & & & S1 & & S3 & S4 & S5 & S6 & S7 & S8 & & \(\mathrm{S} 7+8\) \\
\hline P & & & P1 & P2 & P3 & P4 & P5 & P6 & P7 & P8 & & & & \\
\hline C & & & & & P2 & P2 & & & & & & S7 & & \\
\hline Y & & & & & & & & & & & & & & Y \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 \\
\hline
\end{tabular}

Here, \(P_{i}=A_{i} \times B_{i}\), \(S_{1}=P_{1}+0\), \(S_{3}=P_{3}+S_{1}\), \(S_{4}=P_{4}+P_{2}\), \(S_{5}=P_{5}+S_{3}\), \(S_{6}=P_{6}+S_{4}\), \(S_{7}=P_{7}+S_{5}\), and \(S_{8}=P_{8}+S_{6}\). The values of the sums could be expressed more succinctly as \(S_{i}=P_{i}+S_{i-2}\) (with \(S_{0}=S_{-1}=0\)), except that \(S_{2}=P_{2}+0=P_{2}\) does not actually appear in the data flow as a sum in the \(S\) register. Instead, the \(C\) register holds \(P_{2}\) for two cycles.

This approach, although it introduces a certain lack of symmetry into the programming, frees the \(S\) register at a point that allows succeeding sum of products operations to overlap efficiently, without any dead cycles. A new sum of products operation can begin at CLK 9, and the \(S\) register remains free to hold the first operation's result in CLK 14. Similarly, storing \(S_{7}\) in the \(C\) register in CLK 12, rather than multiplying it by one, leaves the \(P\) register free to hold "P2" for the next pair of data vectors. By CLK 12, \(S_{7}=P_{1}+P_{3}+P_{5}+P_{7}\) and \(S_{8}=P_{2}+P_{4}+P_{6}+P_{8}\), so that \(Y=S_{7}+S_{8}\).

Data transfers and operations for each clock cycle are summarized in the program listing in Table 68.

Table 68. Program Listing for Pipelined Single-Precision Sum of Products, \(\mathrm{N}=8\)
\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{3}{|r|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB & & & MULT(RA, RB) \\
\hline 2. & LOAD RA, RB & & & MULT(RA,RB) \\
\hline 3. & LOAD RA, RB & & ADD(P,0) & MULT(RA,RB) \\
\hline 4. & LOAD RA, RB; & \(C \leftarrow P\) & & MULT(RA,RB) \\
\hline 5. & LOAD RA, RB & & ADD(P,S) & MULT(RA,RB) \\
\hline 6. & LOAD RA, RB & & ADD(P,C) & MULT(RA, RB) \\
\hline 7. & LOAD RA, RB & & ADD(P,S) & MULT(RA,RB) \\
\hline 8. & LOAD RA, RB & & ADD(P,S) & MULT(RA,RB) \\
\hline 9. & & & ADD(P,S) & \\
\hline 10. & & & ADD(P,S) & \\
\hline 11. & & \(\mathrm{C} \leftarrow \mathrm{S}\) & & \\
\hline 12. & & \(Y \leftarrow S\) & \(\operatorname{ADD}(\mathrm{S}, \mathrm{C})\) & \\
\hline
\end{tabular}

The above algorithm imposes no delay between input vectors. The time required to carry out the sum of products operation on \(M\) pairs of input vectors in succession, each of length \(N\), is \(N \times M+6\) cycles.
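The arithmetic behind this data flow can be sketched as two interleaved running sums, one over the odd-numbered products and one over the even-numbered products, combined by a final add. The Python below is a behavioral model only: it does not simulate individual clock edges, and the function names are hypothetical.

```python
def pipelined_sum_of_products(a, b):
    """Model the two partial-sum chains of the pipelined data flow."""
    products = [x * y for x, y in zip(a, b)]  # P1 .. PN
    odd_sum = 0.0   # chain S1, S3, S5, S7 (starts from P1 + 0)
    even_sum = 0.0  # chain P2 (held in C), S4, S6, S8
    for i, p in enumerate(products, start=1):
        if i % 2:
            odd_sum += p
        else:
            even_sum += p
    # The final ADD(S, C) combines the two chains into the output Y.
    return odd_sum, even_sum, odd_sum + even_sum


def total_cycles(n, m):
    """Cycles for m back-to-back sums of products of length n (pipelined)."""
    return n * m + 6


s7, s8, y = pipelined_sum_of_products(
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], [1.0] * 8
)
```

Two chains are needed because each add takes two cycles to reach the \(S\) register, so a sum produced on one cycle can only be combined with the product that arrives two cycles later.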

\section*{Unpiped Mode}

In the unpiped version of the sum of products, the data flow is more straightforward. Again, chained mode is employed to allow the ALU and multiplier to operate concurrently. Table 69 shows the contents of the data registers at successive clock cycles for \(N=8\) with the FPU operating in unpiped mode. Here, \(P_{i}=A_{i} \times B_{i}\), and \(S_{i}=S_{i-1}+P_{i}\), with \(S_{0}=0\).

Table 69. Data Flow for Unpiped Single-Precision Sum of Products, \(\mathbf{N}=\mathbf{8}\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & A1 & A2 & A3 & A4 & A5 & A6 & A7 & A8 & & & & & & \\
\hline RB & B1 & B2 & B3 & B4 & B5 & B6 & B7 & B8 & & & & & & \\
\hline S & & & S1 & S2 & S3 & S4 & S5 & S6 & S7 & S8 & & & & \\
\hline P & & P1 & P2 & P3 & P4 & P5 & P6 & P7 & P8 & & & & & \\
\hline C & & & & & & & & & & & & & & \\
\hline Y & & & & & & & & & & Y & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 \\
\hline
\end{tabular}

A new problem can be presented at CLK 9 without any delay between the vectors. Therefore, the time required to compute the sums of products for \(M\) pairs of vectors, each of length \(N\), is \(N \times M+2\) clock cycles.

\section*{Comparison of Pipelined and Unpiped Modes}

The following equation can be used to determine which of the two modes provides the faster performance in a particular application. Pipelined operation is faster if:
\[
(M \times N+6) / F_{p}<(M \times N+2) / F_{u},
\]
where \(F_{p}\) and \(F_{u}\) are the clock rates in pipelined and unpiped modes, respectively. Because the unpiped mode's longer clock cycle usually outweighs its savings in cycles, pipelined mode provides faster performance for input vectors with \(N>4\).
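Rearranging the inequality gives an explicit break-even point. The short derivation below assumes \(F_{p}>F_{u}\) (if \(F_{p} \leq F_{u}\), pipelined mode is never faster) and introduces no quantities beyond those already defined:
\[
\frac{M \times N+6}{F_{p}}<\frac{M \times N+2}{F_{u}} \iff F_{u}(M \times N+6)<F_{p}(M \times N+2) \iff M \times N>\frac{6 F_{u}-2 F_{p}}{F_{p}-F_{u}} .
\]

The total number of products \(M \times N\), rather than \(N\) alone, determines the faster mode, so a long run of short vectors can still favor pipelined operation.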

\section*{Product of Sums}

The product of sums operation adds corresponding elements of data vectors and multiplies the resulting sums. For input vectors \(A\) and \(B\), each with \(N\) components, the product of sums operation yields a single output \(Y\) defined as follows:
\[
Y=\prod_{i=1}^{N}\left(A_{i}+B_{i}\right)
\]

The product of differences can be computed by simply making the ALU operation \((A-B)\) or \((B-A)\). The 'ACT8837 and 'ACT8847 perform this calculation in chained mode so that concurrent operation of the ALU and multiplier is possible. The data flow and program listing for the product of sums are identical to those for the sum of products, except that the roles of add and multiply are reversed. The criteria used to decide between pipelined and unpiped modes are also identical to those previously given.

\section*{Vector Divide}

The vector divide operation divides corresponding elements of data vectors to obtain the components of the output vector. Hence, for vectors \(A\) and \(B\) and output vector \(Y\), each with N components,
\[
Y_{i}=A_{i} / B_{i}, \quad 1 \leq i \leq N .
\]

The 'ACT8837 and 'ACT8847 perform this calculation using the Newton-Raphson iterative method. This algorithm, which is described in detail in the SN74ACT8800 Family Data Manual, calculates the value of a quotient \(Y\) by approximating the reciprocal of the divisor \(B\) and then multiplying the dividend \(A\) by that approximation.

The following sections review the vector divide programs for the 'ACT8837 and the 'ACT8847. In the 'ACT8847, the divide algorithm is built-in.

\section*{SN74ACT8837 Vector Divide}

For division using single-element inputs \(A\) and \(B\), the value of the reciprocal of \(B\), denoted by \(X\), is determined iteratively using the following equation:
\[
X_{i+1}=X_{i}\left(2-B \times X_{i}\right)
\]

The seed approximation, \(X_{0}\), is assumed to be given. The iteration stops when \(X\) is determined to the desired level of precision. Assuming a seed ROM that provides 4 bits of accuracy, three iterations are necessary to determine a single-precision result \(X\) correctly, because each iteration roughly doubles the number of accurate bits. Starting from the seed \(X_{0} \approx 1 / B\), the iteration \(X_{i+1}=X_{i}\left(2-B \times X_{i}\right)\) is applied three times, and \(A\) is finally multiplied by the value \(X_{3}\).

An 8-bit seed ROM is commonly employed and gives single-precision accuracy in only two iterations and double-precision accuracy in three iterations. Instructions for implementing an 8-bit seed ROM are included in the SN74ACT8800 Family Data Manual. This example assumes that a 4-bit seed is used to develop the program.
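The convergence behavior can be checked with a short Python sketch of the iteration. The divisor \(B=3\) and the seed \(X_{0}=0.3125\) (relative error \(2^{-4}\), standing in for a 4-bit seed ROM entry) are assumptions picked for illustration; the function name `reciprocal` is hypothetical, and the code runs in double precision rather than on the FPU.

```python
def reciprocal(b, seed, iterations):
    """Newton-Raphson reciprocal: repeat X <- X * (2 - B * X)."""
    x = seed
    for _ in range(iterations):
        x = x * (2.0 - b * x)
    return x


B = 3.0
X0 = 0.3125  # assumed 4-bit-accurate seed for 1/3

# Relative error |1 - B * X_i| after 0, 1, 2, and 3 iterations.
errors = [abs(1.0 - B * reciprocal(B, X0, n)) for n in range(4)]
```

The error squares on each pass (about \(2^{-4}\), \(2^{-8}\), \(2^{-16}\), \(2^{-32}\)), so only the third iterate is accurate beyond the 24-bit single-precision significand.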

\section*{Pipelined Mode}

The 'ACT8837 performs the vector divide in chained mode. Table 70 shows the data flow for pipelined operation. The value of \(\left(2-B \times X_{i}\right)\) is denoted as \(T_{i}\). Note that the value \(X_{3}\) does not appear, per se, in the table, but is expressed in terms of \(X_{2}\) to save unnecessary calculations. The output \(Y\) is determined from the calculation of \(\left(A \times X_{2}\right) \times T_{2}\) in cycle 17, which is equivalent to \(A \times X_{3}\), since \(X_{3}=X_{2} \times T_{2}\).

In order to keep \(X_{i}\) available for the final calculation of \(X_{i+1}\), a few programming "tricks" are employed to keep the original value of each \(X_{i}\) within the chip while it is being altered in the calculation of \(\left(2-B \times X_{i}\right)\). First, \(X_{i}\) is stored in the \(S\) register by adding 0 to it. Then, when the \(S\) register is needed, \(X_{i}\) is moved to the \(P\) register by multiplying it by 1 .

Table 70. Data Flow for 'ACT8837 Pipelined Single-Precision Vector Divide, \(\mathbf{N}=1\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & X0 & & & & & & \(B\) & & & \\
\hline RB & \(B\) & & & & & & & & & \\
\hline S & & & \(X 0\) & & T0 & & & & X 1 & \\
\hline P & & & \(B \times X 0\) & & \(X 0\) & & X 1 & & \(\mathrm{~B} \times \mathrm{X} 1\) & \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & & & \(B\) & & & & & & & \\
\hline RB & & & & & \(A\) & & & & & \\
\hline S & T 1 & & & & X 2 & & T 2 & & & \\
\hline P & X 1 & & X 2 & & \(\mathrm{~B} \times \mathrm{X} 2\) & & \(\mathrm{~A} \times \mathrm{X} 2\) & & Y & \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & & & & Y & \\
\hline CLK & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 \\
\hline
\end{tabular}

Data transfers and operations are summarized in the program listing in Table 71. Because no operations begin on even-numbered cycles, only the odd-numbered clock cycles are shown.

Table 71. Program Listing for 'ACT8837 Pipelined Single-Precision Vector Divide, \(\mathbf{N}=1\)
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|r|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB & ADD (RA, 0 ) & MULT(RA,RB) \\
\hline 3. & & \(\operatorname{ADD}(2,-\mathrm{P})\) & \(\operatorname{MULT}(\mathrm{S}, 1)\) \\
\hline 5. & & & MULT(S,P) \\
\hline 7. & LOAD RA & ADD (P,0) & MULT(RA,P) \\
\hline 9. & & \(\operatorname{ADD}(2,-P)\) & \(\operatorname{MULT}(\mathrm{S}, 1)\) \\
\hline 11. & & & MULT(S, P) \\
\hline 13. & LOAD RA & ADD(P,0) & MULT(RA,P) \\
\hline 15. & LOAD RB & \(\operatorname{ADD}(2,-\mathrm{P})\) & MULT(S,RB) \\
\hline 17. & \(Y \leftarrow P\) & & MULT(S,P) \\
\hline
\end{tabular}

In steps 1, 7, and 13, 0 is added to \(X_{i}\) so that \(X_{i}\) appears two cycles later in the \(S\) register. In steps 3 and 9, the \(X_{i}\) value in the \(S\) register is multiplied by 1 so that it appears in the \(P\) register two cycles later. In step 15, \(X_{i}\) (from the \(S\) register) is multiplied by the dividend \(A\) just input to RB.
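The arithmetic performed by the schedule in Table 71 can be checked with a plain software model. The following Python sketch is illustrative only: the function name and the seed value are hypothetical, double-precision floats stand in for the chip's single-precision registers, and the comments map each line to the listing on the assumption that the seed is accurate to roughly 4 bits.

```python
def vector_divide_element(a, b, x0):
    """Model the arithmetic of the Table 71 schedule for one quotient Y = A / B.

    x0 is a seed approximation to 1/B (hypothetical value; on the chip it
    would be supplied from an external seed ROM).
    """
    x = x0
    # Steps 1-5 and 7-11: two iterations of X(i+1) = X(i) * (2 - B * X(i)).
    for _ in range(2):
        p = b * x        # MULT(RA,RB) or MULT(RA,P): B x Xi
        t = 2.0 - p      # ADD(2,-P): Ti = 2 - B x Xi
        x = x * t        # MULT(S,P): X(i+1) = Xi x Ti
    # Steps 13-17: the last refinement is folded into the multiply by A.
    t2 = 2.0 - b * x     # ADD(2,-P): T2 = 2 - B x X2
    ax2 = a * x          # MULT(S,RB): A x X2
    return ax2 * t2      # MULT(S,P): Y = (A x X2) x T2


if __name__ == "__main__":
    # 5/16 is a coarse (about 4-bit) approximation to 1/3.
    print(vector_divide_element(7.0, 3.0, 0.3125), 7.0 / 3.0)
```

Each pass roughly squares the relative error of the reciprocal estimate, which is why a coarser seed needs more passes than a finer one.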

Because no operations begin on even cycles, two vector divide operations may be interleaved, calculating two quotients in 20 cycles. Table 72 shows the data flow for computing two quotients, \(Y_{1}\) and \(Y_{2}\), where \(Y_{1}=A / B\) and \(Y_{2}=C / D\). The approximation for \(1 / B\) is denoted by \(W_{i}\), and the approximation for \(1 / D\) is denoted by \(X_{i}\). \(T_{i}=\left(2-B \times W_{i}\right)\), and \(Q_{i}=\left(2-D \times X_{i}\right)\).

Table 72. Data Flow for 'ACT8837 Pipelined Single-Precision Interleaved Vector Divide, \(\mathbf{N}=2\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & W0 & X0 & & & & & \(B\) & \(D\) & & \\
\hline RB & \(B\) & \(D\) & & & & & & & & \\
\hline S & & & \(W 0\) & \(X 0\) & T0 & Q 0 & & & W 1 & X 1 \\
\hline P & & & \(B \times W 0\) & \(\mathrm{D} \times \mathrm{X0}\) & W0 & X0 & W 1 & X 1 & \(\mathrm{~B} \times \mathrm{W} 1\) & \(\mathrm{D} \times \mathrm{X} 1\) \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & & & B & D & & & & & & \\
\hline RB & & & & & A & C & & & & \\
\hline S & T 1 & Q 1 & & & W 2 & X 2 & T 2 & Q 2 & & \\
\hline P & W 1 & X 1 & W 2 & X 2 & \(\mathrm{~B} \times \mathrm{W} 2\) & \(\mathrm{D} \times \mathrm{X} 2\) & \(\mathrm{~A} \times \mathrm{W} 2\) & \(\mathrm{C} \times \mathrm{X} 2\) & Y 1 & Y 2 \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & & & & Y 1 & Y 2 \\
\hline CLK & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 \\
\hline
\end{tabular}

The program listing for an interleaved vector divide is similar to that for a single divide operation, with functions listed in each odd line and duplicated in the next even line for the second operation.

As previously stated, the time needed to compute two single-precision divide operations starting with a 4-bit seed ROM is 20 clock cycles. Since a new pair of divides can start at \(C L K=19\), the time required to perform the vector divide operation on two N -dimensional vectors is given by the following equation:
\[
\text { TIME }=[18 \times \text { CEILING(N/2) }+2] \text { cycles, }
\]
where the ceiling function rounds to the next highest integer for fractional values. With an 8-bit seed ROM, the time reduces to \([12 \times \operatorname{CEILING}(N / 2)+2]\) cycles, which equals 2.5 million divides per second at 15 MHz.

\section*{Unpiped Mode}

Table 73 shows the data flow for a vector divide in unpiped, chained mode.

Table 73. Data Flow for 'ACT8837 Unpiped Single-Precision Vector Divide, \(\mathbf{N}=1\)
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & X0 & & & \(B\) & & & \(B\) & & & \\
\hline RB & B & & & & & & & \(A\) & & \\
\hline S & & X0 & T0 & & X 1 & T 1 & & X 2 & T 2 & \\
\hline P & & \(\mathrm{B} \times \mathrm{X} 0\) & X0 & X 1 & \(\mathrm{~B} \times \mathrm{X} 1\) & X 1 & X 2 & \(\mathrm{~B} \times \mathrm{X} 2\) & \(\mathrm{~A} \times \mathrm{X} 2\) & Y \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & & & & & Y \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\end{tabular}

This program uses the same methods as the pipelined version to keep \(X_{i}\) within the chip. The time needed to compute a vector divide of two N -element vectors is \((9 \mathrm{~N}+1)\) cycles with a 4 -bit seed ROM and \((6 \mathrm{~N}+1)\) cycles with an 8 -bit seed ROM.

\section*{Comparison of Pipelined and Unpiped Modes}

Using a 4-bit seed ROM, pipelined mode is faster if:
\[
[18 \times \operatorname{CEILING}(N / 2)+2] / F_{p}<(9 N+1) / F_{u}
\]
where \(F_{p}\) and \(F_{u}\) are the clock rates in pipelined and unpiped modes. As of publication, pipelined mode provides faster performance for input vectors with \(N>1\).
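The cycle-count formulas for the 'ACT8837 vector divide can be evaluated directly. The sketch below is a minimal Python illustration; the function names and the clock rates passed in the example are hypothetical and are not taken from the device specifications.

```python
import math


def pipelined_divide_cycles(n, cycles_per_pair=18):
    """Pipelined, interleaved divide: 18 cycles per pair with a 4-bit seed
    ROM (pass 12 for an 8-bit seed ROM), plus 2 cycles to drain."""
    return cycles_per_pair * math.ceil(n / 2) + 2


def unpiped_divide_cycles(n, cycles_per_divide=9):
    """Unpiped divide: 9 cycles per element with a 4-bit seed ROM
    (pass 6 for an 8-bit seed ROM), plus 1 cycle."""
    return cycles_per_divide * n + 1


def pipelined_is_faster(n, f_pipelined_hz, f_unpiped_hz):
    """Evaluate the inequality above for a 4-bit seed ROM."""
    t_pipelined = pipelined_divide_cycles(n) / f_pipelined_hz
    t_unpiped = unpiped_divide_cycles(n) / f_unpiped_hz
    return t_pipelined < t_unpiped


if __name__ == "__main__":
    # Assumed clock rates, chosen only to exercise the comparison.
    for n in (1, 2, 8, 100):
        print(n, pipelined_divide_cycles(n), unpiped_divide_cycles(n),
              pipelined_is_faster(n, 15e6, 10e6))
```

The outcome depends entirely on the ratio of the two clock rates, so the rates should be taken from the current timing specifications before drawing conclusions.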

\section*{A General Principle}

The vector divide example illustrates a general programming principle that should be considered whenever a program begins a new instruction every other cycle. In cases where the C register is not used, it is simple to interleave another program, even one not performing the same function.

Interleaving programs is not as easy if the \(C\) register is used because the \(C\) register is the only nonpiped register. However, even using the \(C\) register, programs may often be interleaved by staggering one against the other so that their use of the \(C\) register does not overlap in time. Many of the programs so far discussed can be thought of as two such interleaved programs, with the \(C\) register being used to delay the first result until it can be combined with the second. (See, for example, the sum of products operation.)

\section*{SN74ACT8847 Vector Divide}

Since the 'ACT8847 has a built-in algorithm for divide, the microprogram is simpler than that for the 'ACT8837. Table 74 shows the data flow for pipelined operation. Data transfers and operations are summarized in the program listing in Table 75.

Table 74. Data Flow for 'ACT8847 Pipelined Single-Precision Vector Divide
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & A1 & & & & & & A2 & & & \\
\hline RB & B1 & & & & & & B2 & & & \\
\hline S & & & & & & & & & & \\
\hline P & & & & & & & & \(A 1 / B 1\) & & \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & & & Y 1 & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\end{tabular}

Table 75. Program Listing for 'ACT8847 Pipelined Single-Precision Vector Divide


Note that the microinstructions are presented on the steps indicated ( \(1,7,13, \ldots\) ), with a six-cycle lapse before the next operands can be input to RA and RB. Performing a vector divide of two \(N\)-element single-precision vectors takes \((6 N+2)\) cycles in pipelined mode. \(M\) such pairs of vectors would require \([6(N \times M)+2]\) cycles in pipelined mode. In unpiped mode, the equation is \(7(N \times M)\).

\section*{Compare Operations on Data Vectors}

In independent ALU mode (unchained), two operands may be compared for equality \((A=B)\) and order \((A>B)\). Additionally, the absolute values of either or both operands may be compared. The compare function uses two status bits, the AGTB and AEQB output signals. (When any operation other than a compare is performed, either by the ALU or the multiplier, the AEQB signal is used as a zero detect. Hence, numerical results cannot be output in the same cycle in which comparison status is output.)

For greatest efficiency, programs for compare operations should be written without requiring conditional branches in the sequencer. If branches can be avoided, the microcoding is simplified and the programs are immediately scalable to SIMD systems employing many 'ACT8837 or 'ACT8847 chips.

This section covers vector max/min and list max/min operations.

\section*{Vector MAX/MIN}

The vector max/min operations compare corresponding elements of data vectors and select the maximum or minimum value to obtain the components of the output vector. Hence, for input vectors \(A\) and \(B\) and output vector \(Y\), each with \(N\) components,
\[
Y_{i}=\operatorname{MAX} / \operatorname{MIN}\left(A_{i}, B_{i}\right), \quad 1 \leq i \leq N .
\]

\section*{Pipelined Mode}

Table 76 shows the suggested data flow for a pipelined vector MAX operation, where \(\mathrm{Y}_{\mathrm{i}}\) is set to the max of \(\left(A_{i}, B_{i}\right)\) for all \(i\). Included are rows to indicate the setting of the chain mode instruction bit (I9 for the 'ACT8837, I10 for the 'ACT8847) and the status bit being sensed.

Table 76. Data Flow for Pipelined Single-Precision Vector MAX
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & N & Y & Y & Y & N & Y & Y & Y & N & Y \\
\hline RA & A 1 & A 1 & & B 1 & A 2 & A 2 & & B 2 & A 3 & A 3 \\
\hline RB & B 1 & & & & B 2 & & & & B 3 & \\
\hline S & & & & A 1 & & B 1 & & A 2 & & B 2 \\
\hline P & & & & & & A 1 & & & & A 2 \\
\hline C & & & & & & & & & & \\
\hline Y & & & & & & Y 1 & & & & Y 2 \\
\hline STATUS & & & \(\mathrm{A}>\mathrm{B}\) & & & & \(\mathrm{A}>\mathrm{B}\) & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\end{tabular}

A comparison starts at CLK \(=1,5\), etc., when the chain-mode instruction bit is low. The result appears at CLK \(=3,7\), etc., indicated by the AGTB and AEQB signals. AGTB is saved off-chip for use as instruction bit I6 (output source) at CLK 4, 8, etc. This value for I6 selects the output source, either the multiplier or the ALU result, at CLK 6, 10, etc. For example, if a comparison result is \(A>B\), the AGTB signal goes high and is used to set I6 high. I6 then selects the multiplier result \(\left(A_{i}\right)\) to output. Similarly, if \(A \leq B\), AGTB and I6 are low, and the ALU result \(\left(B_{i}\right)\) is output. The circuitous route taken by \(A_{i}\) on the way to the \(P\) register is necessary because it is not possible to pass RA or RB through the multiplier in parallel with passing the other through the ALU.
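The branch-free selection described above can be modeled in software. The following Python sketch is an illustration under stated assumptions: the function and variable names are hypothetical, and the per-cycle timing of Table 76 is not modeled, only the data paths and the use of the saved AGTB status as the output-source bit.

```python
def vector_max(a, b):
    """Element-wise MAX of two equal-length lists, selected without a
    sequencer branch: the compare status picks one of two ready results."""
    y = []
    for a_i, b_i in zip(a, b):
        agtb = a_i > b_i      # COMPARE(RA,RB): AGTB status output
        s = a_i + 0.0         # ADD(RA,0): A_i moves into the S register
        p = s * 1.0           # MULT(S,1): A_i moves on to the P register
        s = b_i + 0.0         # ADD(RA,0): B_i now occupies the S register
        i6 = agtb             # status saved off-chip, reused as bit I6
        y.append(p if i6 else s)   # I6 high: multiplier result, else ALU
    return y


if __name__ == "__main__":
    print(vector_max([1.0, 5.0, 3.0], [2.0, 4.0, 3.0]))
```

Because the same instruction sequence runs regardless of the comparison outcome, the model corresponds to the case in which several chips share one sequencer.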

The program is not particularly well-packed and produces the vector max of a pair of vectors of length \(N\) in \((4 N+2)\) cycles. For \(M\) pairs of vectors of length \(N\), the total time is \((4 M N+2)\) cycles. The program can be improved by applying the interleaving principle previously discussed. The steps are rearranged so that a new operation begins every other cycle, thus allowing two compare programs to be interleaved. Table 77 shows the suggested data flow for a pipelined vector min/max operation, where \(Y_{i}=\operatorname{MAX} / \operatorname{MIN}\left(A_{i}, B_{i}\right)\) and \(Z_{i}=\operatorname{MAX} / \operatorname{MIN}\left(C_{i}, D_{i}\right)\).

Table 77. Data Flow for Pipelined Single-Precision Interleaved Vector MAX/MIN
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & N & N & Y & Y & Y & Y & N & N & Y & Y & Y & Y & N & N \\
\hline RA & A 1 & C 1 & A 1 & C 1 & B 1 & D 1 & A 2 & C 2 & A 2 & C 2 & B 2 & D 2 & & \\
\hline RB & B 1 & D 1 & & & & & B 2 & D 2 & & & & & & \\
\hline S & & & & & A 1 & C 1 & B 1 & D 1 & & & A 2 & C 2 & B 2 & D 2 \\
\hline P & & & & & & & A 1 & C 1 & & & & & A 2 & C 2 \\
\hline C & & & & & & & & & & & & & & \\
\hline Y & & & & & & & Y 1 & Z 1 & & & & & Y 2 & Z 2 \\
\hline STATUS & & & \(\mathrm{A}>\mathrm{B}\) & \(\mathrm{A}>\mathrm{B}\) & & & & & \(\mathrm{A}>\mathrm{B}\) & \(\mathrm{A}>\mathrm{B}\) & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 \\
\hline
\end{tabular}

Again, \(A_{i}\) (and \(C_{i}\)) reaches the \(P\) register by an indirect route. However, this tighter program performs \(M\) vector comparisons, two vector comparisons at a time, in \([6 \times N \times \operatorname{CEILING}(M / 2)+2]\) cycles. (As previously defined, the ceiling function rounds to the next highest integer for fractional values.) In this example, two separate vector comparisons on two-dimensional vectors are performed, giving \(6 \times 2 \times 1+2=14\) cycles. For \(M=2\) pairs of vectors, all of length \(N\), the second program is as good as the first. For \(M>2\), the interleaved program performs increasingly better as \(M\) gets larger.

This second program requires more off-chip logic, since the status outputs at CLK 3 and 4 must be saved separately off-chip for use at CLK 5 and 6, respectively. This problem can easily be avoided by starting the calculations on the second pair of vectors two cycles later than shown (i.e., at CLK 4). The time necessary to perform the vector MAX operation on M pairs of N -dimensional vectors, two pairs concurrently, then increases to \([6 \times N \times \operatorname{CEILING}(M / 2)+4]\) cycles.
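The three pipelined cycle counts quoted for vector MAX/MIN can be tabulated side by side. The Python sketch below simply evaluates the formulas given in the text; the function names are hypothetical.

```python
import math


def simple_max_cycles(n, m=1):
    """Non-interleaved pipelined program: (4MN + 2) cycles."""
    return 4 * m * n + 2


def interleaved_max_cycles(n, m, staggered=False):
    """Interleaved program: 6 x N x CEILING(M/2) + 2 cycles, or + 4 when
    the second pair is started two cycles later to share off-chip logic."""
    return 6 * n * math.ceil(m / 2) + (4 if staggered else 2)


if __name__ == "__main__":
    n = 2
    for m in range(1, 7):
        print(m, simple_max_cycles(n, m),
              interleaved_max_cycles(n, m),
              interleaved_max_cycles(n, m, staggered=True))
```

Running the loop for the vector lengths of interest shows where the interleaved schedule begins to pay for its extra off-chip logic.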

Data transfers and operations for the odd lines only are summarized in the program listing in Table 78. The complete program is obtained by repeating the equivalent of each odd-numbered line in the next even line for the second pair of vectors.

Table 78. Program Listing for Pipelined Single-Precision Interleaved Vector MAX/MIN
\begin{tabular}{|c|c|c|c|}
\hline \multicolumn{2}{|c|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB & COMPARE(RA,RB) & \\
\hline 3. & LOAD RA & ADD(RA,0) & \\
\hline 5. & LOAD RA; \(\mathrm{Y} \leftarrow \mathrm{P} / \mathrm{S}\) & ADD(RA,0) & MULT(S,1) \\
\hline
\end{tabular}

\section*{Unpiped Mode}

Table 79 shows the data flow for an unpiped vector MAX operation.

Table 79. Data Flow for Unpiped Single-Precision Vector MAX
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & N & Y & Y & N & Y & Y & N & Y & Y \\
\hline RA & A 1 & A 1 & B 1 & A 2 & A 2 & B 2 & A 3 & A 3 & B 3 \\
\hline RB & B 1 & & & B 2 & & & B 3 & & \\
\hline S & & & A 1 & B 1 & & A 2 & B 2 & & A 3 \\
\hline P & & & & A 1 & & & A 2 & & \\
\hline C & & & & & & & & & \\
\hline Y & & & & Y 1 & & & Y 2 & & \\
\hline STATUS & & \(\mathrm{A}>\mathrm{B}\) & & & \(\mathrm{A}>\mathrm{B}\) & & & \(\mathrm{A}>\mathrm{B}\) & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
\end{tabular}

The status bit is saved off-chip at CLK \(=2,5\), etc., and used at CLK \(=3,6\), etc., as the I6 bit of the instruction. I6 selects either the multiplier or ALU result to output to the \(Y\) bus at CLK \(=4,7\), etc.

The program computes the vector comparison of \(M\) pairs of vectors of length \(N\) in \((3 \times M \times N+1)\) cycles.

\section*{Comparison of Pipelined and Unpiped Operation}

Pipelined operation is faster if:
\[
[6 \times N \times \operatorname{CEILING}(M / 2)+2] / F_{p}<(3 \times M \times N+1) / F_{u}
\]
where \(F_{p}\) and \(F_{u}\) are the clock rates in pipelined and unpiped modes, respectively. As of publication, pipelined mode provides faster performance for \(M>1\).

\section*{List MAX/MIN}

The list max/min operations select the maximum or minimum value, \(Z\), of a list of \(N\) elements. Hence, for input vector A with N components and output Z ,
\[
Z=\operatorname{MAX} / \operatorname{MIN}\left(A_{i}\right), \quad \quad 1 \leq i \leq N .
\]

List min/max is an essential operation in computer graphics because it is used to find the "extents" of a polygon or polyhedron. The extents are the maximum values of \(X\), \(Y\), and \(Z\) among the list of vertices for the object in question. Many forms of comparison are possible since the absolute value of either or both ALU operands may be employed. However, the example in this section assumes that the largest element of a list of \(N\) elements is desired.

\section*{Pipelined Mode}

Table 80 shows the data flow for a pipelined list MAX operation, where
\[
M_{1}=\operatorname{MAX}\left(A_{1}, A_{2}\right) ; \quad M_{i}=\operatorname{MAX}\left[M_{(i-1)}, A_{(i+1)}\right], \quad 2 \leq i \leq N-1 .
\]

Table 80. Data Flow for Pipelined Single-Precision List MAX
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & Y & N & Y & Y & Y & N & Y & Y & Y & N & Y & Y & Y & Y & Y & Y \\
\hline RA & A 1 & A 1 & A 2 & & & A 3 & A 3 & & & A 4 & A 4 & & & & & \\
\hline RB & & A 2 & & & & & & & & & & & & & & \\
\hline S & & & A 1 & & A 2 & & & & M 1 & & & & M 2 & & & M 3 \\
\hline P & & & & & A 1 & & & & A 3 & & & & A 4 & & & \\
\hline C & & & & & & M 1 & M 1 & & & M 2 & M 2 & & & M 3 & & \\
\hline Y & & & & & & & & & & & & & & & & M3 \\
\hline STATUS & & & & \(\mathrm{A}>\mathrm{B}\) & & & & \(\mathrm{A}>\mathrm{B}\) & & & & \(\mathrm{A}>\mathrm{B}\) & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 \\
\hline
\end{tabular}

As with vector comparison, the max/min of the absolute values is available, since the chip operates in independent ALU mode on the comparison steps. The comparison is between the RA register and the RB register in step 2 and between RA and \(C\) in steps 6, 10, etc. In these steps, the chip is switched into unchained, independent ALU mode. The status is saved off-chip and used to set the SRCC signal, which selects whether the \(P\) or \(S\) data goes into the \(C\) register in steps 5, 9, etc.

When the list max is in the \(C\) register, at \(C L K=4 N-2\), the \(C\) register contents must then be passed through one of the functional units to the output. The MAX/MIN of an \(N\)-element list therefore takes \(4 N\) cycles. \(M\) such vectors can be processed in \([M(4 N-1)+1]\) cycles.
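The running-maximum recurrence and the cycle counts for the pipelined list MAX can be modeled as follows. This Python sketch is illustrative only: the names are hypothetical, a plain variable stands in for the \(C\) register, and the cycle-by-cycle timing of Table 80 is not reproduced.

```python
def list_max(values):
    """Running maximum held in a model C register.

    The first comparison is RA against RB; every later comparison is the
    newly loaded RA value against C, and the saved status decides which
    candidate is written back to C.
    """
    if len(values) < 2:
        raise ValueError("the recurrence needs at least two elements")
    ra, rb = values[0], values[1]
    c = ra if ra > rb else rb            # M1 = MAX(A1, A2)
    for a_next in values[2:]:
        agtb = a_next > c                # COMPARE(RA,C)
        c = a_next if agtb else c        # SRCC selects the new C contents
    return c


def list_max_cycles(n, m=1):
    """Pipelined time for M lists of N elements: M(4N - 1) + 1 cycles."""
    return m * (4 * n - 1) + 1


if __name__ == "__main__":
    print(list_max([3.0, 9.0, 2.0, 7.0]), list_max_cycles(4))
```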

Data transfers and operations for the list max operation are summarized in the program listing in Table 81. The program is carried out in pipelined mode, alternating between unchained and chained modes. The list max reaches the output in cycle 4 N .

Table 81. Program Listing for Pipelined Single-Precision List MAX


\section*{Comparison of Pipelined and Unpiped Modes}

The equivalent unpiped program takes \([M(3 N-1)+1]\) cycles. Pipelined mode is faster if:
\[
[M(4 N-1)+1] / F_{p}<[M(3 N-1)+1] / F_{u},
\]
where \(F_{p}\) and \(F_{u}\) are the clock rates in pipelined and unpiped modes, respectively. As of publication, pipelined mode provides faster performance for all \(M\) and \(N\).
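As a worked illustration of this inequality (a derivation from the two cycle counts only, not a statement about actual device clock rates), the condition can be rearranged into a bound on the ratio of clock rates:
\[
\frac{F_{p}}{F_{u}}>\frac{M(4 N-1)+1}{M(3 N-1)+1} .
\]
For \(M=1\) the right-hand side is \(4 N / 3 N=4 / 3\) for every \(N\). For fixed \(N\) it increases with \(M\) toward \((4 N-1) /(3 N-1)\), which is largest at the smallest list length; for \(N=2\) this limit is \(7 / 5\). Hence, for lists of two or more elements, pipelined mode cannot be faster unless \(F_{p} / F_{u}>4 / 3\), and it is faster for every \(M\) and \(N \geq 2\) once \(F_{p} / F_{u} \geq 7 / 5\).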

\section*{Graphics Applications}

This section summarizes the concepts related to creating a three-dimensional image and examines a few of the matrix operations used in three-dimensional graphics processing. These operations include coordinate transformations and clipping operations. Additionally, this section illustrates some of the programming techniques used to perform these operations.

\section*{Creating a 3-D Image}

Conceptually, translating 3-D images to 2-D display screens involves defining a view volume that limits the scope of the vista the viewer can see at one time. For simplicity, a standardized frame of reference, in which the viewer's eye is located at the origin of the coordinate system, is adopted in this example.

As illustrated in Figures 76a and 76b, the arbitrary world coordinates of the objects under scrutiny are transformed into normalized "viewing" or "eye" coordinates that reflect this frame of reference. Once the normalizing transformation is complete, the images within the view volume are projected onto a 2-D view plane, which is assumed to be located, like a projection screen, at a suitable relative distance from the viewer (see Figures 76c and 77).

A basic model for creating a 3-D view, illustrated in Figure 78a, transforms arbitrary world coordinates to normalized viewing coordinates and then "clips" the image to remove lines that do not fall within the normalized view volume. Clipping is followed by projecting the image to the 2-D projection plane (or "window"). The image is then mapped onto a canonical 2-D viewport display and from there onto the physical device.

To incorporate image transformations, another model must be adapted (see Figure 78b). After clipping, instead of projecting to the view plane, a perspective transformation is performed on the clipped viewing coordinates, transforming the view volume into a 3-D viewport, the "screen system" in which image transforms are performed. Then the image is projected to the 2-D viewport display and onto the physical device.

In both models, the clipping operation is performed on coordinates in the viewing system. This approach is referred to as "clipping in the eye system." In practice, clipping is often performed after transformation to the screen system. A trivial accept/reject test is performed on viewing coordinates, the image is transformed to the screen system, and then clipping is performed.


Figure 76a. In a sequence of transformations, the world coordinate positions for the house are transformed into the normalized viewing coordinate system (also called the eye system). For clarity, the house is pictured outside the view volume. Also shown are the direction vectors VUP (view up), VPN (view normal), and VUP' (the projection of VUP parallel to VPN onto the view plane).


Figure 76b. After a series of translations, rotations, and shearing and scaling operations, the view volume becomes the canonical perspective projection view volume, which is a truncated pyramid with apex at the origin, and the house has been transformed from the world to the viewing coordinate system.


Figure 76c. This figure illustrates the projection of the house from the perspective of the viewer, with eye located at the origin of the coordinate system.
J. D. Foley and A. Van Dam, Fundamentals of Interactive Computer Graphics, Addison-Wesley Publishing Company, Reading, MA, 1982, 291-293. Reprinted with permission.

The following sections illustrate programming techniques used in both of these approaches to normalizing, clipping, and transforming a 3-D image. The operations are grouped as "3-D Coordinate Transforms," "Clipping in the Eye System," and "Clipping in the Screen System."


Figure 77. View Volume
Adapted with permission from a paper by Stephen R. Black entitled "Digital Processing of 3-D Data to Generate Interactive Real-Time Dynamic Pictures" from Volume 120 of the 1977 SPIE journal "Three Dimensional Imaging."


Figure 78a. Model of Procedure for Creating a 3-D Graphic


Figure 78b. Model for Creating and Transforming a 3-D Image

\section*{Three-Dimensional Coordinate Transforms}

One of the computationally intensive functions of a 3-D computer graphics system is that of transforming points within the object space, such as translating an object or rotating an object about an arbitrary axis. Equally complex is the transformation of points within the object space (or "world coordinate system") into points defined by a particular perspective and located within the viewing space (or "eye coordinate system"). This latter process, known as the viewing transformation, generates points in a left-handed Cartesian system with the eye at the origin and the \(z\)-axis pointing in the direction of view. The arbitrary world-system view volume and the objects therein are translated, rotated, sheared, and scaled to match the predefined, canonical view volume of the eye system.

For a "realistic" image, the canonical view volume will be a truncated pyramid that mimics the cone of vision available to the human eye. Alternatively, the volume can be a unit cube. The series of operations that make up each transformation differ, but if homogeneous coordinates are used, either transformation can be expressed as a simple matrix multiply.

For each point \((X, Y, Z)\) in the world system, a projection in homogeneous coordinates is denoted by \(\left(X_{h}, Y_{h}, Z_{h}, W_{h}\right)\), where
\[
\left(X_{h}, Y_{h}, Z_{h}, W_{h}\right)=\left(X \times W_{h}, Y \times W_{h}, Z \times W_{h}, W_{h}\right)
\]
and \(W_{h}\) is simply a scale factor, typically unity when floating point numbers are used. (With fixed point values, nonunity values of \(W_{h}\) are used to maximize use of the numeric range.) To transform a point in homogeneous coordinates, it is post-multiplied by a \(4 \times 4\) transform matrix:
\[
\left[X_{h}^{\prime}, Y_{h}^{\prime}, Z_{h}^{\prime}, W_{h}^{\prime}\right]=\left[X_{h}, Y_{h}, Z_{h}, W_{h}\right] \times\left[\begin{array}{llll}
\text { A11 } & \text { A12 } & \text { A13 } & \text { A14 } \\
\text { A21 } & \text { A22 } & \text { A23 } & \text { A24 } \\
\text { A31 } & \text { A32 } & \text { A33 } & \text { A34 } \\
\text { A41 } & \text { A42 } & \text { A43 } & \text { A44 }
\end{array}\right]
\]

The transformed point can later be converted back to 3-space by dividing by \(W_{h}^{\prime}\):
\[
\left(X^{\prime}, Y^{\prime}, Z^{\prime}\right)=\left(X_{h}^{\prime} / W_{h}^{\prime}, Y_{h}^{\prime} / W_{h}^{\prime}, Z_{h}^{\prime} / W_{h}^{\prime}\right)
\]

The transform matrix is constructed by multiplying together a sequence of matrices, each of which performs a simple task. The product of 4 or 5 elementary matrices may be used to perform some complex overall operation on a set of points representing an object or an entire scene. Once constructed, the transform matrix is used on each point of the object to be transformed.
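The row-vector convention and the composition of elementary matrices can be sketched in a few lines of Python. The helper names (`mat_mul`, `translation`, `scaling`, `transform_point`) are hypothetical and chosen for illustration; translation terms occupy the fourth row because the point is post-multiplied by the matrix.

```python
def mat_mul(a, b):
    """Product of two 4 x 4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    """Elementary translation matrix for row vectors."""
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [tx, ty, tz, 1.0]]


def scaling(sx, sy, sz):
    """Elementary scaling matrix."""
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def transform_point(point, m, w=1.0):
    """Form [X*W, Y*W, Z*W, W], post-multiply by m, then divide by W'."""
    x, y, z = point
    row = [x * w, y * w, z * w, w]
    xh, yh, zh, wh = (sum(row[k] * m[k][j] for k in range(4))
                      for j in range(4))
    return (xh / wh, yh / wh, zh / wh)


if __name__ == "__main__":
    # Translate first, then scale: the matrices multiply in that order.
    m = mat_mul(translation(1.0, 1.0, 1.0), scaling(2.0, 2.0, 2.0))
    print(transform_point((1.0, 2.0, 3.0), m))
```

Building the combined matrix once and reusing it for every point is what keeps the per-point cost at a single vector-matrix product.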

This section describes two approaches to the viewing transformation: the general case and the specific yet typical case in which a reduced version of the transform matrix may be used. Performance times are given for \(15-\mathrm{MHz}\) and \(30-\mathrm{MHz}\) frequencies, which roughly correspond to the operating speeds of the '8837 and '8847, respectively.

\section*{Operation with General Transform Matrix}

Table 82 shows part of the data flow for the pipelined and chained program for the product of the homogeneous point \([\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{W}]\) and the \(4 \times 4\) transform matrix A.

Table 82. Partial Data Flow for Product of \([\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{W}]\) and General Transform Matrix
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & X & Y & Z & W & X & Y & Z & W & X & Y \\
\hline RB & A 11 & A 21 & A 31 & A 41 & A 12 & A 22 & A 32 & A 42 & A 13 & A 23 \\
\hline S & & & & & \(\mathrm{~S} 1(1)\) & & \(\mathrm{S} 3(1)\) & \(\mathrm{S} 4(1)\) & \(\mathrm{S} 1(2)\) & T 1 \\
\hline P & & & \(\mathrm{P} 1(1)\) & \(\mathrm{P} 2(1)\) & \(\mathrm{P} 3(1)\) & \(\mathrm{P} 4(1)\) & \(\mathrm{P} 1(2)\) & \(\mathrm{P} 2(2)\) & \(\mathrm{P} 3(2)\) & \(\mathrm{P} 4(2)\) \\
\hline C & & & & & \(\mathrm{P} 2(1)\) & \(\mathrm{P} 2(1)\) & & \(\mathrm{S} 3(1)\) & \(\mathrm{P} 2(2)\) & \(\mathrm{P} 2(2)\) \\
\hline Y & & & & & & & & & & X \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\end{tabular}

The technique is that already illustrated for the sum of products operation. The numbers in parentheses indicate which column of the transform matrix is involved in the operation. Here, \(P 1_{(i)}=X \times A_{1 i}\), \(P 2_{(i)}=Y \times A_{2 i}\), etc.; \(S 1_{(i)}=P 1_{(i)}+0\), \(S 3_{(i)}=S 1_{(i)}+P 3_{(i)}\), \(S 4_{(i)}=P 2_{(i)}+P 4_{(i)}\), and \(T_{i}=S 3_{(i)}+S 4_{(i)}\), so that \(T 1=X^{\prime}\), \(T 2=Y^{\prime}\), \(T 3=Z^{\prime}\), and \(T 4=W^{\prime}\). As in the sum of products illustration, in order to make the most efficient use of the \(S\) register, P2 is used directly instead of adding 0 to it to form S2.

The time to transform \(N\) points in a system is \((16 N+6)\) cycles. The system can transform approximately 0.94 million points per second at a clock rate of 15 MHz and 1.875 million points per second at a clock rate of 30 MHz.
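The summation order used for each column and the resulting throughput can be modeled in software. The Python sketch below is illustrative: the names are hypothetical, only the order of the additions follows the definitions of \(S 1\), \(S 3\), \(S 4\), and \(T_{i}\), and the steady-state rate ignores the fixed 6-cycle overhead.

```python
def transform_general(point_h, a):
    """[X, Y, Z, W] times a 4 x 4 matrix, summed in the order used by
    the data flow: S1 = P1 + 0, S3 = S1 + P3, S4 = P2 + P4, T = S3 + S4."""
    x, y, z, w = point_h
    out = []
    for i in range(4):                      # one pass per matrix column
        p1, p2 = x * a[0][i], y * a[1][i]
        p3, p4 = z * a[2][i], w * a[3][i]
        s1 = p1 + 0.0
        s3 = s1 + p3
        s4 = p2 + p4                        # P2 is used directly
        out.append(s3 + s4)
    return out


def general_transform_cycles(n):
    """Cycles to transform N points with the general matrix."""
    return 16 * n + 6


def steady_state_points_per_second(clock_hz, cycles_per_point=16):
    """Asymptotic rate, neglecting the fixed overhead."""
    return clock_hz / cycles_per_point


if __name__ == "__main__":
    identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    print(transform_general([1.0, 2.0, 3.0, 1.0], identity))
    print(steady_state_points_per_second(15e6),
          steady_state_points_per_second(30e6))
```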

\section*{Operation with the Reduced Transform Matrix and \(W_{h}=1\)}

Because viewing transformations are frequently carried out using a single-vanishing-point perspective, the \(3 \times 1\) column that performs perspective transformations with multiple vanishing points is often not used. Additionally, with \(W_{h}=1\), the \(1 \times 1\) scale factor is often equal to one. In these cases, the transform matrix takes the following form:
\[
\left[\begin{array}{cc}
\cdots & 0 \\
\cdots & 0 \\
\cdots & 0 \\
\cdots & 1
\end{array}\right]
\]

With multiple vanishing points, and in other graphics operations such as clipping, \(4 \times 4\) matrices are used with nonzero values in the fourth column. The transform matrix is termed "reduced" when its fourth column is the same as that previously shown. In such cases, the transform of each point requires only 9 multiplications and 9 additions.

Table 83 shows part of the data flow for the reduced matrix program.

Table 83. Partial Data Flow for Product of \([\mathrm{X}, \mathrm{Y}, \mathrm{Z}, \mathrm{W}]\) and Reduced Transform Matrix
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline RA & \(X\) & \(Y\) & \(Z\) & \(X\) & \(X\) & \(Y\) & \(Z\) & \(\times\) & \(X\) \\
\hline RB & A11 & A21 & A31 & A41 & A12 & A22 & A32 & A42 & A13 \\
\hline S & & & & & & \(\mathrm{S} 1(1)\) & \(\mathrm{S} 2(1)\) & & T 1 \\
\hline P & & & \(\mathrm{P} 1(1)\) & \(\mathrm{P} 2(1)\) & \(\mathrm{P} 3(1)\) & & \(\mathrm{P} 1(2)\) & \(\mathrm{P} 2(2)\) & \(\mathrm{P} 3(2)\) \\
\hline C & & & & \(\mathrm{P} 1(1)\) & \(\mathrm{P} 2(1)\) & & \(\mathrm{S} 1(1)\) & \(\mathrm{P} 1(2)\) & \(\mathrm{P} 2(2)\) \\
\hline Y & & & & & & & & & \(\mathrm{X}^{\prime}\) \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
\end{tabular}

Again, the numbers in parentheses refer to the column of the transform matrix involved in the operation. In this case, however, only the first three columns are used. Hence, for \(1 \leq i \leq 3, P 1_{(i)}=X \times A_{1 i}, P 2_{(i)}=Y \times A_{2 i}\), etc. \(S 1_{(i)}=P 1_{(i)}+A_{4 i}, S 2_{(i)}=P 2_{(i)}+P 3_{(i)}\), and \(T_{i}=S 1_{(i)}+S 2_{(i)} . T 1=X^{\prime}, T 2=Y^{\prime}, T 3=Z^{\prime}\). Note that \(W\) values are not calculated since they are all 1.

The time to transform N points in a system is \((12 \mathrm{~N}+5)\) cycles. The system can transform 1.25 million points per second at 15 MHz and 2.5 million points per second at 30 MHz .
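The reduced-matrix computation can be sketched the same way. In this Python illustration the names are hypothetical; only the first three columns of the matrix are read, and the fourth-row element of each column enters as an addend rather than as a product, matching the definitions of \(S 1\), \(S 2\), and \(T_{i}\).

```python
def transform_reduced(point, a):
    """(X, Y, Z) with W = 1 times a reduced transform matrix:
    S1 = P1 + A4i, S2 = P2 + P3, T = S1 + S2 for columns i = 1..3."""
    x, y, z = point
    out = []
    for i in range(3):
        p1, p2, p3 = x * a[0][i], y * a[1][i], z * a[2][i]
        s1 = p1 + a[3][i]       # translation term added, not multiplied
        s2 = p2 + p3
        out.append(s1 + s2)
    return tuple(out)           # W' is not computed; it is always 1


def reduced_transform_cycles(n):
    """Cycles to transform N points with the reduced matrix."""
    return 12 * n + 5


if __name__ == "__main__":
    # Scale by 2 and translate by (2, 2, 2); fourth column is (0, 0, 0, 1).
    a = [[2.0, 0.0, 0.0, 0.0],
         [0.0, 2.0, 0.0, 0.0],
         [0.0, 0.0, 2.0, 0.0],
         [2.0, 2.0, 2.0, 1.0]]
    print(transform_reduced((1.0, 2.0, 3.0), a), reduced_transform_cycles(1))
```

Each output coordinate costs three multiplications and three additions, which accounts for the nine of each per point.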

\section*{Three-Dimensional Clipping}

Once an image is transformed into viewing coordinates, it must be clipped so that lines extending outside the view volume are removed. There are several approaches to clipping, some more efficient than others. This section surveys the most commonly used techniques and estimates the throughput of several single- and multi-processor arrangements.

First considered is the technique of fully clipping the line segments to fit within the viewing pyramid in the eye coordinate system. This technique is commonly referred to as "clipping before division."

Clipping in the screen system is considered second. This method eliminates lines that are obviously invisible in the eye system; the rest are clipped after projection to the screen.

\section*{Clipping in the Eye System}

If an object is composed of straight line segments and a perspective view is to be taken, the viewing volume is a pyramid defined by the following plane equations:
\[
X=K \times Z, X=-K \times Z, Y=K \times Z, Y=-K \times Z,
\]
where K is a constant to be defined below. Thus, \(-\mathrm{KZ}<(\mathrm{X}, \mathrm{Y})<K Z\). Two other clipping planes are usually employed at \(Z=N\) and \(Z=F\), where \(N\) and \(F\) are the near and far limits, respectively, of the view. This gives:
\[
N<Z<F .
\]

Looking in the direction of the z -axis (see Figure 79), the eye can imagine a screen located at a distance N from the eye. K is formed from the half-screen height divided by N . A specific line segment might intersect any or all of the six clipping planes. One common approach to this problem is to use six processors in a pipeline, each clipping the line to one plane.
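The six plane inequalities amount to a simple containment test for a point in eye coordinates. The Python sketch below is illustrative; the function name and the numeric values in the example are hypothetical, and points lying exactly on a plane are treated as outside because the inequalities in the text are strict.

```python
def inside_view_pyramid(point, k, near, far):
    """True if -K*Z < X < K*Z, -K*Z < Y < K*Z, and N < Z < F."""
    x, y, z = point
    return (near < z < far) and (-k * z < x < k * z) and (-k * z < y < k * z)


if __name__ == "__main__":
    half_screen_height = 0.5      # assumed value, for illustration only
    near, far = 1.0, 10.0
    k = half_screen_height / near
    print(inside_view_pyramid((0.0, 0.0, 5.0), k, near, far))
    print(inside_view_pyramid((3.0, 0.0, 5.0), k, near, far))
```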


Figure 79. Viewing Pyramid Showing Six Clipping Planes

Consider the case of clipping the line defined by the points \(\mathrm{P} 1=(\mathrm{X} 1, \mathrm{Y} 1, \mathrm{Z} 1)\) and \(P 2=(X 2, Y 2, Z 2)\) against the \(Z=N\) plane. First computed are \((Z 1-N)\) and \((Z 2-N)\). If both are negative, the line is invisible, and a notation meaning an empty line is passed on. If both are positive, both ends of the line are on the visible side of the \(Z=N\) plane, and the line is passed on unclipped.

When one of these computed values is negative and the other positive, the line must be clipped and the new values for its endpoints passed down the rest of the pipeline. To do so, a parameter \(t\) that indicates what fraction of a segment \(Z 1 Z 2\), and therefore of P1P2 as a whole, lies on the P1 side of the \(\mathrm{Z}=\mathrm{N}\) plane, is computed as follows:
\[
t=(Z 1-N) /(Z 1-Z 2) .
\]

In general, the value of the parameter is derived as described in Newman and Sproull, \({ }^{1}\) using the following equations of the line: \(X=X 1+(X 2-X 1) u\); \(Y=Y 1+(Y 2-Y 1) u\); \(Z=Z 1+(Z 2-Z 1) u\). These equations are each inserted into the corresponding plane equation. In the current example, \(N=Z 1+(Z 2-Z 1) t\).

Since \(N\) is between \(Z 1\) and \(Z 2\), \(t\) is always positive, and the signs of \(Z 1-N\) and \(Z 2-N\) are used to determine which end to clip. If \(\mathrm{Z1}-\mathrm{N}\) is negative, the P 1 end is clipped, using the value of \(t\) to determine the delta in X1 and Y1. The coordinates for the new endpoint of the shortened line segment are given by:
\[
X 1^{\prime}=X 1+(X 2-X 1) \times t, Y 1^{\prime}=Y 1+(Y 2-Y 1) \times t, Z 1^{\prime}=N .
\]

\footnotetext{
1 Newman, W. M., and Sproull, R. F., Principles of Interactive Computer Graphics, McGraw-Hill, 1979.
}

Similarly for the case when the P2 end must be clipped:
\[
X 2^{\prime}=X 1+(X 2-X 1) \times t, Y 2^{\prime}=Y 1+(Y 2-Y 1) \times t, Z 2^{\prime}=N
\]
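The single-plane case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `clip_to_near_plane` and the tuple representation of points are hypothetical, and an endpoint lying exactly on the plane is treated as visible, a case the text does not address.

```python
def clip_to_near_plane(p1, p2, n):
    """Clip segment p1-p2 against the plane Z = n.

    Returns None for an invisible segment, otherwise the (possibly
    shortened) pair of endpoints.
    """
    (x1, y1, z1), (x2, y2, z2) = p1, p2
    h1, h2 = z1 - n, z2 - n
    if h1 < 0 and h2 < 0:
        return None              # both ends on the invisible side
    if h1 >= 0 and h2 >= 0:
        return p1, p2            # passed on unclipped
    # Signs differ, so z1 != z2 and the division is safe.
    t = h1 / (z1 - z2)           # fraction of P1P2 on the P1 side of Z = n
    hit = (x1 + (x2 - x1) * t, y1 + (y2 - y1) * t, n)
    return (hit, p2) if h1 < 0 else (p1, hit)
```

For example, with P1 = (0, 0, 0), P2 = (4, 8, 4), and N = 1, t is 0.25 and the P1 end moves to (1, 2, 1).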

An alternative to clipping to one plane at a time entails clipping to all six planes at once. Both approaches are examined in the following sections.

\section*{Clipping to One Plane at a Time}

When a pipeline of six processors is used, each clipping the same line to one plane, each processor must wait for data from the previous processor and hold its solution until the next processor is ready to receive it. There is no reason to seek shortcuts through the computations by including branches in the program because there is little point in one of the processors completing its task earlier than the rest. This statement is true whether the six processors are driven from the same or from separate sequencers. Similarly, operating the pipeline asynchronously buys little time. Synchronous operation in the case of a clipping pipeline is likely to be almost as fast as, and much simpler and cheaper than, asynchronous operation.

Because shortcuts are not beneficial, the program can be written assuming the maximum amount of work will be required at each stage, whether the line requires clipping at that stage or not. If it is assumed that invisible lines are caught and eliminated as a separate, initial computation, branches from the clipping pipeline can be eliminated entirely. An alternative approach, in which branches would be beneficial, involves using two, three, or more 'ACT8837 or 'ACT8847 chips in parallel, rather than as a pipeline, each performing all six stages of clipping for individual lines. The program lends itself to this approach because the computations in each stage of the clipping pipeline are identical.

The method for clipping a line segment against the \(Z=N\) plane as one stage in a clipping pipeline, assuming invisible lines have been previously eliminated, will be illustrated. Two \(t\) values are computed: \(t_{1}\) for clipping the P1 end of the line segment and \(t_{2}\) for clipping the P2 end. If \(Z 1<N, t_{1}=(Z 1-N) /(Z 1-Z 2)\); otherwise, \(t_{1}=0\). If \(Z 2<N, t_{2}=(Z 2-N) /(Z 1-Z 2)\); otherwise, \(t_{2}=0\). The new endpoints for the line segment are computed as follows:
\[
\begin{aligned}
& X 1^{\prime}=X 1+(X 2-X 1) \times t_{1}, \\
& Y 1^{\prime}=Y 1+(Y 2-Y 1) \times t_{1}, \\
& Z 1^{\prime}=Z 1+(Z 2-Z 1) \times t_{1}, \\
& X 2^{\prime}=X 2-(X 2-X 1) \times t_{2}, \\
& Y 2^{\prime}=Y 2-(Y 2-Y 1) \times t_{2}, \\
& Z 2^{\prime}=Z 2-(Z 2-Z 1) \times t_{2} .
\end{aligned}
\]

Note that the denominator is the same in the equations for \(t_{1}\) and \(t_{2}\); it is this reciprocal computation that is expensive in time. However, in the 'ACT8837, it is also simple to interleave other computations with that of the reciprocal, and in the '8847, the built-in divide is very fast.

A simple trick is used to compute the \(t_{i}\) values in a streamlined fashion. \(H_{i}=\left(Z_{i}-N\right)\) is first computed, followed by the sum \(H_{i}^{\prime}=H_{i}-\left|H_{i}\right|\). Note that if \(\left(Z_{i}-N\right)\) is negative, \(H_{i}^{\prime}=2 H_{i}=2\left(Z_{i}-N\right)\); otherwise, \(H_{i}^{\prime}=0\). Hence, in a straightforward manner, a suitable numerator for \(t_{i}\) has been computed, regardless of the sign of \(\left(Z_{i}-N\right)\). This approach avoids resorting to an "if/then" decision to compute \(t_{i}\).

To scale the denominator to the numerator, \(D=2(Z 1-Z 2)\) is computed, and the Newton-Raphson algorithm in the '8837 or the built-in divide instruction in the '8847 is used to determine the values of \(1 / D\), \(t_{1}=\left|H_{1}^{\prime} / D\right|\), and \(t_{2}=\left|H_{2}^{\prime} / D\right|\). New values of \((X 1, Y 1, Z 1)\) and \((X 2, Y 2, Z 2)\) are then computed using \(t_{1}\) and \(t_{2}\).
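The branch-free stage described above can be sketched as follows. The function name `clip_stage_branch_free` is hypothetical, and the sketch assumes, as the text does, that invisible segments were removed earlier; it additionally assumes Z1 differs from Z2 so that D is nonzero.

```python
def clip_stage_branch_free(p1, p2, n):
    """One pipeline stage: clip against Z = n with no if/then decisions."""
    (x1, y1, z1), (x2, y2, z2) = p1, p2
    h1, h2 = z1 - n, z2 - n
    h1p = h1 - abs(h1)           # 2*h1 when h1 < 0, else 0
    h2p = h2 - abs(h2)           # 2*h2 when h2 < 0, else 0
    d = 2.0 * (z1 - z2)          # denominator scaled to match the numerators
    inv_d = 1.0 / d              # the single reciprocal shared by t1 and t2
    t1 = abs(h1p * inv_d)
    t2 = abs(h2p * inv_d)
    dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
    new_p1 = (x1 + dx * t1, y1 + dy * t1, z1 + dz * t1)
    new_p2 = (x2 - dx * t2, y2 - dy * t2, z2 - dz * t2)
    return new_p1, new_p2
```

Every segment follows the same instruction sequence: an end that needs no clipping simply receives a parameter of zero and is reproduced unchanged.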

The data flow and program listing for the clipping against \(Z=N\) operation as performed on the 'ACT8837 are given in Tables 84 and 85. Here, \(t_{i}=\left|\left(H_{i}-\left|H_{i}\right|\right) / D\right|\). Also, \(d=Z 1-Z 2\), \(H_{i}=Z_{i}-N\), \(H_{1}^{\prime}=H_{1}-\left|H_{1}\right|\), \(H_{2}^{\prime}=H_{2}-\left|H_{2}\right|\), \(R_{i}=\) successive approximations for \(1 / d\), \(T_{i}=\left(2-d \times R_{i}\right)\), and \(R_{i+1}=T_{i} \times R_{i}\).
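The reciprocal iteration \(T_{i}=2-d \times R_{i}\), \(R_{i+1}=T_{i} \times R_{i}\) can be illustrated with a short Python sketch. The seed ROM is not modeled literally: `seed_reciprocal` is a hypothetical stand-in that rounds the true reciprocal to 8 mantissa bits, which is an assumption about the ROM contents, not a description of the actual part. Two iterations are used to mirror the R0, T0, R1, T1 sequence in the data flow.

```python
import math


def seed_reciprocal(d, bits=8):
    """Hypothetical seed ROM: 1/d rounded to `bits` mantissa bits."""
    m, e = math.frexp(1.0 / d)           # 1/d = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)


def newton_reciprocal(d, iterations=2):
    """Refine the seed with R <- R * (2 - d * R); each pass squares the error."""
    r = seed_reciprocal(d)
    for _ in range(iterations):
        t = 2.0 - d * r                  # T_i
        r = t * r                        # R_(i+1)
    return r
```

Because the relative error is squared on each pass, an 8-bit seed gives roughly 16 good bits after one iteration and roughly 32 after two. Multiplying the result by 0.5 yields \(1 / D\) for \(D=2 d\).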

Table 84. Data Flow for Clipping a Line Segment Against the \(\mathbf{Z}=\mathbf{N}\) Plane Using the SN74ACT8837
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & Y & Y & Y & Y & N & Y & N & Y & Y & Y & Y & Y & Y & Y & & N & N \\
\hline RA & Z1 & Z1 & Z2 & R0 & & & & X2 & Y2 & d & & & & & & H1' & H2' \\
\hline RB & Z2 & N & N & d & & & & X1 & Y1 & & & 0.5 & & & & & \\
\hline S & & & d & H1 & H2 & R0 & H1' & T0 & H2' & X2-X1 & Y2-Y1 & R1 & & T1 & & & \\
\hline P & & & & & & d \(\times\) R0 & & R0 & & R1 & & d \(\times\) R1 & & 0.5 R1 & & 1/D & \\
\hline C & & & & & H1 & H2 & H2 & & & & & & & & & & 1/D \\
\hline Y & & & d & & & & H1' & & H2' & X2-X1 & Y2-Y1 & & & & & & \\
\hline STATUS & & & & & & & & & & & & & & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & Y & Y & Y & Y & Y & Y & Y & Y & & & \\
\hline RA & (X2-X1) & (Y2-Y1) & (Z2-Z1) & & (X2-X1) & (Y2-Y1) & (Z2-Z1) & & & & \\
\hline RB & & & X1 & Y1 & Z1 & & X1 & Y1 & Z1 & & \\
\hline S & & & & t2 & X1' & Y1' & Z1' & & X2' & Y2' & Z2' \\
\hline P & t1 & t2 & (X2-X1) \(\times\) t1 & (Y2-Y1) \(\times\) t1 & (Z2-Z1) \(\times\) t1 & & (X2-X1) \(\times\) t2 & (Y2-Y1) \(\times\) t2 & (Z2-Z1) \(\times\) t2 & & \\
\hline C & & t1 & t1 & t2 & t2 & t2 & & & & & \\
\hline Y & & & & & X1' & Y1' & Z1' & & X2' & Y2' & Z2' \\
\hline STATUS & & & & & & & & & & & \\
\hline CLK & 18 & 19 & 20 & 21 & 22 & 23 & 24 & 25 & 26 & 27 & 28 \\
\hline
\end{tabular}

Table 85. Program Listing for Clipping a Line Segment Against the \(\mathbf{Z}=\mathbf{N}\) Plane Using the SN74ACT8837
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{4}{|c|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB & \(\mathrm{Y} \leftarrow \mathrm{S}\) & & ADD (RA, -RB) & \\
\hline 2. & LOAD RA, RB & & & ADD (RA, -RB) & \\
\hline 3. & LOAD RA, RB & & & ADD (RA, -RB) & \\
\hline 4. & LOAD RA, RB & & \(\mathrm{C} \leftarrow \mathrm{S}\) & ADD (RA, 0) & MULT(RA,RB) \\
\hline 5. & LOAD RA, RB & \(Y \leftarrow S\) & \(\mathrm{C} \leftarrow \mathrm{S}\) & ADD ( \(\mathrm{C},-|\mathrm{C}|\) ) & \\
\hline 6. & & & & ADD ( \(2,-\mathrm{P}\) ) & MULT(S,1) \\
\hline 7. & & \(Y \leftarrow S\) & & ADD ( \(\mathrm{C},-|\mathrm{C}|\) ) & \\
\hline 8. & LOAD RA, RB & \(Y \leftarrow S\) & & ADD (RA, -RB) & MULT(S,P) \\
\hline 9. & & \(Y \leftarrow S\) & & ADD (RA, - RB) & \\
\hline 10. & LOAD RA & & & ADD ( \(\mathrm{P}, 0\) ) & MULT(RA,P) \\
\hline 11. & & & & & \\
\hline 12. & LOAD RB & & & ADD ( \(2,-\mathrm{P}\) ) & MULT(S,RB) \\
\hline 13. & & & & & \\
\hline 14. & & & & & MULT(S,P) \\
\hline 15. & & & & & \\
\hline 16. & LOAD RA & & \(C \leftarrow P\) & & MULT( \(\mid\) RA \(|,|\mathrm{P}|)\) \\
\hline 17. & LOAD RA & & & & MULT(|RA|, |C|) \\
\hline 18. & LOAD RA & & \(C \leftarrow P\) & & MULT(RA,P) \\
\hline 19. & LOAD RA & & & ADD ( \(\mathrm{P}, 0\) ) & MULT(RA,C) \\
\hline 20. & LOAD RA, RB & \(Y \leftarrow S\) & & ADD (P,RB) & MULT(RA,C) \\
\hline 21. & LOAD RA, RB & \(Y \leftarrow S\) & \(C \leftarrow S\) & ADD (P,RB) & \\
\hline 22. & LOAD RA, RB & \(Y \leftarrow S\) & & ADD (P,RB) & MULT(RA,C) \\
\hline 23. & LOAD RA, RB & & & & MULT(RA,C) \\
\hline 24. & LOAD RB & \(Y \leftarrow S\) & & ADD (P,RB) & MULT(RA,C) \\
\hline 25. & LOAD RB & \(Y \leftarrow S\) & & ADD (P,RB) & \\
\hline 26. & & \(Y \leftarrow S\) & & ADD (P,RB) & \\
\hline 27. & & & & & \\
\hline 28. & & & & & \\
\hline
\end{tabular}

In pipelined mode, computing \((Z 1-Z 2)\) takes 2 cycles. This value is passed off-chip and used to get the first approximation to \(0.5 /(Z 1-Z 2)\) from an 8-bit seed ROM. Iteration to correctly determine the value begins in the 4th cycle, with subsequent operations starting on even-numbered cycles. The computations of \(H_{1}^{\prime}\) and \(H_{2}^{\prime}\) are interleaved with the divide algorithm and are completed before it. \((X 2-X 1)\), \((Y 2-Y 1)\), and \((Z 2-Z 1)\) are also computed during the divide. The values of \(t_{1}\) and \(t_{2}\) are ready in steps 18 and 19. New values of \(X 1, X 2, Y 1, Y 2, Z 1\), and \(Z 2\) are all computed and output by step 28. Each chip, therefore, clips against one clipping plane in 28 cycles. With a two-cycle overlap, the next line segment can be presented in cycle 26.

For the two X and two Y clipping planes, the calculations are slightly more complicated. For the \(X=K Z\) plane, the two parameters \(t_{i}\) are defined in terms of the values \(W_{1}=K Z_{1}\), \(W_{2}=K Z_{2}\) and \(H_{1}=W_{1}-X_{1}, H_{2}=W_{2}-X_{2}\) as follows:
\[
t_{1}=\left|H_{1}^{\prime} / 2\left(H_{1}-H_{2}\right)\right| \text { and } t_{2}=\left|H_{2}^{\prime} / 2\left(H_{1}-H_{2}\right)\right|
\]
where, as before, \(H_{i}^{\prime}=H_{i}-\left|H_{i}\right|\). The equations for the new endpoints, \(\left(X 1^{\prime}, Y 1^{\prime}, Z 1^{\prime}\right)\) and \(\left(X 2^{\prime}, Y 2^{\prime}, Z 2^{\prime}\right)\), are the same as before. It is still possible to compute the new endpoints in under 30 cycles. At 15 MHz, a six-chip '8837 system would clip 577,000 line segments per second.

In the '8847 a similar process is employed, but the built-in divide instruction is used beginning in step 7 and ending in step 15. \(t_{1}\) and \(t_{2}\) are calculated by step 18, and the entire operation completes in step 27, one cycle shorter than for the '8837. The data flow is shown in Table 86. A six-processor '8847 system operating at 30 MHz would clip 1.2 million line segments per second with a new operation beginning every 25 cycles.
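The throughput figures quoted for the two pipelines follow from dividing the clock rate by the number of cycles between successive segments. The helper below is a hypothetical illustration; it assumes the '8837 stage accepts a new segment every 26 cycles (28 cycles less the two-cycle overlap) and the '8847 stage every 25 cycles.

```python
def segments_per_second(clock_hz, loop_cycles):
    """Steady-state rate of a stage that starts a new segment every loop_cycles."""
    return clock_hz / loop_cycles


rate_8837 = segments_per_second(15e6, 26)   # about 577,000 per second
rate_8847 = segments_per_second(30e6, 25)   # 1.2 million per second
```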

Table 86. Data Flow for Clipping a Line Segment Against the \(Z=N\) Plane Using the SN74ACT8847
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline RA & Z1 & Z1 & Z2 & X2 & & & & 0.5 & Y2 & H1' & H2' & & \\
\hline RB & Z2 & N & N & X1 & & & & d & Y1 & & & & \\
\hline S & & & d & H1 & H2 & X2-X1 & H1' & H2' & & & Y2-Y1 & & \\
\hline P & & & & & & & & & & 1/D & & t1 & t2 \\
\hline C & & & & & H1 & H2 & & & & & 1/D & & t1 \\
\hline Y & & & d & & & X2-X1 & H1' & H2' & & & Y2-Y1 & & \\
\hline STATUS & & & & & & & & & & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 14 & 15 & 16 & 17 & 18 \\
\hline
\end{tabular}

NOTE: The remaining steps are the same as steps 20 through 28 for the '8837 (Table 84).

Since the performance levels obtained from the six-chip systems described above are slower than the rate of endpoint transformation by a single-chip system, some further speed improvement is desirable. Hence, rather than going through the code for clipping to the X and Y planes, another approach is proposed.

\section*{Clipping to All Six Planes at a Time}

The "window edge clipping method" derived in Newman and Sproull can be used to clip to all six planes at once. Recall that the viewing volume for a perspective view is a pyramid defined by the following plane equations:
\[
X=K \times Z, X=-K \times Z, Y=K \times Z, Y=-K \times Z, Z=N, Z=F
\]
where \(K=S / N\), as defined in a previous section. Given a segment with endpoints \(P 1=(X 1, Y 1, Z 1)\) and \(P 2=(X 2, Y 2, Z 2)\), to perform the entire clipping operation on all six planes at once, the following two six-tuples must be computed:
\[
\begin{aligned}
& Q=(W 1+X 1, W 1-X 1, W 1+Y 1, W 1-Y 1, Z 1-N, F-Z 1)=(Q 1, Q 2, \ldots), \\
& R=(W 2+X 2, W 2-X 2, W 2+Y 2, W 2-Y 2, Z 2-N, F-Z 2)=(R 1, R 2, \ldots),
\end{aligned}
\]
where \(W 1=K \times Z 1\) and \(W 2=K \times Z 2\).

Consider the case where \(X 1<-W 1\). Then, \(W 1+X 1<0\); i.e., \(Q 1<0\). In general, a negative element of \(Q\) indicates that P1 is on the invisible side of one of the clipping planes, while a negative element of \(R\) indicates the same for P2. To clip the line, the six parameters \(t_{i}\) for clipping the P1 end and the six parameters \(s_{i}\) for clipping the P2 end are computed. Here, \(t_{i}=Q_{i} /\left(Q_{i}-R_{i}\right)\) and \(s_{i}=R_{i} /\left(R_{i}-Q_{i}\right)\). (Again, the equations of the line as described in Newman and Sproull are used.)

For example, to find the value \(t_{1}\) for clipping P1 to the \(X=-W=-K Z\) plane, the following equation is used:
\[
X 1+(X 2-X 1) t_{1}=-K\left[Z 1+(Z 2-Z 1) t_{1}\right]
\]

Solving for \(t_{1}\),
\[
t_{1}=(X 1+W 1) /[(X 1+W 1)-(X 2+W 2)]=Q 1 /(Q 1-R 1)
\]

In general, \(t_{i}=Q_{i} /\left(Q_{i}-R_{i}\right)\). Similarly, \(s_{i}=R_{i} /\left(R_{i}-Q_{i}\right)\).

To actually carry out the computations of \(t_{i}\) and \(s_{i}\), the trick discussed above is performed, and each element of \(Q\) and \(R\) is replaced with the difference of the element and its absolute value, to form \(Q^{\prime}\) and \(R^{\prime}\). That is,

\(Q_{i}^{\prime}=2 \times Q_{i}\) if \(Q_{i}<0\), and \(Q_{i}^{\prime}=0\) otherwise;

\(R_{i}^{\prime}=2 \times R_{i}\) if \(R_{i}<0\), and \(R_{i}^{\prime}=0\) otherwise.

Next calculated are \(t_{i}=Q_{i}^{\prime} /\left[2\left(Q_{i}-R_{i}\right)\right]\) and \(s_{i}=R_{i}^{\prime} /\left[2\left(R_{i}-Q_{i}\right)\right]\), followed by \(T 1=\operatorname{MAX}\left(t_{i}\right)\) and \(T 2=1-\operatorname{MAX}\left(s_{i}\right)\). The P1 end is clipped using T1 and the P2 end is clipped using T2.
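The six-plane computation can be sketched in Python as follows. The function name `clip_to_pyramid` is hypothetical. Two guards are additions that the text does not describe and are marked as such in the comments: planes for which both primed values are zero are skipped (which avoids a 0/0 when \(Q_{i}=R_{i}\)), and a segment whose clipped ends cross is reported as invisible.

```python
def clip_to_pyramid(p1, p2, k, n, f):
    """Clip p1-p2 to the viewing pyramid; return None if nothing is visible."""
    (x1, y1, z1), (x2, y2, z2) = p1, p2
    w1, w2 = k * z1, k * z2
    q = (w1 + x1, w1 - x1, w1 + y1, w1 - y1, z1 - n, f - z1)
    r = (w2 + x2, w2 - x2, w2 + y2, w2 - y2, z2 - n, f - z2)
    # Both ends outside the same plane: the segment is invisible.
    if any(qi < 0 and ri < 0 for qi, ri in zip(q, r)):
        return None
    t_max = s_max = 0.0
    for qi, ri in zip(q, r):
        qp = qi - abs(qi)                 # 2*Qi if Qi < 0, else 0
        rp = ri - abs(ri)                 # 2*Ri if Ri < 0, else 0
        if qp == 0.0 and rp == 0.0:
            continue                      # added guard: plane does not clip
        t_max = max(t_max, qp / (2.0 * (qi - ri)))
        s_max = max(s_max, rp / (2.0 * (ri - qi)))
    t1, t2 = t_max, 1.0 - s_max           # T1 and T2, both measured from P1
    if t1 > t2:
        return None                       # added guard: clipped ends cross
    dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
    new_p1 = (x1 + dx * t1, y1 + dy * t1, z1 + dz * t1)
    new_p2 = (x1 + dx * t2, y1 + dy * t2, z1 + dz * t2)
    return new_p1, new_p2
```

With K = 1, N = 1, and F = 10, the segment from (0, 0, 0) to (0, 0, 20) gives T1 = 0.05 from the near plane and T2 = 0.5 from the far plane, so the visible part runs from Z = 1 to Z = 10.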

In an '8837 three-processor parallel system, in which each processor is given the task of computing two \(t_{i}\) and two \(s_{i}\) values, computing the \(Q_{i}^{\prime}\) and \(R_{i}^{\prime}\) values takes 14 cycles, with the values of \(Q_{i}-R_{i}\) computed by step 13. The six divides, \(0.5 /\left(Q_{i}-R_{i}\right)\), are completed in step 30, assuming an 8-bit seed ROM is used. The max/min operations take place in parallel in two processors and complete at step \(54(24+30)\), and the new endpoints are ready by step \(60(6+54)\). The timing is the same using the '8847.

The data flow and program listing for computing \(t_{1}, t_{2}, s_{1}\), and \(s_{2}\) by one of the three '8837 processors are given in Tables 87 and 88.

Table 87. Data Flow for Computing \(\mathbf{t}_{\mathbf{1}}, \mathbf{t}_{\mathbf{2}}, \mathbf{s}_{1}\), and \(\mathbf{s}_{2}\) Using an SN74ACT8837
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & Y & Y & Y & Y & Y & N & N & N & Y & Y \\
\hline RA & K & K & & & & & & & W2 & Q1 \\
\hline RB & Z1 & Z2 & X1 & X1 & X2 & & & & X2 & R1 \\
\hline S & & & & & Q1 & Q2 & R1 & Q1' & Q2' & R1' \\
\hline P & & & W1 & W2 & & & & & & \\
\hline C & & & & W1 & W2 & Q1 & Q2 & R1 & & \\
\hline Y & & & & & & & & Q1' & Q2' & R1' \\
\hline STATUS & & & & & & & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline & & & & & & & & & & \\
\hline CHAIN & Y & N & Y & Y & Y & Y & Y & Y & Y & Y \\
\hline RA & Q2 & & R0 & & & & & & & \\
\hline RB & R2 & & d1 & & & & & & & \\
\hline S & R2 & Q1-R1 & Q2-R2 & R2' & R0 & & T0 & & & \\
\hline P & & & & & d \(\times\) R0 & & R0 & & R1 & \\
\hline C & & R2 & & & & & & & & \\
\hline Y & & Q1-R1 & Q2-R2 & R2' & & & & & & \\
\hline STATUS & & & & & & & & & & \\
\hline CLK & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 \\
\hline & & & & & & & & & & \\
\hline CHAIN & Y & Y & Y & Y & Y & Y & Y & Y & N & N \\
\hline RA & & & & & Q1' & Q2' & R1' & & & \\
\hline RB & 0.5 & & & & & & & R2' & & \\
\hline S & R1 & & T1 & & & & & 1/D2 & & \\
\hline P & d \(\times\) R1 & & 0.5 R1 & & 1/D1 & 1/D2 & t1 & t2 & s1 & s2 \\
\hline C & & & & & & 1/D1 & 1/D1 & & & \\
\hline Y & & & & & & & t1 & t2 & s1 & s2 \\
\hline STATUS & & & & & & & & & & \\
\hline CLK & 21 & 22 & 23 & 24 & 25 & 26 & 27 & 28 & 29 & 30 \\
\hline
\end{tabular}

NOTE: Cycles \(13,15,17,19, \ldots, 25\) compute \(1 / D 1=0.5 / d 1\); cycles \(14,16,18,20, \ldots, 26\) compute \(1 / D 2=0.5 / d 2\), where \(d_{i}=Q_{i}-R_{i}\).

Table 88. Program Listing for Three-Processor Clip to Compute \(\mathbf{t}_{1}, \mathbf{t}_{2}, \mathbf{s}_{1}\), and \(\mathbf{s}_{2}\) Only
\begin{tabular}{|c|c|c|c|c|c|}
\hline & \multicolumn{3}{|l|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB & & & & MULT (RA, RB) \\
\hline 2. & LOAD RA, RB & & & & MULT (RA, RB) \\
\hline 3. & LOAD RB & & \(C \leftarrow P\) & ADD (P, RB) & \\
\hline 4. & LOAD RB & & \(C \leftarrow P\) & ADD (C, -RB) & \\
\hline 5. & LOAD RB & & \(C \leftarrow S\) & ADD (C, RB) & \\
\hline 6. & & \(Y \leftarrow S\) & \(C \leftarrow S\) & ADD (C, \(-|\mathrm{C}|\)) & \\
\hline 7. & & \(Y \leftarrow S\) & \(C \leftarrow S\) & ADD (C, \(-|\mathrm{C}|\)) & \\
\hline 8. & & \(Y \leftarrow S\) & & ADD (C, \(-|\mathrm{C}|\)) & \\
\hline 9. & LOAD RA, RB & & & ADD (RA, -RB) & \\
\hline 10. & LOAD RA, RB & \(Y \leftarrow S\) & & ADD (RA, -RB) & \\
\hline 11. & LOAD RA, RB & \(Y \leftarrow S\) & \(C \leftarrow S\) & ADD (RA, -RB) & \\
\hline 12. & & \(Y \leftarrow S\) & & ADD (C, \(-|\mathrm{C}|\)) & \\
\hline \multicolumn{6}{|c|}{CODE FOR TWO DIVISIONS} \\
\hline 25. & LOAD RA & \(Y \leftarrow S\) & \(C \leftarrow P\) & & MULT (RA, P) \\
\hline 26. & LOAD RA & \(Y \leftarrow S\) & & ADD (P, 0) & MULT (RA, P) \\
\hline 27. & & \(Y \leftarrow S\) & & & MULT (RA, C) \\
\hline 28. & & \(Y \leftarrow S\) & & & MULT (S, RB) \\
\hline
\end{tabular}

This approach clips 288,000 line segments per second in a 3-chip '8837 system running at 15 MHz and 576,000 line segments per second in an '8847 system running at 30 MHz. If branches are permitted in the sequencer, a considerable speedup is available for situations in which a large proportion of line segments are either invisible, and may be eliminated, or are completely visible, and may be passed without clipping. A single-processor system takes no more than 32 cycles, sometimes as few as 10 cycles, to reject an invisible line, whereas it takes 91 cycles to process lines that need both ends clipped. Hence, in a situation where \(50 \%\) of the line segments are invisible, the speed is in excess of 360,000 line segments per second at 20 MHz and 540,000 line segments per second at 30 MHz. It is not uncommon for \(80 \%\) of lines to be invisible, in which case the speed would increase to 584,000 line segments per second at 20 MHz and 877,000 line segments per second at 30 MHz.
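The single-processor rates quoted above can be reproduced with a weighted average of cycle counts. The text gives only a range of 10 to 32 cycles for rejecting an invisible line; the 20-cycle average used below is an inference chosen because it matches the quoted figures, not a value stated in the manual. The sketch also assumes every visible segment costs the full 91 cycles. The function name `clip_rate` is hypothetical.

```python
def clip_rate(clock_hz, invisible_fraction, reject_cycles=20, clip_cycles=91):
    """Average segments per second when a fraction of lines is rejected early."""
    average_cycles = (invisible_fraction * reject_cycles
                      + (1.0 - invisible_fraction) * clip_cycles)
    return clock_hz / average_cycles


half_invisible_20mhz = clip_rate(20e6, 0.5)   # a little over 360,000
most_invisible_30mhz = clip_rate(30e6, 0.8)   # a little over 877,000
```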

To take advantage of this speedup, the only change in the sequence given above is that while computing \(Q\) and \(R\), the logical AND and OR are formed for the signs of the corresponding pairs of values, \(Q_{i}\) and \(R_{i}\). This is best performed off-chip if the '8837 is being used but may be done using independent ALU (unchained) mode in the '8837 or a logical operation in the '8847. For the '8837, with two operands \(Q_{i}\) and \(R_{i}\), Table 89 shows the \(A>B\) status bit for an \(A>B\) comparison on \(A=-Q_{i} \times\left|R_{i}\right|\) and \(B=\left|Q_{i}\right| \times R_{i}\) for all signs of \(Q_{i}\) and \(R_{i}\).

Table 89. \(\mathrm{A}>\mathrm{B}\) Comparison Function Table
\begin{tabular}{|c|c|c|c|c|c|}
\hline Sign \(\mathbf{Q}_{\mathbf{i}}\) & Sign \(\mathbf{R}_{\mathbf{i}}\) & Sign \(\mathbf{A}=-\mathbf{Q}_{\mathbf{i}} \times\left|\mathbf{R}_{\mathbf{i}}\right|\) & Sign \(\mathbf{B}=\left|\mathbf{Q}_{\mathbf{i}}\right| \times \mathbf{R}_{\mathbf{i}}\) & \(\mathbf{A}>\mathbf{B}\) & \(\mathbf{A}=\mathbf{B}\) \\
\hline - & - & + & - & T & F \\
- & + & + & + & F & T \\
+ & - & - & - & F & T \\
+ & + & - & + & F & F \\
\hline
\end{tabular}

The \(A>B\) status provides the needed AND function of the sign bits of \(Q_{i}\) and \(R_{i}\). In computing these \(A>B\) values, if \(A>B\) is TRUE, the sequencer branches to code that rejects the line as invisible. A comparison \(A>B\) of \(A=\left(Q_{i} \times\left|R_{i}\right|\right)\) and \(B=\left(-\left|Q_{i}\right| \times R_{i}\right)\) gives the logical AND of the complement of the sign bits. It is TRUE when both \(Q_{i}\) and \(R_{i}\) are positive. If all six values are TRUE, the sequencer can branch to code that passes the line segment unclipped.
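The two comparisons can be checked exhaustively over the four sign combinations. The function names are hypothetical. For the second test the sketch takes \(B=-\left|Q_{i}\right| \times R_{i}\), the form for which \(A>B\) holds exactly when both values are positive; nonzero operands are assumed throughout.

```python
def both_negative(q, r):
    """A > B with A = -Q*|R| and B = |Q|*R: true only when Q < 0 and R < 0."""
    return (-q * abs(r)) > (abs(q) * r)


def both_positive(q, r):
    """A > B with A = Q*|R| and B = -|Q|*R: true only when Q > 0 and R > 0."""
    return (q * abs(r)) > (-abs(q) * r)


# Evaluate both tests for every sign combination of a sample pair.
table = {(q, r): (both_negative(q, r), both_positive(q, r))
         for q in (-3.0, 3.0) for r in (-2.0, 2.0)}
```

When the signs differ, A and B are equal in both comparisons, so neither test fires and the segment proceeds to the full clipping code.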

For a three-processor parallel system, lockstep operation with a single sequencer is still possible since all three processors are working on the same line segment, and the branch options apply equally to them all. The estimated time for a three-processor system is 56 cycles; not much interleaving is possible.

Now that the operations have been reduced to a minimum, the remaining steps are necessarily sequential. Rejecting invisible or passing totally visible line segments without division, however, is still beneficial.

\section*{Clipping in the Screen System}

In most graphics systems, full line clipping is not performed in the eye system. Instead, a trivial accept/reject test is performed, in which the line segments are simply tested against the six clipping planes. If a line has both ends on the invisible side of any one of the clipping planes, it is rejected. Lines surviving this test may still be outside the viewing pyramid. In any case, the lines are transformed to the screen coordinate system and then clipped against a cube defined by the simple plane equations \(-1<(X, Y, Z)<1\). The next three sections describe this process.

\section*{Trivial Accept/Reject Test}

In the eye system, the clipping planes are:
\[
X=W, X=-W, Y=W, Y=-W, Z=N \text {, and } Z=F \text {, }
\]
where \(W=K \times Z\). After \(-W 1\) and \(-W 2\) are computed, a sequence of comparison operations is performed, summarized as follows:
\[
\begin{array}{ll}
\text { with } X 1 \text { in } R B \text { and }-W 1 \text { in } P, & P>R B \text { (i.e., }-W 1>X 1 \text { ) } \\
\text { with } X 1 \text { in } R A \text { and }-W 1 \text { in } C, & R A>|C| \text { (i.e., } X 1>W 1) \\
\text { with } Y 1 \text { in } R B \text { and }-W 1 \text { in } C, & C>R B \\
\text { with } Y 1 \text { in } R A, & R A>|C| \text { comparison } \\
\text { with } Z 1 \text { in } R B \text { and } N \text { in } R A, & R A>R B \text { (i.e. } N>Z 1 \text { ) } \\
\text { with } Z 1 \text { in } R A \text { and } F \text { in } R B, & R A>R B \text { (i.e., } Z 1>F \text { ). }
\end{array}
\]

These six operations are carried out in successive cycles and then repeated for (X2, Y2, Z2). The two six-tuples are saved off-chip and a bit-wise AND is carried out. If any one of the resulting six boolean values is TRUE, the line is rejected. This entire operation takes only 16 cycles, thereby providing a speed of \(1,071,000\) line segments per second at 15 MHz and \(2,143,000\) line segments per second at 30 MHz . The data flow for an accept/ reject test is given in Table 90. Accept/reject testing of individual points takes only 8 cycles.
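The accept/reject test amounts to forming a six-bit outcode for each endpoint and ANDing the two. A minimal Python sketch follows; the function names are hypothetical, and K is assumed positive. The comparisons mirror the six listed above, including the use of the absolute value for the \(X 1>W 1\) and \(Y 1>W 1\) tests.

```python
def outcode(p, k, n, f):
    """Six booleans, one per clipping plane; True means outside that plane."""
    x, y, z = p
    w = k * z
    return (-w > x, x > abs(w), -w > y, y > abs(w), n > z, z > f)


def trivially_rejected(p1, p2, k, n, f):
    """Reject when both ends lie on the invisible side of the same plane."""
    code1 = outcode(p1, k, n, f)
    code2 = outcode(p2, k, n, f)
    return any(a and b for a, b in zip(code1, code2))
```

A segment that is not rejected may still miss the viewing pyramid entirely, which is why the later clipping stage remains necessary.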

Table 90. Data Flow for Accept/Reject Testing
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & N & N & Y & Y & Y & Y & Y & Y & Y & Y & Y & Y & Y & Y & N & N \\
\hline RA & K & K & & X1 & & Y1 & N & Z1 & -W2 & X2 & -W2 & Y2 & N & Z2 & & \\
\hline RB & Z1 & Z2 & X1 & & Y1 & & Z1 & F & X2 & -W2 & Y2 & -W2 & Z2 & F & & \\
\hline S & & & & & & & & & & & & & & & & \\
\hline P & & & -W1 & -W2 & & & & & & & & & & & & \\
\hline C & & & -W1 & -W1 & -W1 & -W1 & & & & & & & & & & \\
\hline Y & & & -W2 & & & & & & & & & & & & & \\
\hline STATUS & & & & \(-W 1>X 1\) & \(X 1>W 1\) & \(-W 1>Y 1\) & \(Y 1>W 1\) & \(N>Z 1\) & \(Z 1>F\) & \(-W 2>X 2\) & \(X 2>W 2\) & \(-W 2>Y 2\) & \(Y 2>W 2\) & \(N>Z 2\) & \(Z 2>F\) & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 \\
\hline
\end{tabular}

\section*{Transformation to the Screen System}

After the line segments have passed the trivial accept/reject test, they are transformed to the screen coordinate system. The following transformation is first applied to the \(Z\) coordinate in order to scale its clipping planes to \(Z^{\prime}=-W\), and \(Z^{\prime}=W\) :
\[
Z^{\prime}=[-W \times(F+N)] /(F-N)+(2 \times W \times Z) /(F-N)
\]

The value of \(1 /(F-N)\) is constant for all line segments and is therefore computed only once. In fact, two constants, \(a=2 K /(F-N)\) and \(b=-(F+N) / 2\), can be precomputed so that \(Z^{\prime}=Z \times a \times(b+Z)\). (Note that other transformations on \(Z\) can also be used.)
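Substituting \(W=K \times Z\) into the transformation shows why the two constants suffice: \(Z^{\prime}=K Z[2 Z-(F+N)] /(F-N)=Z \times a \times(b+Z)\). The Python sketch below compares the two forms numerically; the function names and sample values are hypothetical.

```python
def z_prime_direct(z, k, n, f):
    """Z' evaluated from the full expression, with W = K * Z."""
    w = k * z
    return (-w * (f + n)) / (f - n) + (2.0 * w * z) / (f - n)


def z_prime_constants(z, a, b):
    """Z' evaluated with the per-frame constants a and b."""
    return z * a * (b + z)


k, n, f = 0.5, 1.0, 9.0
a = 2.0 * k / (f - n)        # computed once for all line segments
b = -(f + n) / 2.0
```

At \(Z=N\) both forms give \(-W\), and at \(Z=F\) both give \(W\), as required for the scaled clipping planes.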

After the trivial accept/reject test, the following transformation to the screen system occurs:
\[
X_{S}=X / W, Y_{S}=Y / W, Z_{S}=Z^{\prime} / W
\]

The clipping planes then have these equations:
\[
X_{S}=-1, X_{S}=1, Y_{S}=-1, Y_{S}=1, Z_{S}=-1, Z_{S}=1
\]

Z1' and Z2' can be formed in 8 cycles. Only two reciprocals, \(1 / W 1\) and \(1 / W 2\), need to be computed, and they can be interleaved and completed in 13 cycles in an '8837 if an 8-bit seed ROM is employed and in 12 cycles in an '8847. The line segment is transformed to the screen system in a further 6 cycles. The total is 26 cycles for the 'ACT8847 and 27 cycles for the 'ACT8837. A single-processor system would transform 600,000 line segments per second with a 15 MHz clock and 1.2 million line segments per second at 30 MHz.

Note that the above projection does not preserve planarity. See Newman and Sproull for perspective projections that do preserve planes.

\section*{The Clipping Operation}

The final operation on line segments is to clip them to the cube:
\[
X_{S}=1, X_{S}=-1, Y_{S}=1, Y_{S}=-1, Z_{S}=1 \text { and } Z_{S}=-1 .
\]

It is important to realize that the required resolution of \(X_{S}, Y_{S}\), and \(Z_{S}\) may only be 10 or 11 bits. Any divisions needed in an '8837 implementation at this stage could feasibly be done entirely by table look-up. It would certainly not be necessary to perform more than one iteration if an 8-bit seed ROM is employed. Two divisions can therefore be interleaved and completed in 7 cycles. However, three iterations are assumed in this example to give full single-precision accuracy.

Consider a three-processor pipeline, with each processor clipping against two parallel planes. The first will clip against the X planes \(-1<X<1\). For clipping the P1 end of the line segment, \(Q=(1+X 1,1-X 1)\) is computed and \(Q^{\prime}\) is formed, where \(Q_{i}^{\prime}=Q_{i}-\left|Q_{i}\right|\); i.e.,
\[
\begin{aligned}
& Q_{1}^{\prime}=2(1+X 1), \text { if }(1+X 1)<0 ; Q_{1}^{\prime}=0 \text { otherwise. } \\
& Q_{2}^{\prime}=2(1-X 1), \text { if }(1-X 1)<0 ; Q_{2}^{\prime}=0 \text { otherwise. }
\end{aligned}
\]

At least one of the \(Q_{i}^{\prime}\) will be zero; the other will be negative or zero. Hence, \(\operatorname{MIN}\left(Q_{1}^{\prime}, Q_{2}^{\prime}\right)=Q_{1}^{\prime}+Q_{2}^{\prime}=[(1+X 1)-|1+X 1|]+[(1-X 1)-|1-X 1|]\). Therefore, \(\operatorname{MIN}\left(Q_{1}^{\prime}, Q_{2}^{\prime}\right)=(1-|X 1|)-|1-| X 1||\). So, \(t=\left|\left(m_{1}-\left|m_{1}\right|\right) / 2 d\right|\) and \(s=\left|\left(m_{2}-\left|m_{2}\right|\right) / 2 d\right|\), where \(m_{i}=1-\left|X_{i}\right|\) and \(d=X 1-X 2\). Note that only one reciprocal is required per processor.
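The folding of the two plane tests into a single expression in \(|X|\) can be checked numerically, and the per-dimension parameters follow directly. In this sketch the function names are hypothetical, and the guard for \(X 1=X 2\) is an addition not discussed in the text (it avoids a zero denominator for a segment with no extent in this dimension).

```python
def min_q_prime(x):
    """MIN(Q1', Q2') computed from the two plane distances separately."""
    q1, q2 = 1 + x, 1 - x
    return min(q1 - abs(q1), q2 - abs(q2))


def folded(x):
    """The same quantity from the single value m = 1 - |x|."""
    m = 1 - abs(x)
    return m - abs(m)


def end_parameters(x1, x2):
    """Return (t, s) for clipping one dimension against -1 < X < 1."""
    d = x1 - x2
    if d == 0:
        return 0.0, 0.0          # added guard: no extent in this dimension
    m1, m2 = 1 - abs(x1), 1 - abs(x2)
    t = abs((m1 - abs(m1)) / (2 * d))
    s = abs((m2 - abs(m2)) / (2 * d))
    return t, s
```

For X1 = -3 and X2 = 1, the P1 end must move halfway along the segment to reach X = -1, and `end_parameters` returns t = 0.5 with s = 0.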

A three-processor parallel system would have each processor work on one dimension, supplying its pair of max parameters to a "second stage." The second stage would receive ( \(t_{x}, s_{x}\) ), ( \(t_{y}, s_{y}\) ), ( \(t_{z}, s_{z}\) ) from the above system, compute max \((t)=T\) and \(\max (\mathrm{s})=\mathrm{S}\), and then clip the line as before:
\[
\begin{aligned}
& X 1^{\prime}=X 1+(X 2-X 1) T, \\
& X 2^{\prime}=X 2-(X 2-X 1) S .
\end{aligned}
\]
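The second stage can be sketched as a reduction followed by the endpoint update. The function name `second_stage` and the list-of-pairs input format are hypothetical; the sketch applies the same T and S to all three coordinates, which is assumed from the phrase "as before".

```python
def second_stage(p1, p2, params):
    """params holds (t, s) for each dimension: [(tx, sx), (ty, sy), (tz, sz)]."""
    big_t = max(t for t, _ in params)    # T = max(t)
    big_s = max(s for _, s in params)    # S = max(s)
    new_p1 = tuple(a + (b - a) * big_t for a, b in zip(p1, p2))
    new_p2 = tuple(b - (b - a) * big_s for a, b in zip(p1, p2))
    return new_p1, new_p2
```

Taking the maximum picks the clipping plane that cuts deepest into the segment from each end, so the result lies inside all six planes whenever any part of the segment is visible.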

The data flow and program listing for the program run by a processor working on the X dimension are given in Tables 91 and 92.

Table 91. Data Flow for the \(X\) Processor
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & Y & N & Y & Y & N & Y & N & Y & Y & Y & Y & Y & Y & Y \\
\hline RA & X1 & 1 & 1 & R0 & & & & & & d & & & & \\
\hline RB & X2 & X1 & X2 & d & & & & & & & & 0.5 & & \\
\hline S & & & d & m1 & m2 & R0 & n1 & T0 & n2 & & & R1 & & T1 \\
\hline P & & & & & & d \(\times\) R0 & & R0 & & R1 & & d \(\times\) R1 & & 0.5 R1 \\
\hline C & & & & & m1 & m2 & m2 & & & & & & & \\
\hline Y & & & d & & & & n1 & & n2 & & & & & \\
\hline STATUS & & & & & & & & & & & & & & \\
\hline CLK & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 \\
\hline
\end{tabular}
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline CHAIN & Y & N & N & Y & Y & & & & & & & & & \\
\hline RA & & n1 & n2 & & & & & & & & & & & \\
\hline RB & & & & & & & & & & & & & & \\
\hline S & & & & & & & & & & & & & & \\
\hline P & & 1/D & & t & s & & & & & & & & & \\
\hline C & & & 1/D & & & & & & & & & & & \\
\hline Y & & & & t & s & & & & & & & & & \\
\hline STATUS & & & & & & & & & & & & & & \\
\hline CLK & 15 & 16 & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24 & 25 & 26 & 27 & 28 \\
\hline
\end{tabular}

NOTE: \(d=X 1-X 2 ; n_{i}=m_{i}-\left|m_{i}\right|\)

Table 92. Program Listing for the X Processor
\begin{tabular}{|c|c|c|c|c|}
\hline \multicolumn{3}{|r|}{REGISTER TRANSFERS} & ALU OPERATION & MULTIPLIER OPERATION \\
\hline 1. & LOAD RA, RB & \(Y \leftarrow S\) & ADD (RA, - RB) & \\
\hline 2. & LOAD RA, RB & & ADD (RA, -RB) & \\
\hline 3. & LOAD RA, RB & & ADD (RA, - RB) & \\
\hline 4. & LOAD RA, RB & & ADD (RA, 0) & MULT (RA,RB) \\
\hline 5. & & \(Y \leftarrow S\) & ADD ( \(\mathrm{C},-|\mathrm{C}|\) ) & \\
\hline 6. & & & ADD (2,-P) & MULT (S,1) \\
\hline 7. & & \(Y \leftarrow S\) & ADD ( \(\mathrm{C},-|\mathrm{C}|\) ) & \\
\hline 8. & & & & MULT (S,P) \\
\hline 9. & & & & \\
\hline 10. & LOAD RA & & ADD ( \(\mathrm{P}, 0\) ) & MULT (RA,P) \\
\hline 11. & & & & \\
\hline 12. & LOAD RB & & ADD ( \(2,-\mathrm{P}\) ) & MULT (S,RB) \\
\hline 13. & & & & \\
\hline 14. & & & & MULT (S,P) \\
\hline 15. & & & & \\
\hline 16. & LOAD RA & \(Y \leftarrow P\) & & MULT (RA, P) \\
\hline 17. & LOAD RA & \(Y \leftarrow P\) & & MULT (RA,P) \\
\hline 18. & & & & \\
\hline 19. & & & & \\
\hline
\end{tabular}

The three-processor parallel clipping system operates on a fixed loop of 17 instructions and can therefore clip 0.88 million line segments per second at 15 MHz and 1.76 million line segments per second at 30 MHz. The second stage could not keep up with this rate unless it is implemented as several processors. A single processor can form the two max values in 23 cycles (a loop of 21 cycles), while two processors would take only 12 cycles (a loop of 10). The final clipping of the two endpoints takes about 11 cycles (a loop of 9 cycles).

To summarize, the fastest clipping system operates in the normalized screen coordinate system. It has six processors arranged in three stages: a three-processor parallel first stage with each processor working on one dimension; a two-processor second stage to form the two max values; and a single-processor third stage to clip the endpoints. The combined speed would be equal to that of the first stage, as previously described. A slightly slower four-processor system would use one processor for computing the two max values in the second stage.
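The throughput figures quoted for the clipping pipeline follow from dividing the clock rate by the loop length of the slowest stage. The short Python sketch below reproduces that arithmetic. The function names and the list representation of the stages are illustrative assumptions; the loop lengths (17, 21, 10, and 9 cycles) are the ones stated in the text.

```python
def stage_rate(clock_hz, loop_cycles):
    """Line segments per second for a stage that starts a new segment
    every loop_cycles clock cycles."""
    return clock_hz / loop_cycles


def pipeline_rate(clock_hz, stage_loops):
    """A multi-stage pipeline can run no faster than its slowest stage."""
    return min(stage_rate(clock_hz, loops) for loops in stage_loops)


# Loop lengths in cycles: first stage, max-forming second stage, endpoint clip.
TWO_PROCESSOR_SECOND_STAGE = [17, 10, 9]
ONE_PROCESSOR_SECOND_STAGE = [17, 21, 9]

if __name__ == "__main__":
    for clock_hz in (15e6, 30e6):
        fast = pipeline_rate(clock_hz, TWO_PROCESSOR_SECOND_STAGE) / 1e6
        slow = pipeline_rate(clock_hz, ONE_PROCESSOR_SECOND_STAGE) / 1e6
        print(f"{clock_hz / 1e6:.0f} MHz: {fast:.2f} M lines/s "
              f"(two-processor second stage), {slow:.2f} M lines/s "
              f"(one-processor second stage)")
```

With a two-processor second stage the 17-cycle first stage is the bottleneck, which gives the 0.88 M and 1.76 M lines/s figures. With a single second-stage processor the 21-cycle loop becomes the bottleneck, which gives about 0.71 M lines/s at 15 MHz, in line with the slower screen-clipping entry in Table 93.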

\section*{Summary of Graphics Systems Performance}

The previous section considered several approaches to the design of computer graphics systems based on the 'ACT8837 and the 'ACT8847. Table 93 summarizes the results. Table 94 shows the options available in combining the sub-systems listed in Table 93 into a design for a graphics system.

Table 93. Summary of Graphics Systems Performance
\begin{tabular}{|ll|c|c|}
\hline \multicolumn{2}{|c|}{ SUB-SYSTEM } & SPEED AT 15 MHz & SPEED AT 30 MHz \\
\hline a Transform, \(4 \times 4\) matrix, & 1 ACT88X7 cycle & 0.94 M points \(/ \mathrm{s}\) & 1.875 M points \(/ \mathrm{s}\) \\
\hline b Transform, \(3 \times 3\) matrix, & 1 ACT88X7 cycle & 1.25 M points \(/ \mathrm{s}\) & 2.5 M points \(/ \mathrm{s}\) \\
\hline c Eye clipping pipe, & 6 ACT88X7 cycles & 0.577 M lines \(/ \mathrm{s}\) & 1.2 M lines \(/ \mathrm{s}\) \\
\hline d Eye clipping & 3 ACT88X7 cycles & 0.288 M lines \(/ \mathrm{s}\) & 0.576 M lines \(/ \mathrm{s}\) \\
\hline e Eye Accept/Reject test & 1 ACT88X7 cycle & 1.071 M lines \(/ \mathrm{s}\) & 2.143 M lines \(/ \mathrm{s}\) \\
\hline f Screen clipping & 5 ACT88X7 cycles & 0.88 M lines \(/ \mathrm{s}\) & 1.76 M lines \(/ \mathrm{s}\) \\
\hline g Screen clipping & 4 ACT88X7 cycles & 0.71 M lines \(/ \mathrm{s}\) & 1.42 M lines \(/ \mathrm{s}\) \\
\hline
\end{tabular}

Table 94. Available Options for Graphics System Designs
\begin{tabular}{|ll|r|c|}
\hline \multicolumn{2}{|c|}{ SYSTEM } & SPEED AT 15 MHz & SPEED AT 30 MHz \\
\hline I (a or b\()+\mathrm{c}\), & 7 ACT88X7 cycles & 0.577 M lines \(/ \mathrm{s}\) & 1.2 M lines \(/ \mathrm{s}\) \\
\hline II \((\mathrm{a}\) or b\()+\mathrm{d}\), & 2 ACT88X7 cycles & 0.288 M lines \(/ \mathrm{s}\) & 0.576 M lines \(/ \mathrm{s}\) \\
\hline III \((\mathrm{a}\) or b\()+\mathrm{f}\), & 6 ACT88X7 cycles & 0.88 M lines \(/ \mathrm{s}\) & 1.76 M lines \(/ \mathrm{s}\) \\
\hline IV \(2 \times(\mathrm{a}\) or b\()+\mathrm{c}+\mathrm{g}\), & 7 ACT88X7 cycles & 2.5 M lines \(/ \mathrm{s}\) & 3.75 M lines \(/ \mathrm{s}\) \\
\hline
\end{tabular}

In the fourth system, it is assumed that two processors are used for the transform of endpoints so as to balance the high clipping rate. It is also assumed that the accept/reject stage will eliminate more than \(60 \%\) of the line segments so that the clipping system can keep up with the transform processors.

\section*{Support}

\section*{Design Support for TI's SN74ACT8800 Family}

TI's '8800 32-bit processor family is supported by a variety of tools developed to aid in design evaluation and verification. These tools will streamline all stages of the design process, from assessing the operation and performance of an individual device to evaluating a total system application. The tools include functional models, behavioral models, microcode development software, as well as the expertise of TI's VLSI Logic applications group.

\section*{Functional Evaluation Models Aid in Device Evaluation}

Many design decisions can easily be made and evaluated before hardware or board prototypes are needed, using functional evaluation software models. The result is shortened design cycles and lower design costs.

Texas Instruments offers functional evaluation models for many of the devices in the '8800 family. These models are written in Microsoft\({ }^{\circledR}\) C and can be used in standalone mode or as callable functions.

These models are designed to provide insight into the operation of the devices by allowing the designer to write microcode and run it through the model. This allows the designer to select the device that best executes a specific application and provides a head start in evaluating programming performance.

The models correctly represent device timing in clock cycles, measured from the input of control and data to the output of results and status. Hence, initial performance estimates for a particular design can be made by relating the number of clock cycles required for an operation to the typical ac timing data for the device.
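As a rough illustration of this kind of estimate, the Python sketch below converts a cycle count from a model run into elapsed time at a given clock rate. The function name and the example numbers are assumptions made for illustration; a real estimate would use the clock period supported by the ac timing data of the device and mode in question.

```python
def estimate_time_ns(cycles, clock_hz):
    """Elapsed time in nanoseconds for an operation that takes
    `cycles` clock cycles at a clock frequency of `clock_hz`."""
    return cycles * 1e9 / clock_hz


if __name__ == "__main__":
    # Example only: a 17-cycle loop at the two clock rates used in this manual.
    for clock_hz in (15e6, 30e6):
        print(f"{clock_hz / 1e6:.0f} MHz: {estimate_time_ns(17, clock_hz):.1f} ns")
```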

\section*{Behavioral Simulation Models Simplify System Debugging}

System simulation with behavioral models can further shorten design time and ease design effort. The behavioral simulation models that support TI's '8800 chip set have the timing-control and error-handling capability to perform thorough PCB and system simulation. These models decrease the time spent in debugging and reduce the number of required prototype runs.

Users of system simulation models report a reduction by more than half in the number of prototype runs typically required to produce the highest-quality system. These time savings reduce costs and get the product to market as much as several months earlier than could be done using traditional methods.

Behavioral models for TI's '8800 family are written at the functional behavioral level and, therefore, are faster and easier to use and take up less disk space than some other types of simulation models. This higher efficiency means a simulation run can include more IC models and yet require less CPU time than an equivalent simulation using other types of models.

These behavioral simulation models also provide explicit error messages that can help in the debugging process. For example, if a design violates a device set-up time, the model explains, via an error message, what type of violation occurred, at what point it occurred in the simulation run, and specifically which part's set-up time was violated. Then, the model continues on with the run as if no violation occurred, saving time rather than crashing the run at every error.

In other words, an expert debugger is built right into the simulation.

The models are available with commercial and military timing and interact with a variety of simulators.

\section*{Behavioral Models for TI's '8800 Family are Easily Obtained}

Texas Instruments has been working closely with both Quadtree Software Corporation and Logic Automation Incorporated to produce software behavioral simulation models of many of its VLSI devices. Since accuracy is key to solving design problems, we've provided Quadtree and Logic Automation with test patterns for most of our devices to ensure each model passes the same set of test vectors as does the actual silicon device.

Quadtree offers a library of Designer's Choice\({ }^{\text {TM }}\) full-functional behavioral models of Texas Instruments '8800 32-bit processor building block devices.

Logic Automation's Smartmodel\({ }^{\text {TM }}\) library contains many Texas Instruments products, including devices from the '8800 chip set.

These companies may be contacted directly at the addresses below. General information about behavioral model support for the '8800 family may be obtained by calling Texas Instruments at (214) 997-5402.

LOGIC AUTOMATION INCORPORATED
P.O. Box 310
Beaverton, OR 97075
(503) 690-6900

QUADTREE SOFTWARE CORPORATION
1170 Route 22 East
Bridgewater, NJ 08807
(201) 725-2272

\section*{'8800 SDB Design Kit}

TI offers an '8800 Software Development Board (SDB) Design Kit as an evaluation and training tool. The '8800 SDB kit uses a range of software development tools to allow users to evaluate performance and write microprograms for several of the '8800 building blocks. Using the SDB, microcode can be developed earlier in a system's design cycle so that code development parallels, rather than follows, prototype design.

The '8800 SDB Design Kit consists of a combination of specially developed hardware, software, and documentation including:
- The '8800 Software Development Board Assembly
- The '8800 SDB User's Guide
- Floppy disk with MS-DOS\({ }^{\text {TM }}\) software tools written in Microsoft C, several example microprograms, and demo programs. Source code is included.
- Microcode definition files for use with HILEVEL, STEP Engineering, and Texas Instruments microcode development tools.

Built on a PC/AT card occupying a single slot, the '8800 SDB contains an 'ACT8818 microsequencer, an 'ACT8832 registered ALU, and an 'ACT8847 floating point/integer processor, along with 32K by 128 bits of microcode memory and 32K by 32 bits of local data memory. A block diagram of the '8800 SDB is shown in Figure 8-1. The board operates under an MS-DOS environment.

The SDB Design Kit complements other '8800 family development tools such as functional evaluation and behavioral simulation models, and provides the next step beyond simulators. System code can be executed in a real-time environment that includes conditional branching, on-board data memory, and single-step/breakpoint facilities.

For additional technical information, contact VLSI System Engineering at (214) 997-3970. For ordering information, please call your local field sales representative.


Figure 8-1. '8800 SDB Block Diagram

\section*{Program Code Generation Using the TI Meta Assembler}

The TI Meta Assembler (TIM) provides the means to create object microcode files and to support listings for programs that execute in architectures without standard instruction sets. The end-product of TIM is an absolute object code module in suitable format for downloading to PROM programmers or to the emulator memories of development systems. TIM is fully compatible with some other assemblers as well.

\section*{Systems Expertise is a Phone Call Away}

Texas Instruments VLSI Logic applications group is available to help designers analyze TI's high-performance VLSI products, such as the '8800 32-bit processor family. The group works directly with designers to provide ready answers to device-related questions and also prepares a variety of applications documentation.

The group may be reached in Dallas, at (214) 997-3970.

\section*{Mechanical Data}
- SN74ACT8818: \(11 \times 11\) GC PACKAGE
- SN74ACT8832: \(17 \times 17\) GB PACKAGE
- SN74ACT8836: \(15 \times 15\) GB PACKAGE
- SN74ACT8837: \(17 \times 17\) GB PACKAGE
- SN74ACT8841: \(15 \times 15\) GB PACKAGE
- SN74ACT8847: \(17 \times 17\) GA PACKAGE

\section*{\(11 \times 11\) GB pin grid array ceramic package}


ALL POSSIBLE PIN LOCATIONS ARE SHOWN. SEE APPLICABLE PRODUCT DATA SHEETS FOR ACTUAL PIN LOCATIONS USED.

NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within \(0,381(0.051)\) radius relative to the center of the ceramic.

\section*{\(11 \times 11\) GC pin grid array ceramic package}


NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within 0,381 ( 0.051 ) radius relative to the center of the ceramic.

\section*{\(13 \times 13\) GB pin grid array ceramic package}


2,54 (0.100) T.P.


NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within \(0,381(0.051)\) radius relative to the center of the ceramic.

\section*{\(13 \times 13\) GC pin grid array ceramic package}


NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within 0,381 ( 0.051 ) radius relative to the center of the ceramic.

\section*{\(15 \times 15\) GB pin grid array ceramic package}


NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within 0,381 ( 0.051 ) radius relative to the center of the ceramic.

\section*{\(15 \times 15\) GC pin grid array ceramic package}



NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within 0,381 (0.051) radius relative to the center of the ceramic.

\section*{\(17 \times 17\) GA pin grid array ceramic package}


NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within 0,381 ( 0.051 ) radius relative to the center of the ceramic.

\section*{\(17 \times 17\) GB pin grid array ceramic package}


NOTE A: Pins are located within \(0,13(0.005)\) radius of true position relative to each other at maximum material condition and within 0,381 ( 0.051 ) radius relative to the center of the ceramic.

\section*{TI Sales Offices}

ALABAMA: Huntsville (205) 837-7530.
ARIZONA: Phoenix (602) 995-1007; Tucson (602) 292-2640.
CALIFORNIA: Irvine (714) 660-1200; Roseville (916) 786-9208; San Diego (619) 278-9601; Santa Clara (408) 980-9000; Torrance (213) 217-7010; Woodland Hills (818) 704-7759.
COLORADO: Aurora (303) 368-8000.
CONNECTICUT: Wallingford (203) 269-0074.
FLORIDA: Altamonte Springs (305) 260-2116; Ft. Lauderdale (305) 973-8502; Tampa (813) 885-7411.
GEORGIA: Norcross (404) 662-7900.
ILLINOIS: Arlington Heights (312) 640-2925.
INDIANA: Carmel (317) 573-6400; Ft. Wayne (219) 424-5174.
IOWA: Cedar Rapids (319) 395-9550.
KANSAS: Overland Park (913) 451-4511.
MARYLAND: Columbia (301) 964-2003.
MASSACHUSETTS: Waltham (617) 895-9100.
MICHIGAN: Farmington Hills (313) 553-1569; Grand Rapids (616) 957-4200.
MINNESOTA: Eden Prairie (612) 828-9300.
MISSOURI: St. Louis (314) 569-7600.
NEW JERSEY: Iselin (201) 750-1050.
NEW MEXICO: Albuquerque (505) 345-2555.
NEW YORK: East Syracuse (315) 463-9291; Melville (516) 454-6600; Pittsford (716) 385-6770; Poughkeepsie (914) 473-2900.
NORTH CAROLINA: Charlotte (704) 527-0933; Raleigh (919) 876-2725.
OHIO: Beachwood (216) 464-6100; Beavercreek (513) 427-6200.
OREGON: Beaverton (503) 643-6758.
PENNSYLVANIA: Blue Bell (215) 825-9500.
PUERTO RICO: Hato Rey (809) 753-8700.
TENNESSEE: Johnson City (615) 461-2192.
TEXAS: Austin (512) 250-7655; Houston (713) 778-6592; Richardson (214) 680-5082; San Antonio (512) 496-1779.
UTAH: Murray (801) 266-8972.
WASHINGTON: Redmond (206) 881-3080.
WISCONSIN: Brookfield (414) 782-2899.
CANADA: Nepean, Ontario (613) 726-1970; Richmond Hill, Ontario (416) 884-9181; St. Laurent, Quebec (514) 336-1860.

\section*{TI Regional Technology Centers}

CALIFORNIA: Irvine (714) 660-8105; Santa Clara (408) 748-2220.
GEORGIA: Norcross (404) 662-7945.
ILLINOIS: Arlington Heights (312) 640-2909.
MASSACHUSETTS: Waltham (617) 895-9196.
TEXAS: Richardson (214) 680-5066.
CANADA: Nepean, Ontario (613) 726-1970.

\section*{TI Distributors}

TI AUTHORIZED DISTRIBUTORS

Arrow/Kierulff Electronics Group
Arrow (Canada)
Future Electronics (Canada)
GRS Electronics Co., Inc.
Hall-Mark Electronics
Marshall Industries
Newark Electronics
Schweber Electronics
Time Electronics
Wyle Laboratories
Zeus Components

OBSOLETE PRODUCT ONLY: Rochester Electronics, Inc., Newburyport, Massachusetts, (508) 462-9332.

ALABAMA: Arrow/Kierulff (205) 837-6955; Hall-Mark (205) 837-8700; Marshall (205) 881-9235; Schweber (205) 895-0480.
ARIZONA: Arrow/Kierulff (602) 437-0750; Hall-Mark (602) 437-1200; Marshall (602) 496-0290; Schweber (602) 431-0030; Wyle (602) 866-2888.
CALIFORNIA: Los Angeles/Orange County: Arrow/Kierulff (818) 701-7500, (714) 838-5422; Hall-Mark (818) 773-4500, (714) 669-4100; Marshall (818) 407-0101, (818) 459-5500, (714) 458-5395; Schweber (818) 880-9686, (714) 863-0200, (213) 320-8090; Wyle (818) 880-9000, (714) 863-9953; Zeus (714) 921-9000, (818) 889-3838;
Sacramento: Hall-Mark (916) 624-9781; Marshall (916) 635-9700; Schweber (916) 364-0222; Wyle (916) 638-5282;
San Diego: Hall-Mark (619) 268-1201; Marshall (619) 578-9600; Schweber (619) 450-0454; Wyle (619) 565-9171;
San Francisco Bay Area: Arrow/Kierulff (408) 745-6600; Hall-Mark (408) 432-0900; Marshall (408) 942-4600; Schweber (408) 432-7171; Wyle (408) 727-2500; Zeus (408) 998-5121.
COLORADO: Arrow/Kierulff (303) 790-4444; Hall-Mark (303) 790-1662; Marshall (303) 451-8383; Schweber (303) 799-0258; Wyle (303) 457-9953.
CONNECTICUT: Arrow/Kierulff (203) 265-7741; Hall-Mark (203) 271-2844; Marshall (203) 265-3822; Schweber (203) 264-4700.
FLORIDA: Ft. Lauderdale: Arrow/Kierulff (305) 429-8200; Hall-Mark (305) 971-9280; Marshall (305) 977-4880; Schweber (305) 977-7511;
Orlando: Arrow/Kierulff (407) 323-0252; Hall-Mark (407) 830-5855; Marshall (407) 767-8585;
Tampa: Hall-Mark (813) 530-4543; Marshall (813) 576-1399; Schweber (813) 541-5100.
GEORGIA: Arrow/Kierulff (404) 449-8252; Hall-Mark (404) 447-8000; Marshall (404) 923-5750; Schweber (404) 449-9170.

ILLINOIS: Arrow/Kierulff (312) 250-0500; Hall-Mark (312) 860-3800; Marshall (312) 490-0155; Newark (312) 784-5100; Schweber (312) 364-3750.
INDIANA: Indianapolis: Arrow/Kierulff (317) 243-9353; Hall-Mark (317) 872-8875; Marshall (317) 297-0483; Schweber (317) 843-1050.
IOWA: Arrow/Kierulff (319) 395-7230; Schweber (319) 373-1417.
KANSAS: Kansas City: Arrow/Kierulff (913) 541-9542; Hall-Mark (913) 888-4747; Marshall (913) 492-3121; Schweber (913) 492-2922.

MARYLAND: Arrow/Kierulff (301) 995-6002; Hall-Mark (301) 988-9800; Marshall (301) 235-9464; Schweber (301) 840-5900; Zeus (301) 997-1118.
MASSACHUSETTS: Arrow/Kierulff (508) 658-0900; Hall-Mark (508) 667-0902; Marshall (508) 658-0810; Schweber (617) 275-5100; Time (617) 532-6200; Wyle (617) 273-7300; Zeus (617) 863-8800.
MICHIGAN: Detroit: Arrow/Kierulff (313) 462-2290; Hall-Mark (313) 462-1205; Marshall (313) 525-5850; Grand Rapids: Arrow/Kierulff (616) 243-0912.

MINNESOTA: Arrow/Kierulff (612) 830-1800; Hall-Mark (612) 941-2600; Marshall (612) 559-2211; Schweber (612) 941-5280.
MISSOURI: St. Louis: Arrow/Kierulff (314) 567-6888; Hall-Mark (314) 291-5350; Marshall (314) 291-4650; Schweber (314) 739-0526.
NEW HAMPSHIRE: Arrow/Kierulff (603) 668-6968; Schweber (603) 625-2250.

NEW JERSEY: Arrow/Kierulff (201) 538-0900, (609) 596-8000; GRS Electronics (609) 964-8560; Hall-Mark (201) 575-4415, (201) 882-9773, (609) 235-1900; Marshall' (201) 882-0320, (609) 234-91ग0; Schweber (201) 227-7880.

NEW MEXICO: Arrow/Kierulff (505) 243-4566.
NEW YORK: Long Island: Arrow/Kierulff (516) 231-1009; Hall-Mark (516) 737-0600; Marshall (516) 273-2424; Schweber (516) 334-7474; Zeus (914) 937-7400;
Rochester: Arrow/Kierulff (716) 427-030) 235-7620; Schweber (716) 424-2222;
Syracuse: Marshall (607) 798-1611.
NORTH CAROLINA: Arrow/Kierulff (919) 876-3132, (919) 725-8711; Hall-Mark (919) 872-0712; Marshall (919) 878-9882; Schweber (919) 876-0000.

OHIO: Cleveland: Arrow/Kierulff (216) 248-3990; Hall-Mark (216) 349-4632; Marshall (216) 248-1788; Schweber (216) 464-2970;
Columbus: Hall-Mark (614) 888-3313;
Marshall (513) 898-4480; Schweber (513) 439-1800.
OKLAHOMA: Arrow/Kierulff (918) 252-7537;
Schweber (918) 622-8003.
OREGON: Arrow/Kierulff (503) 645-6456; Marshall (503) 644-5050; Wyle (503) 640-6000.
PENNSYLVANIA: Arrow/Kierulff (412) 856-7000, (215) 928-1800; GRS Elect (412) 963 -6804.

TEXAS: Austin: Arrow/Kierulff (512) 835-4180; Hall-Mark (512) 258-8848; Marshall (512) 837-1991; Schweber (512) 339-0088; Wyle (512) 834-9957;
Dallas: Arrow/Kierulff (214) 380-6464; Hall-Mark (214) 553-4300; Marshall (214) 233-5200; Schweber (214) 661-5010; Wyle (214) 235-9953; Zeus (214) 783-7010;
El Paso: Marshall (915) 593-0706;
Houston: Arrow/Kierulff (713) 530-4700; Hall-Mark (713) 781-6100; Marshall (713) 895-9200; Schweber (713) 784-3600; Wyle (713) 879-9953.
UTAH: Arrow/Kierulff (801) 973-6913; Hall-Mark (801) 972-1008; Marshall (801) 485-1551; Wyle (801) 974-9953.
WASHINGTON: Arrow/Kierulff (206) 575-4420; Marshall (206) 486-5747; Wyle (206) 881-1150.
WISCONSIN: Arrow/Kierulff (414) 792-0150; Hall-Mark (414) 797-7844; Marshall (414) 797-8400; Schweber (414) 784-9020.
CANADA: Calgary: Future (403) 235-5325;
Edmonton: Future (403) 438-2858;
Montreal: Arrow Canada (514) 735-5511; Future (514) 694-7710;
Ottawa: Arrow Canada (613) 226-6903; Future (613) 820-8313;
Quebec City: Arrow Canada (418) 871-7500;
Toronto: Arrow Canada (416) 672-7769; Future (416) 638-4771; Marshall (416) 674-2161;
Vancouver: Arrow Canada (604) 291-2986; Future (604) 294-1166.

\section*{Customer Response Center}

TOLL FREE: (800) 232-3200
OUTSIDE USA: (214) 995-6611 (8:00 a.m. - 5:00 p.m. CST)

\section*{TI Worldwide Sales Offices}

ALABAMA: Huntsville: 500 Wynn Drive, Suite 514, Huntsville, AL 35805, (205) 837-7530.
ARIZONA: Phoenix: 8825 N. 23rd Ave., Phoenix, AZ 85021, (602) 995-1007; Tucson: 818 W. Miracle Mile, Suite 43, Tucson, AZ 85705, (602) 292-2640.
CALIFORNIA: Irvine: 17891 Cartwright Dr., Irvine, CA 92714, (714) 660-1200; Roseville: 1 Sierra Gate; San Diego: 4333 View Ridge Ave., Suite 100, San Diego, CA 92123, (619) 278-9601; Santa Clara: 5353 Betsy Ross Dr., Santa Clara, CA 95054, (408) 980-9000; Torrance: 690 Knox St., Torrance, CA 90502, (213) 217-7010; Woodland Hills: 21220 Erwin St., Woodland Hills, CA 91367, (818) 704-7759.
COLORADO: Aurora: 1400 S. Potomac Ave., Suite 101, Aurora, CO 80012, (303) 368-8000.
CONNECTICUT: Wallingford: 9 Barnes Industrial Park Rd., Barnes Industrial Park, Wallingford, CT 06492, (203) 269-0074.

FLORIDA: Altamonte Springs: 370 S. North Lake Blvd., Altamonte Springs, FL 32701, (305) 260-2116; Ft. Lauderdale: 2950 N.W. 62nd St., Ft. Lauderdale, FL 33309, (305) 973-8502; Tampa: 4803 George Rd., Suite 390, Tampa, FL 33634, (813) 885-7411.
GEORGIA: Norcross: 5515 Spalding Drive, Norcross, GA 30092, (404) 662-7900.
ILLINOIS: Arlington Heights: 515 W. Algonquin, Arlington Heights, IL 60005, (312) 640-2925.

INDIANA: Ft. Wayne: 2020 Inwood Dr., Ft. Wayne, IN 46815, (219) 424-5174; Carmel: 550 Congressional Dr., Carmel, IN 46032, (317) 573-6400.

IOWA: Cedar Rapids: 373 Collins Rd. NE, Suite 201, Cedar Rapids, IA 52402, (319) 395-9550.
KANSAS: Overland Park: 7300 College Blvd., Lighton Plaza, Overland Park, KS 66210, (913) 451-4511.
MARYLAND: Columbia: 8815 Centre Park Dr., Columbia, MD 21045, (301) 964-2003.
MASSACHUSETTS: Waltham: 950 Winter St., Waltham, MA 02154, (617) 895-9100.
MICHIGAN: Farmington Hills: 33737 W. 12 Mile Rd., Farmington Hills, MI 48018, (313) 553-1569; Grand Rapids: 3075 Orchard Vista Dr. S.E., Grand Rapids, MI 49506, (616) 957-4200.
MINNESOTA: Eden Prairie: 11000 W. 78th, Eden Prairie, MN 55344, (612) 828-9300.
MISSOURI: St. Louis: 11816 Borman Drive, St. Louis, MO 63146, (314) 569-7600.
NEW JERSEY: Iselin: 485E U.S. Route 1 South, Parkway Towers, Iselin, NJ 08830, (201) 750-1050.
NEW MEXICO: Albuquerque: 2820-D Broadbent Pkwy NE, Albuquerque, NM 87107, (505) 345-2555.
NEW YORK: East Syracuse: 6365 Collamer Dr., East Syracuse, NY 13057, (315) 463-9291; Melville: 1895 Walt Whitman Rd., P.O. Box 2936, Melville, NY 11747, (516) 454-6600; Pittsford: 2851 Clover St., Pittsford, NY 14534, (716) 385-6770; Poughkeepsie: 385 South Rd., Poughkeepsie, NY, (914) 473-2900.

NORTH CAROLINA: Charlotte: 8 Woodlawn Green, Woodlawn Rd., Charlotte, NC 28210, (704) 527-0933; Raleigh: 2809 Highwoods Blvd., Suite 100, Raleigh, NC 27625, (919) 876-2725.
OHIO: Beachwood: 23775 Commerce Park Rd., Beachwood, OH 44122, (216) 464-6100; Beavercreek: 4200 Colonel Glenn Hwy., Beavercreek, OH 45431, (513) 427-6200.

OREGON: Beaverton: 6700 SW 105th St., Suite 110, Beaverton, OR 97005, (503) 643-6758.
PENNSYLVANIA: Blue Bell: 670 Sentry Pkwy, Blue Bell, PA 19422, (215) 825-9500.
PUERTO RICO: Hato Rey: Mercantil Plaza Bldg., Suite 505, Hato Rey, PR 00918, (809) 753-8700.
TENNESSEE: Johnson City: Erwin Hwy., P.O. Drawer 1255, Johnson City, TN 37605, (615) 461-2192.

TEXAS: Austin: 12501 Research Blvd., Austin, TX 78759, (512) 250-7655; Richardson: 1001 E. Campbell Rd., Richardson, TX 75081, (214) 680-5082; Houston: 9100 Southwest Frwy., Suite 250, Houston, TX 77074, (713) 778-6592; San Antonio: 1000 Central Parkway South, San Antonio, TX 78232, (512) 496-1779.
UTAH: Murray: 5201 South Green St., Suite 200, Murray, UT 84123, (801) 266-8972.
WASHINGTON: Redmond: 5010 148th NE, Bldg. B, Suite 107, Redmond, WA 98052, (206) 881-3080.
WISCONSIN: Brookfield: 450 N. Sunny Slope, Suite 150, Brookfield, WI 53005, (414) 782-2899.
CANADA: Nepean: 301 Moodie Drive, Mallorn Center, Nepean, Ontario, Canada K2H9C4, (613) 726-1970; Richmond Hill: 280 Centre St. E., (416) 884-9181; St. Laurent: Ville St. Laurent, Quebec, 9460 Trans Canada Hwy., St. Laurent, Quebec, Canada H4S1R7, (514) 336-1860.

ARGENTINA: Texas Instruments Argentina Viamonte 1119, 1053 Capital Federal, Buenos Aires, Argentina, 541/748-3699

AUSTRALIA (\& NEW ZEALAND): Texas Instruments Australia Ltd.: 6-10 Talavera Rd., North Ryde (Sydney), New South Wales, Australia 2113 \(2+887-1122 ; 5\) th Floor, 418 St. Kilda Road Melbourne, Victoria, Australia 3004, 3+267-4677; Melt Philip, Highway, Elizabeth, South Australia 5112 ,
171 , \(255-2066\).

AUSTRIA: Texas Instruments Ges.m.b.H., Industriestraße B/16, A-2345 Brunn/Gebirge, 2236-846210.
BELGIUM: Texas Instruments N.V. Belgium S.A.: 11, Avenue Jules Bordetlaan 11, 1140 Brussels, Belgium, (02) 242-3080.
BRAZIL: Texas Instruments Electronicos do Brasil Ltda.: Rua Paes Leme, 524-7 Andar Pinheiros, 05424 Sao Paulo, Brazil, 0815-6166.

DENMARK: Texas Instruments A/S, Mairelundvej 46E, 2730 Herlev, Denmark, 2-917400.
FINLAND: Texas Instruments Finland OY:
Ahertajantie 3, P.O. Box 81, ESP00. Finland, 190 0-461-422.

FRANCE: Texas Instruments France: Paris Office, BP 678 -10 Avenue Morane-Saulnier, 78141 VelizyVillacoublay cedex (1) 30701003

GERMANY (Fed. Republic of Germany): Texas Instruments Deutschland GmbH : Haggertystrasse 1, 8050 Freising, \(8161+80-4591\); Kurfuerstendamm 195/196, 1000 Berlin 15, 30+882-7365; III, Hagen 43/Kibbelstrasse, 19, 4300 Essen, 201-24250: Kirchhorsterstrasse 2, 3000 Hannover 51 \(511+648021\); Maybachstrabe 11, 7302 Ostfildern
2-Nelingen, \(711+34030\).

HONG KONG: Texas Instruments Hong Kong Ltd., 8th Floor, World Shipping Ctr., 7 Canton Rd., Kowloon, Hong Kong, (852) 3-7351223.
IRELAND: Texas Instruments (Ireland) Limited: 7/8 Harcourt Street, Stillorgan, County Dublin, Eire 16
ITALY: Texas Instruments Italia S.p.A. Divisione Semiconduttori: Viale Europa, 40, 20093 Cologne Magliana, 38, 00148 Roma, (06) 5222651. Via Amendola, 17, 40100 Bologna, (051) 554004.
JAPAN: Tokyo Marketing/Sales (Headquarters): Texas Instruments Japan Ltd., MS Shibaura Bldg., 9 F 03.769-8700. Texas Instruments Japan Ltd. Nissh wai Bldg. 5F, 30 Imabashi 3 -chome, Higashi-ku Osaka 541, Japan, 06-294-1881: Daini Toyota W Bldg. 7F, 10-27 Meieki 4-chome, Nakamura-ku, Nagoya 450, 052-583-8691; Daiichi Seimei Bldg. 6F 3-10 Oyama-cho, Kanazawa 920, Ishikawa-ken, 0762-23-5471; Daiichi Olympic Tachikawa Bldg. 6F, 1-25-12 Akebono-cho, Tachikawa 190, Tokyo, 0425-27-6426; Matsumoto Showa Bldg. 6F, 2-11 O263-33-1060: Yokohama Nishiguchi KN BIdg. 6 F 2-8-4 Kita-Saiwai-cho, Nishi-ku, Yokohama 220 045-322-6741; Nihon Seimei Kyoto Yasaka Bldg. 5F, 843-2 Higashi Shiokohiidori, Nishinotoh-in Higashi-iru, Shiokouji, Shimogyo-ku, Kyoto 600, 075-341-7713; 2597.1. Aza Harudai, Oaza Yasaka, Kitsuki 873, Oita ken, 09786-3-3211; Miho Plant, 2350 Kihara Miho mura, Inashiki-gun 300-04, Ibaragi-ken.
0298-85-2541
KOREA: Texas Instruments Korea Ltd., 28th Fl., Trade Tower, \#159, Samsung-Dong, Kangnam-ku, Seoul, Korea \(2+551\) - 2810 .
MEXICO: Texas Instruments de Mexico S.A.: Alfonso Reves-i15 Col. Hipodromo Condesa, Mexico, D.F. Mexico 06120, 525/525-3860.

MIDDLE EAST: Texas Instruments: No. 13, 1 st Floor Mannai Bidg., Diplomatic Area, P.O. Box 26335 , Manama Bahrain, Arabian Gulf, \(973+274681\).
NETHERLANDS: Texas Instruments Holland B.V. 19 Hogehilweg, 1100 AZ Amsterdam-Zuidoost, Holiand \(20+5602911\).
NORWAY: Texas Instruments Norway A/S: PB106, Refstad 0585, Oslo 5, Norway, (2) 155090

PEOPLES REPUBLIC OF CHINA: Texas Instruments China Inc., Beijing Representative Office, 7-05 Citic Bldg., 19 Jianguomenwai Dajie, Beijing, China, (861) 5002255, Ext. 3750.

PHILIPPINES: Texas Instruments Asia Ltd.: 14th Floor Ba- Lepanto Bldg., Paseo de Roxas, Makati, Metro Manila, Philippines, 817-60-31
PORTUGAL: Texas Instruments Equipamento Electronico (Portugal), Lda.: Rua Eng. Frederico Ulrich 2650 Moreira Da Maia, 4470 Maia, Portugal, 2-948-1003.
SINGAPORE (+ INDIA, INDONESIA, MALAYSIA, THAILAND): Texas Instruments Singapore (PTE) Ltd., Asia Pacific Division, 101 Thompson Rd. "23-0 United Square, Singapore 1130, 350-8100.
SPAIN: Texas Instruments Espana, S.A.: C/Jose Lazaro Galdiano No. 6, Madrid 28036, 1/458.14.58.
SWEDEN: Texas Instruments International Trade Corporation (Sverigefilialen): S-164-93, Stockholm, Sweden, 8-752-5800.
SWITZERLAND: Texas Instruments, Inc., Reidstrasse 6, CH-8953 Dietikon (Zuerich) Switzerland,

TAIWAN: Texas Instruments Supply Co., 9th Floor Bank Tower, 205 Tun Hwa N. Rd., Taipei, Taiwan, Republic of China, 2+713-9311.
UNITED KINGDOM: Texas Instruments Limited: Manton Lane, Bedford, MK41 7PA, England, 0234 270111.

SN74ACT8800 Family
32-Bit CMOS Processor
Building Blocks

\section*{ERRATA}

Copyright © 1989, Texas Instruments Incorporated Printed in U.S.A.

\title{
ERRATA \\ TO THE SN74ACT8800 FAMILY DATA MANUAL (SCSS006B) \\ JUNE 1989 REVISIONS
}

These errata pages contain corrections to the following specifications:
1. Switching Characteristics, pg. 7-37
2. Setup and Hold Times, pg. 7-38
3. CLK/RESET Requirements, pg. 7-38
4. Switching Characteristics, pg. 7-39
5. Switching Characteristics, pg. 7-41

If you should have any further questions or concerns, contact your nearest TI field sales office, local authorized TI distributor, or the TI Customer Response Center at 1-800-232-3200.
- Page 7-37 - Replace the switching characteristics with the following:

\section*{switching characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multirow[t]{2}{*}{PARAMETER} & \multirow[t]{2}{*}{FROM (INPUT)} & \multirow[t]{2}{*}{TO (OUTPUT)} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|c|}{SN74ACT8847-30} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & & MIN & MAX & \\
\hline 1 & \(t_{\mathrm{pd1}}\) & DA/DB/Inst & Y OUTPUT & 111 & & \(\dagger\) & ns \\
\hline \multirow[b]{2}{*}{2} & \multirow[b]{2}{*}{\(t_{\mathrm{pd2}}\)} & INPUT REG & Y OUTPUT & 110 & & 70 & \multirow[b]{2}{*}{ns} \\
\hline & & INPUT REG & STATUS & 110 & & 70 & \\
\hline \multirow[b]{2}{*}{3} & \multirow[b]{2}{*}{\(t_{\mathrm{pd3}}\)} & PIPELN REG & Y OUTPUT & 10X & & 54 & \multirow[b]{2}{*}{ns} \\
\hline & & PIPELN REG & STATUS & 10X & & 54 & \\
\hline \multirow[b]{2}{*}{4} & \multirow[b]{2}{*}{\(t_{\mathrm{pd4}}\)} & OUTPUT REG & Y OUTPUT & 0XX & & 20 & \multirow[b]{2}{*}{ns} \\
\hline & & OUTPUT REG & STATUS & 0XX & & 20 & \\
\hline 5 & \(t_{\mathrm{pd5}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT & XXX & & 18 & ns \\
\hline 6 & \(t_{\mathrm{pd6}}\) & CLK \(\uparrow\) & Y OUTPUT INVALID & all but 111 & 3.0 & & ns \\
\hline 7 & \(t_{\mathrm{pd7}}\) & CLK \(\uparrow\) & STATUS INVALID & all but 111 & 3.0 & & ns \\
\hline 8 & \(t_{\mathrm{pd8}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT INVALID & XXX & 1.5 & & ns \\
\hline \multirow{3}{*}{9} & \multirow{3}{*}{\(t_{\mathrm{d1}}\)} & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/o feedback & 56 & & \multirow{6}{*}{ns} \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/feedback \({ }^{\ddagger}\) & 56 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/FLOWC \({ }^{\S}\) & 66 & & \\
\hline \multirow{3}{*}{10} & \multirow{3}{*}{\(t_{\mathrm{d2}}\)} & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/o feedback & 30 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/feedback \({ }^{\ddagger}\) & 30 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/FLOWC \({ }^{\S}\) & 36 & & \\
\hline 11 & \(t_{\mathrm{d3}}\) & \multicolumn{3}{|l|}{Delay time, CLKC after CLK to insure data captured in C register is data clocked into sum or product register by that clock (PIPES2-PIPES0 = 0XX)} & 12 & \(t_{\mathrm{d}} - 0\,{ }^{\P}\) & ns \\
\hline 12 & \(t_{\mathrm{en1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 15 & \multirow{4}{*}{ns} \\
\hline 13 & \(t_{\mathrm{en2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 15 & \\
\hline 14 & \(t_{\mathrm{dis1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 15 & \\
\hline 15 & \(t_{\mathrm{dis2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 15 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) This parameter no longer tested and will be deleted on next Data Manual revision.
\({ }^{\ddagger}\) Applies to all feedback cases except where operands are fed back using FLOWC to bypass C register. (Please see Figure 13 for feedback paths).
\({ }^{\S}\) Operands are fed back using FLOWC to bypass the C register.
\({ }^{\P}\) \(t_{\mathrm{d}}\) is the clock cycle period.
- Page 7-38 - Replace the setup and hold times with the following:

\section*{setup and hold times}
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multicolumn{2}{|c|}{\multirow[t]{2}{*}{PARAMETER}} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|c|}{SN74ACT8847-30} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & MIN & MAX & \\
\hline 16 & \(t_{\mathrm{su1}}\) & Inst/control before CLK \(\uparrow\) & XX0 & 12 & & \multirow{6}{*}{ns} \\
\hline 17 & \(t_{\mathrm{su2}}\) & DA/DB before CLK \(\uparrow\) & XX0 & 11 & & \\
\hline 18 & \(t_{\mathrm{su3}}\) & DA/DB before 2nd CLK \(\uparrow\) (DP) & XX1 & 40 & & \\
\hline 19 & \(t_{\mathrm{su4}}\) & CONFIG1-0 before CLK \(\uparrow\) & XX0 & 12 & & \\
\hline 20 & \(t_{\mathrm{su5}}\) & SRCC before CLKC \(\uparrow\) & XXX & 12 & & \\
\hline 21 & \(t_{\mathrm{su6}}\) & \(\overline{\text { RESET }}\) before CLK \(\uparrow\) & XX0 & 12 & & \\
\hline 22 & \(t_{\mathrm{h1}}\) & Inst/control after CLK \(\uparrow\) & XXX & 3 & & \multirow{4}{*}{ns} \\
\hline 23 & \(t_{\mathrm{h2}}\) & DA/DB after CLK \(\uparrow\) & XXX & 4 & & \\
\hline 24 & \(t_{\mathrm{h3}}\) & SRCC after CLKC \(\uparrow\) & XXX & 1 & & \\
\hline 25 & \(t_{\mathrm{h4}}\) & \(\overline{\text { RESET }}\) after CLK \(\uparrow\) & XX0 & 6 & & \\
\hline
\end{tabular}
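
As an illustration only (not part of the TI errata), the revised SN74ACT8847-30 setup and hold minimums can be held in a small lookup and compared against measured or simulated values. In this Python sketch the names `SETUP_HOLD_MIN_NS_30` and `violations`, the parameter labels used as strings, and the sample measurements are all hypothetical; only the minimum values come from the table above.

```python
# Minimum setup/hold times in ns for the SN74ACT8847-30, copied from the
# revised setup and hold table. Keys are the entries in the NO. column.
SETUP_HOLD_MIN_NS_30 = {
    16: ("tsu1", 12), 17: ("tsu2", 11), 18: ("tsu3", 40),
    19: ("tsu4", 12), 20: ("tsu5", 12), 21: ("tsu6", 12),
    22: ("th1", 3), 23: ("th2", 4), 24: ("th3", 1), 25: ("th4", 6),
}


def violations(measured_ns):
    """Return {parameter label: shortfall in ns} for values below the minimum.

    measured_ns maps a table NO. (16-25) to a measured time in ns.
    """
    shortfalls = {}
    for number, value in measured_ns.items():
        label, minimum = SETUP_HOLD_MIN_NS_30[number]
        if value < minimum:
            shortfalls[label] = minimum - value
    return shortfalls


# Hypothetical measurements: only NO. 17 (DA/DB setup) misses its 11 ns minimum.
sample = {16: 12.5, 17: 10.0, 23: 4.0}
print(violations(sample))
```

A value exactly equal to the tabulated minimum is treated as passing, since the table lists these figures in the MIN column.
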
- Page 7-38 - Replace the CLK/RESET requirements with the following:

\section*{CLK/RESET requirements}
\begin{tabular}{|c|c|c|c|c|c|}
\hline \multicolumn{3}{|c|}{\multirow[b]{2}{*}{PARAMETER}} & \multicolumn{2}{|c|}{SN74ACT8847-30} & \multirow[b]{2}{*}{UNIT} \\
\hline \multicolumn{3}{|c|}{} & MIN & MAX & \\
\hline \multirow{3}{*}{\(t_{\mathrm{w}}\)} & \multirow{3}{*}{Pulse duration} & CLK high & 10 & & \multirow{3}{*}{ns} \\
\hline & & CLK low & 10 & & \\
\hline & & \(\overline{\text { RESET }}\) & 10 & & \\
\hline
\end{tabular}
- Page 7-39 - Replace the switching characteristics with the following:

\section*{switching characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multirow[t]{2}{*}{PARAMETER} & \multirow[t]{2}{*}{FROM (INPUT)} & \multirow[t]{2}{*}{TO (OUTPUT)} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|c|}{SN74ACT8847-40} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & & MIN & MAX & \\
\hline 1 & \(t_{\mathrm{pd1}}\) & DA/DB/Inst & Y OUTPUT & 111 & & \(\dagger\) & ns \\
\hline \multirow[b]{2}{*}{2} & \multirow[b]{2}{*}{\(t_{\mathrm{pd2}}\)} & INPUT REG & Y OUTPUT & 110 & & 90 & \multirow[b]{2}{*}{ns} \\
\hline & & INPUT REG & STATUS & 110 & & 90 & \\
\hline \multirow[b]{2}{*}{3} & \multirow[b]{2}{*}{\(t_{\mathrm{pd3}}\)} & PIPELN REG & Y OUTPUT & 10X & & 60 & \multirow[b]{2}{*}{ns} \\
\hline & & PIPELN REG & STATUS & 10X & & 60 & \\
\hline \multirow[b]{2}{*}{4} & \multirow[b]{2}{*}{\(t_{\mathrm{pd4}}\)} & OUTPUT REG & Y OUTPUT & 0XX & & 24 & \multirow[b]{2}{*}{ns} \\
\hline & & OUTPUT REG & STATUS & 0XX & & 24 & \\
\hline 5 & \(t_{\mathrm{pd5}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT & XXX & & 20 & ns \\
\hline 6 & \(t_{\mathrm{pd6}}\) & CLK \(\uparrow\) & Y OUTPUT INVALID & all but 111 & 3.0 & & ns \\
\hline 7 & \(t_{\mathrm{pd7}}\) & CLK \(\uparrow\) & STATUS INVALID & all but 111 & 3.0 & & ns \\
\hline 8 & \(t_{\mathrm{pd8}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT INVALID & XXX & 1.5 & & ns \\
\hline \multirow{3}{*}{9} & \multirow{3}{*}{\(t_{\mathrm{d1}}\)} & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/o feedback & 72 & & \multirow{6}{*}{ns} \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/feedback \({ }^{\ddagger}\) & 72 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/FLOWC \({ }^{\S}\) & 84 & & \\
\hline \multirow{3}{*}{10} & \multirow{3}{*}{\(t_{\mathrm{d2}}\)} & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/o feedback & 40 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/feedback \({ }^{\ddagger}\) & 40 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/FLOWC \({ }^{\S}\) & 47 & & \\
\hline 11 & \(t_{\mathrm{d3}}\) & \multicolumn{3}{|l|}{Delay time, CLKC after CLK to insure data captured in C register is data clocked into sum or product register by that clock (PIPES2-PIPES0 = 0XX)} & 12 & \(t_{\mathrm{d}} - 0\,{ }^{\P}\) & ns \\
\hline 12 & \(t_{\mathrm{en1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 16 & \multirow{4}{*}{ns} \\
\hline 13 & \(t_{\mathrm{en2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 16 & \\
\hline 14 & \(t_{\mathrm{dis1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 16 & \\
\hline 15 & \(t_{\mathrm{dis2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 16 & \\
\hline
\end{tabular}
\({ }^{\dagger}\) This parameter no longer tested and will be deleted on next Data Manual revision.
\({ }^{\ddagger}\) Applies to all feedback cases except where operands are fed back using FLOWC to bypass C register. (Please see Figure 13 for feedback paths).
\({ }^{\S}\) Operands are fed back using FLOWC to bypass the C register.
\({ }^{\P}\) \(t_{\mathrm{d}}\) is the clock cycle period.
- Page 7-41 - Replace the switching characteristics with the following:
\section*{switching characteristics}
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline \multirow[t]{2}{*}{NO.} & \multirow[t]{2}{*}{PARAMETER} & \multirow[t]{2}{*}{FROM (INPUT)} & \multirow[t]{2}{*}{TO (OUTPUT)} & \multirow[t]{2}{*}{PIPELINE CONTROLS PIPES2-PIPES0} & \multicolumn{2}{|c|}{SN74ACT8847-50} & \multirow[t]{2}{*}{UNIT} \\
\hline & & & & & MIN & MAX & \\
\hline 1 & \(t_{\mathrm{pd1}}\) & DA/DB/Inst & Y OUTPUT & 111 & & \(\dagger\) & ns \\
\hline \multirow[b]{2}{*}{2} & \multirow[b]{2}{*}{\(t_{\mathrm{pd2}}\)} & INPUT REG & Y OUTPUT & 110 & & 120 & \multirow[b]{2}{*}{ns} \\
\hline & & INPUT REG & STATUS & 110 & & 120 & \\
\hline \multirow[b]{2}{*}{3} & \multirow[b]{2}{*}{\(t_{\mathrm{pd3}}\)} & PIPELN REG & Y OUTPUT & 10X & & 75 & \multirow[b]{2}{*}{ns} \\
\hline & & PIPELN REG & STATUS & 10X & & 75 & \\
\hline \multirow[b]{2}{*}{4} & \multirow[b]{2}{*}{\(t_{\mathrm{pd4}}\)} & OUTPUT REG & Y OUTPUT & 0XX & & 36 & \multirow[b]{2}{*}{ns} \\
\hline & & OUTPUT REG & STATUS & 0XX & & 36 & \\
\hline 5 & \(t_{\mathrm{pd5}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT & XXX & & 24 & ns \\
\hline 6 & \(t_{\mathrm{pd6}}\) & CLK \(\uparrow\) & Y OUTPUT INVALID & all but 111 & 3.0 & & ns \\
\hline 7 & \(t_{\mathrm{pd7}}\) & CLK \(\uparrow\) & STATUS INVALID & all but 111 & 3.0 & & ns \\
\hline 8 & \(t_{\mathrm{pd8}}\) & SELMS/\(\overline{\mathrm{LS}}\) & Y OUTPUT INVALID & XXX & 1.5 & & ns \\
\hline \multirow{3}{*}{9} & \multirow{3}{*}{\(t_{\mathrm{d1}}\)} & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/o feedback & 100 & & \multirow{6}{*}{ns} \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/feedback \({ }^{\ddagger}\) & 100 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 010 w/FLOWC \({ }^{\S}\) & 117 & & \\
\hline \multirow{3}{*}{10} & \multirow{3}{*}{\(t_{\mathrm{d2}}\)} & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/o feedback & 50 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/feedback \({ }^{\ddagger}\) & 50 & & \\
\hline & & CLK \(\uparrow\) & CLK \(\uparrow\) & 000 w/FLOWC \({ }^{\S}\) & 60 & & \\
\hline 11 & \(t_{\mathrm{d3}}\) & \multicolumn{3}{|l|}{Delay time, CLKC after CLK to insure data captured in C register is data clocked into sum or product register by that clock (PIPES2-PIPES0 = 0XX)} & 12 & \(t_{\mathrm{d}} - 0\,{ }^{\P}\) & ns \\
\hline 12 & \(t_{\mathrm{en1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 20 & \multirow{4}{*}{ns} \\
\hline 13 & \(t_{\mathrm{en2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 20 & \\
\hline 14 & \(t_{\mathrm{dis1}}\) & \(\overline{\mathrm{OEY}}\) & Y OUTPUT & XXX & & 20 & \\
\hline 15 & \(t_{\mathrm{dis2}}\) & \(\overline{\mathrm{OEC}}, \overline{\mathrm{OES}}\) & STATUS & XXX & & 20 & \\
\hline
\end{tabular}

\footnotetext{
\({ }^{\dagger}\) This parameter no longer tested and will be deleted on next Data Manual revision.
\({ }^{\ddagger}\) Applies to all feedback cases except where operands are fed back using FLOWC to bypass C register. (Please see Figure 13 for feedback paths).
\({ }^{\S}\) Operands are fed back using FLOWC to bypass the C register.
\({ }^{\P}\) \(t_{\mathrm{d}}\) is the clock cycle period.
}
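
The minimum CLK-to-CLK times in rows 9 and 10 of the three revised switching tables bound the usable clock rate for each speed grade. The Python sketch below, offered only as an illustration, converts those minimum periods to an upper bound on clock frequency. The names `MIN_CLOCK_PERIOD_NS` and `max_clock_mhz` are hypothetical, and the calculation assumes the clock period is the only limiting parameter, which a real design would need to confirm against the setup, hold, and propagation figures.

```python
# Minimum CLK-to-CLK period in ns, taken from rows 9 (td1, PIPES2-PIPES0 = 010)
# and 10 (td2, PIPES2-PIPES0 = 000) of the revised switching tables.
# The w/o-feedback and w/feedback limits are equal in every table, so one
# entry (flowc_bypass=False) covers both; flowc_bypass=True is the case where
# operands are fed back using FLOWC to bypass the C register.
MIN_CLOCK_PERIOD_NS = {
    "SN74ACT8847-30": {("010", False): 56, ("010", True): 66,
                       ("000", False): 30, ("000", True): 36},
    "SN74ACT8847-40": {("010", False): 72, ("010", True): 84,
                       ("000", False): 40, ("000", True): 47},
    "SN74ACT8847-50": {("010", False): 100, ("010", True): 117,
                       ("000", False): 50, ("000", True): 60},
}


def max_clock_mhz(device, pipes, flowc_bypass=False):
    """Upper bound on clock frequency in MHz implied by the minimum period."""
    period_ns = MIN_CLOCK_PERIOD_NS[device][(pipes, flowc_bypass)]
    return 1000.0 / period_ns  # 1000 / ns gives MHz


for device in MIN_CLOCK_PERIOD_NS:
    bound = max_clock_mhz(device, "000")
    print(f"{device}: {bound:.1f} MHz at PIPES2-PIPES0 = 000")
```

Keying the table on the pipeline setting and the FLOWC case keeps the lookup aligned with the rows of the errata tables, so a later revision only needs the numbers replaced.
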

\section*{TI Sales Offices}

ALABAMA: Huntsville (205) 837-7530.
ARIZONA: Phoenix (602) 995-1007; Tucson (602) 292-2640.

CALIFORNIA: Irvine (714) 660-1200; Roseville (916) 786-9208; San Diego (619) 278-9601; Santa Clara (408) 980-9000; Torrance (213) 217-7010; Woodland Hills (818) 704-7759.
COLORADO: Aurora (303) 368-8000.
CONNECTICUT: Wallingford (203) 269-0074.
FLORIDA: Altamonte Springs (305) 260-2116; Ft. Lauderdale (305) 973-8502;
Tampa (813) 885-7411.
GEORGIA: Norcross (404) 662-7900.
ILLINOIS: Arlington Heights (312) 640-2925.
INDIANA: Carmel (317) 573-6400;
Ft. Wayne (219) 424-5174.
IOWA: Cedar Rapids (319) 395-9550.
KANSAS: Overland Park (913) 451-4511.
MARYLAND: Columbia (301) 964-2003.
MASSACHUSETTS: Waltham (617) 895-9100.
MICHIGAN: Farmington Hills (313) 553-1569; Grand Rapids (616) 957-4200.
MINNESOTA: Eden Prairie (612) 828-9300.
MISSOURI: St. Louis (314) 569-7600.
NEW JERSEY: Iselin (201) 750-1050.
NEW MEXICO: Albuquerque (505) 345-2555.
NEW YORK: East Syracuse (315) 463-9291; Pittsford (716) 385-6770;
Poughkeepsie (914) 473-2900.
NORTH CAROLINA: Charlotte (704) 527-0933;
Raleigh (919) 876-2725.
OHIO: Beachwood (216) 464-6100; Beaver Creek (513) 427-6200.

OREGON: Beaverton (503) 643-6758.
PENNSYLVANIA: Blue Bell (215) 825-9500.
PUERTO RICO: Hato Rey (809) 753-8700.
TENNESSEE: Johnson City (615) 461-2192.
TEXAS: Austin (512) 250-7655;
Houston (713) 778-6592;
Richardson (214) 680-5082;
UTAH: Murray (801) 266-8972.
WASHINGTON: Redmond (206) 881-3080.
WISCONSIN: Brookfield (414) 782-2899.
CANADA: Nepean, Ontario (613) 726-1970; Richmond Hill, Ontario (416) 884-9181;

\section*{TI Regional Technology Centers}

CALIFORNIA: Irvine (714) 660-8105;
Santa Clara (408) 748-2220;
GEORGIA: Norcross (404) 662-7945.
ILLINOIS: Arlington Heights (312) 640-2909.
MASSACHUSETTS: Waltham (617) 895-9196.
TEXAS: Richardson (214) 680-5066.
CANADA: Nepean, Ontario (613) 726-1970.

\section*{TI Distributors}

\author{
TI AUTHORIZED DISTRIBUTORS \\ Arrow/Kierulff Electronics Group \\ Arrow (Canada) \\ Future Electronics (Canada) \\ GRS Electronics Co., Inc. \\ Hall-Mark Electronics \\ Marshall Industries \\ Newark Electronics \\ Schweber Electronics \\ Time Electronics \\ Wyle Laboratories \\ Zeus Components
}
- OBSOLETE PRODUCT ONLY - Rochester Electronics, Inc., Newburyport, Massachusetts (508) 462-9332.

ALABAMA: Arrow/Kierulff (205) 837-6955; Hall-Mark (205) 837-8700; Marshall (205) 881-9235; Schweber (205) 895-0480.

ARIZONA: Arrow/Kierulff (602) 437-0750; Hall-Mark (602) 437-1200; Marshall (602) 496-0290; Schweber (602) 431-0030; Wyle (602) 866-2888.
CALIFORNIA: Los Angeles/Orange County: Arrow/Kierulff (818) 701-7500, (714) 838-5422; Hall-Mark (818) 773-4500, (714) 669-4100; (714) 458-5395; Schweber (818) 880-9686, (714) 863-0200, (213) 320-8090; Wyle (818) 880-9000, (714) 863-9953; Zeus (714) 921-9000, (818) 889-3838;
Sacramento: Hall-Mark (916) 624-9781; Marshall (916) 635-9700; Schweber (916) 364-0222; Wyle (916) 638-5282;
San Diego: Arrow/Kierulff (619) 565-4800; Hall-Mark (619) 268-1201; Marshall (619) 578-9600; Schweber (619) 450-0454; Wyle (619) 565-9171;
San Francisco Bay Area: Arrow/Kierulff (408) 745-6600; Hall-Mark (408) 432-0900; Marshall (408) 942-4600; Schweber (408) 432-7171; Wyle (408) 727-2500; Zeus (408) 998-5121.

COLORADO: Arrow/Kierulff (303) 790-4444; Hall-Mark (303) 790-1662; Marshall (303) 451-8383; Schweber (303) 799-0258; Wyle (303) 457-9953.
CONNECTICUT: Arrow/Kierulff (203) 265-7741; Hall-Mark (203) 271-2844; Marshall (203) 265-3822; Schweber (203) 264-4700.

FLORIDA: Ft. Lauderdale: Arrow/Kierulff (305) 429-8200; Hall-Mark (305) 971-9280; Marshall (305) 977-4880; Schweber (305) 977-7511;
Orlando: Arrow/Kierulff (407) 323-0252; Hall-Mark (407) 830-5855; Marshall (407) 767-8585; Schweber (407) 331-7555; Zeus (407) 365-3000;
Tampa: Hall-Mark (813) 530-4543; Marshall (813) 576-1399; Schweber (813) 541-5100.
GEORGIA: Arrow/Kierulff (404) 449-8252; Hall-Mark (404) 447-8000; Marshall (404) 923-5750; Schweber (404) 449-9170.
ILLINOIS: Arrow/Kierulff (312) 250-0500; Hall-Mark (312) 860-3800; Marshall (312) 490-0155; Newark (312) 784-5100; Schweber (312) 364-3750.

INDIANA: Indianapolis: Arrow/Kierulff (317) 243-9353; Hall-Mark (317) 872-8875; Marshall (317) 297-0483; Schweber (317) 843-1050.
IOWA: Arrow/Kierulff (319) 395-7230; Schweber (319) 373-1417.

KANSAS: Kansas City: Arrow/Kierulff (913) 541-9542; Hall-Mark (913) 888-4747; Marshall (913) 492-3121; Schweber (913) 492-2922.

MARYLAND: Arrow/Kierulff (301) 995-6002; Hall-Mark (301) 988-9800; Marshall (301) 235-9464; Schweber (301) 840-5900; Zeus (301) 997-1118.
MASSACHUSETTS: Arrow/Kierulff (508) 658-0900; Hall-Mark (508) 667-0902; Marshall (508) 658-0810; Schweber (617) 275-5100; Time (617) 532-6200; Wyle (617) 273-7300; Zeus (617) 863-8800.
MICHIGAN: Detroit: Arrow/Kierulff (313) 462-2290; Hall-Mark (313) 462-1205; Marshall (313) 525-5850; Newark (313) 967-0600; Schweber (313) 525-8100; Grand Rapids: Arrow/Kierulff (616) 243-0912.
MINNESOTA: Arrow/Kierulff (612) 830-1800; Hall-Mark (612) 941-2600; Marshall (612) 559-2211; Schweber (612) 941-5280.
MISSOURI: St. Louis: Arrow/Kierulff (314) 567-6888; Hall-Mark (314) 291-5350; Marshall (314) 291-4650; Schweber (314) 739-0526.

NEW HAMPSHIRE: Arrow/Kierulff (603) 668-6968; Schweber (603) 625-2250.
NEW JERSEY: Arrow/Kierulff (201) 538-0900, (609) 596-8000; GRS Electronics (609) 964-8560; (609) (609) 234-91 0 ; Schweber (201) 227-7880.

NEW MEXICO: Arrow/Kierulff (505) 243-4566.
NEW YORK: Long Island: Arrow/Kierulff (516) 231-1009; Hall-Mark (516) 737-0600; Marshall (516) 273-2424; Schweber (516) 334-7474; Zeus (914) 937-7400;
Rochester: Arrow/Kierulff (716) 427-0300; Hall-Mark (716) 425-3300; Marshall (716) 235-7620; Schweber (716) 424-2222;
Syracuse: Marshall (607) 798-1611.
NORTH CAROLINA: Arrow/Kierulff (919) 876-3132, (919) 725-8711; Hall-Mark (919) 872-0712; Marshall (919) 878-9882; Schweber (919) 876-0000.

OHIO: Cleveland: Arrow/Kierulff (216) 248-3990; Hall-Mark (216) 349-4632; Marshall (216) 248-1788; Schweber (216) 464-2970;
Columbus: Hall-Mark (614) 888-3313;
Dayton: Arrow/Kierulff (513) 435-5563;
Marshall (513) 898-4480; Schweber (513) 439-1800.
OKLAHOMA: Arrow/Kierulff (918) 252-7537; Schweber (918) 622-8003.
OREGON: Arrow/Kierulff (503) 645-6456; Marshall (503) 644-5050; Wyle (503) 640-6000.
PENNSYLVANIA: Arrow/Kierulff (412) 856-7000, (215) 928-1800; GRS Electronics (215) 922-7037; Marshall (412) 963-0441; Schweber (215) 441-0600, (412) 963-6804.

TEXAS: Austin: Arrow/Kierulff (512) 835-4180; Hall-Mark (512) 258-8848; Marshall (512) 837-1991; Schweber (512) 339-0088; Wyle (512) 834-9957;
Dallas: Arrow/Kierulff (214) 380-6464; Hall-Mark (214) 553-4300; Marshall (214) 233-5200; Schweber (214) 661-5010; Wyle (214) 235-9953; Zeus (214) 783-7010;
El Paso: Marshall (915) 593-0706;
Houston: Arrow/Kierulff (713) 530-4700; Hall-Mark (713) 781-6100; Marshall (713) 895-9200; Schweber (713) 784-3600; Wyle (713) 879-9953.

UTAH: Arrow/Kierulff (801) 973-6913;
Hall-Mark (801) 972-1008; Marshall (801) 485-1551; Wyle (801) 974-9953.
WASHINGTON: Arrow/Kierulff (206) 575-4420; Marshall (206) 486-5747; Wyle (206) 881-1150.
WISCONSIN: Arrow/Kierulff (414) 792-0150; Hall-Mark (414) 797-7844; Marshall (414) 797-8400; Schweber (414) 784-9020.
CANADA: Calgary: Future (403) 235-5325;
Edmonton: Future (403) 438-2858;
Montreal: Arrow Canada (514) 735-5511;
Future (514) 694-7710;
Ottawa: Arrow Canada (613) 226-6903;
Future (613) 820-8313;
Quebec City: Arrow Canada (418) 871-7500; Toronto: Arrow Canada (416) 672-7769; Future (416) 638-4771; Marshall (416) 674-2161; Vancouver: Arrow Canada (604) 291-2986; Future (604) 294-1166.

\section*{Customer Response Center}

TOLL FREE: (800) 232-3200
OUTSIDE USA: (214) 995-6611
(8:00 a.m. - 5:00 p.m. CST)


[^0]:    EPIC is a trademark of Texas Instruments Incorporated.

[^1]:    EPIC is a trademark of Texas Instruments Incorporated.

[^2]:    ${ }^{\dagger}$ This is the increase in supply current for each input that is at one of the specified TTL voltage levels rather than 0 V or $\mathrm{V}_{\mathrm{CC}}$.

[^3]:    ${ }^{\dagger}$ Decrementing register/counter A or B and sensing a zero.

[^4]:    ${ }^{\dagger}$ No control effect when DRA' or DRB' selected (MUX2-MUX0 = HLH) because B3-B0 are address inputs.

[^5]:    ${ }^{\dagger}$ This is the increase in supply current for each input that is at one of the specified TTL voltage levels rather than 0 V or $\mathrm{V}_{\mathrm{CC}}$.

[^6]:    ${ }^{\dagger} \mathrm{N}=8$ for quad 8-bit mode, 16 for dual 16-bit mode, 32 for 32-bit mode.
    ${ }^{\ddagger}$ The least significant half of the product is in the MQ register.

[^7]:    ${ }^{\dagger} \mathrm{N}=8$ in quad 8 -bit mode, 16 in dual 16 -bit mode, 32 in 32 -bit mode
    $\ddagger$ Unfixed
    ${ }^{\S}$ Fixed (corrected)

[^8]:    ${ }^{\dagger} \mathrm{F}=\mathrm{ALU}$ result
    $\mathrm{n}=\mathrm{nth}$ byte
    Register file 3 gets F if byte selected, S if byte not selected.

[^9]:    ${ }^{\dagger} \mathrm{F}=\mathrm{ALU}$ result
    $\mathrm{n}=\mathrm{nth}$ package
    Register file 12 gets F if byte selected, S if byte not selected.

[^10]:    ${ }^{\dagger}$ Normalization not complete at the end of this instruction cycle.

[^11]:    ${ }^{\dagger} \mathrm{C}$ is ALU carry-out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

[^12]:    ${ }^{\dagger} \mathrm{C}$ is ALU carry out and is evaluated before shift operation. ZERO and N (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

[^13]:    $\ddagger$ After the intermediate operation (ADD), overflow has occurred and OVR status signal is set high. When the arithmetic right shift is executed, the sign bit is corrected (see Table 16 for shift definition notes).

[^14]:    ${ }^{\ddagger} \mathrm{C} n$ is ALU carry-out and is evaluated before shift operation. ZERO and $N$ (negative) are evaluated after shift operation. OVR (overflow) is evaluated after ALU operation and after shift operation.

[^15]:    $\ddagger$ This is the increase in supply current for each input that is at one of the specified TTL voltage levels rather than 0 or $V_{C C}$.

[^16]:    EPIC is a trademark of Texas Instruments Incorporated.

[^17]:    ${ }^{\dagger}$ On the first active clock edge (see CLKMODE, Table 17), data in this column is loaded into the temporary register. On the next rising edge, operands in the temporary register and the DA/DB buses are loaded into the RA and RB registers.

[^18]:    $\dagger$ The precision of the integer to floating point conversion is set by I8.
    $\ddagger$ This converts single-precision floating point to double-precision floating point and vice versa. If the I8 pin is low to indicate a single-precision input, the result of the conversion will be double precision. If the I8 pin is high, indicating a double-precision input, the result of the conversion will be single precision.

[^19]:    ${ }^{\dagger}$ See Table 15.

[^20]:    ${ }^{\dagger}$ During this operation, I8 selects precision of the result.

[^21]:    

