CPU Organizations
- At the core of every computer chip lies the CPU, where the arrangement and structure of registers play a
central role. In this unit, we'll dive deep into the intricacies of CPU design, specifically examining how
registers are organized and structured.
Types of CPU Organizations
- Single Accumulator Organization
- General Register Organization
- Stack organization
Single Accumulator Organization (Not Important)
- We are familiar with the general format of an instruction, which specifies how data is stored and
manipulated. An instruction typically consists of a mode field, an opcode, and an operand.
- Mode: Specifies the addressing mode.
- Opcode: Indicates the operation to be performed.
- Operand: Contains the address of the data (or, in some modes, the data itself).
This instruction format is employed in basic computer architectures.
- Now, let's delve into the Single Accumulator Organization: Basic computers often adopt the
single accumulator organization, where the accumulator serves as a dedicated register.
- Operations are executed within the Arithmetic Logic Unit (ALU), which resides inside the Central
Processing Unit (CPU). The CPU is directly linked to the register, ensuring high-speed data
transfer.
- The input data for the ALU is sourced from the accumulator register, and the result of the ALU
calculation is then stored back in the accumulator.
- Single Accumulator Organization is chosen when cost-effectiveness is a priority. By minimizing
the number of registers, we reduce the overall system cost.
- Registers are the fastest form of memory, and connecting them directly to the CPU ensures rapid
data exchange, contributing to high system performance. The more registers used, the higher the
cost. To achieve cost-effectiveness, a single accumulator register is employed.
- Definition of Single Accumulator Organization: In this architecture, a single
accumulator register is designated for arithmetic and logic operations. It optimizes cost by
limiting the number of registers, emphasizing efficiency in simple computing systems.
General Architecture of Single Accumulator Organization
- In this architecture, we have a Program Counter (PC) responsible for storing addresses, and it
is connected to the memory. Additionally, there is an Accumulator (AC) connected to the
Arithmetic Logic Unit (ALU).
- The data stored in memory is retrieved and transferred to the Instruction Register (IR), from
where it is moved to the Accumulator (AC) and subsequently processed in the Arithmetic Logic
Unit (ALU).
- The ALU and memory components are interconnected through a common bus, facilitating the exchange
of data between them.
This type of organization supports single-address instructions. As an example, consider the
instruction sequence for the operation C = A + B:
- LDA A: Load the content of memory location A into the Accumulator (AC).
- ADD B: Add the content of memory location B to the content of the Accumulator
(AC).
- STA C: Store the content of the Accumulator (AC) into the memory location
designated for C.
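The three-instruction sequence above can be modeled with a tiny Python sketch of an accumulator machine (the memory labels, values, and mnemonic spellings here are illustrative, not a real ISA):

```python
# Minimal sketch of a single-accumulator machine executing C = A + B.
# Memory labels "A", "B", "C" and their values are made-up for illustration.
memory = {"A": 5, "B": 7, "C": 0}
ac = 0  # the single accumulator register

def execute(instr, addr):
    global ac
    if instr == "LDA":    # load memory[addr] into AC
        ac = memory[addr]
    elif instr == "ADD":  # add memory[addr] to AC
        ac = ac + memory[addr]
    elif instr == "STA":  # store AC back into memory[addr]
        memory[addr] = ac

for instr, addr in [("LDA", "A"), ("ADD", "B"), ("STA", "C")]:
    execute(instr, addr)

print(memory["C"])  # 5 + 7 = 12
```

Note that every arithmetic step routes through the single accumulator, which is exactly the cost-saving constraint described above.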
General Register Organization
- To understand General Register Organization, it's essential to grasp the major components within
a CPU:
- Storage Components: These include registers and flip-flops, serving as
temporary storage for data.
- Execution Components: The Arithmetic Logic Unit (ALU) is responsible for
carrying out calculations and logical operations.
- Transfer Components: The bus facilitates the transfer of data between
storage and execution components.
- Control Component: The control unit oversees and directs the functioning of
other components within the CPU.
- Memory locations play a crucial role in storing various data types such as pointers, counters,
return addresses, temporary results, and partial products. However, accessing memory is a
time-consuming task. To enhance efficiency, intermediate values are stored in processor
registers. These registers are interconnected through a common bus system, allowing seamless
communication not only for direct data transfer but also for coordinating various
microoperations.
- Definition of General Register Organization: In computing, General Register
Organization refers to the systematic arrangement and utilization of registers within the CPU.
These registers serve as high-speed, temporary storage for data and play a vital role in
enhancing computational efficiency by minimizing the need for frequent memory access.
A Bus Organization for Seven CPU Registers
- The depicted bus organization features seven CPU registers, and its functionality is detailed as
follows:
- The output of each register is linked to two multiplexers (MUX), both of which play a crucial
role in transferring register data into the Arithmetic Logic Unit (ALU).
- Two buses, A and B, are utilized for data transfer. The selection lines in each multiplexer
determine whether to choose data from a register or from input data. Data is transmitted to the
ALU via buses A and B.
- The OPR (Operation) signal serves to define the type of operation to be executed by the ALU.
- The result of the operation conducted by the ALU can be directed to other units within the
system or stored in any of the processor registers.
- A decoder is employed to select the register where the result will be stored. The decoder
activates one of the register load inputs, specifying the destination register for storing the
result.
Example: Suppose we want to perform the operation R1 ← R2 + R3.
- To carry out this operation, the Control Unit generates the following signals (the control word).
Control Word
- A control word, designed for the aforementioned CPU organization, consists of four fields as
illustrated below:
- The three bits of SELA are dedicated to transferring the contents of a register onto
BUS A.
- The three bits of SELB are assigned to transferring the contents of a register onto BUS
B.
- The three bits of SELD are utilized for selecting a destination register. This
facilitates the decision of whether to store the result in a register or to transmit it
outside the ALU.
- The five bits of OPR define the type of operation to be performed by the ALU. This field
governs the arithmetic or logical operation executed by the ALU based on the specified
opcode.
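For the example R1 ← R2 + R3, the four fields can be packed into one 14-bit word. Below is a minimal Python sketch, assuming illustrative register codes (R1 = 001, R2 = 010, R3 = 011) and an assumed 5-bit ADD opcode; the field widths (3 + 3 + 3 + 5 bits) follow the text:

```python
# Sketch: packing the four control-word fields for R1 <- R2 + R3.
# Register codes and the ADD opcode are assumed values for illustration.
SELA = 0b010  # source A = R2
SELB = 0b011  # source B = R3
SELD = 0b001  # destination = R1
OPR = 0b00010  # assumed ALU code for ADD

# Layout: [ SELA(3) | SELB(3) | SELD(3) | OPR(5) ] = 14 bits total
control_word = (SELA << 11) | (SELB << 8) | (SELD << 5) | OPR
print(f"{control_word:014b}")  # 01001100100010
```

Reading the printed word left to right recovers the four fields: 010 (SELA), 011 (SELB), 001 (SELD), 00010 (OPR).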
Code for Register Selection
Operation Code for ALU
Stack Organization
- The memory of a CPU can be organized as a STACK, a structure where information is stored in a
Last-In-First-Out (LIFO) manner. This means that the item last stored is the first to be removed
or popped.
- To manage the items, a stack uses a stack pointer (SP) register. The stack pointer stores the
address of the last item in the stack, essentially pointing to the topmost element. Stack
operations involve two main actions:
- Insertion (Push): When an item is added to the stack, it is referred to
as insertion or a push operation.
- Deletion (Pop): When an item is removed from the stack, it is known as
deletion or a pop operation.
- There are two main types of stacks:
- Register Stack: Utilizes processor registers to create a stack
structure, enhancing speed and efficiency in certain operations.
- Memory Stack: Involves using dedicated memory locations to implement
the stack structure.
Register Stack
- When processor registers are organized in a stack-like fashion, it is termed a register
stack. The diagram above illustrates a 64-word register stack.
- The stack pointer (SP) contains the address of the topmost element in the stack.
- When the stack is empty, the EMPTY flag is set to 1, and when the stack is full, the FULL
flag is set to 1.
- The DR (Data Register) contains the data either being popped from or pushed into the stack.
- Additional benefits of a register stack include faster access times and reduced memory bus
contention, making it suitable for certain computing tasks requiring high-speed data
manipulation.
- For example, in the figure, three items (A, B, and C) are placed in the stack, with item C
at the top. Thus, the stack pointer (SP) holds the address of C (SP = 3).
PUSH Operation:
- When performing a PUSH operation to add an element (let's say E) to the stack, the following
steps are executed:
- Step 1: Increment the Stack Pointer (SP) by 1 so that it points to an
empty slot.
SP ← SP + 1 [Increment stack pointer]
- Step 2: Store the value of the Data Register (DR) at the address
pointed to by SP.
M[SP] ← DR [Write the item on top of the stack]
- Step 3: Check boundary conditions.
If (SP = 0) then (FULL ← 1) [Mark the stack as full]
EMPTY ← 0 [Mark the stack as not empty]
POP Operation:
- When performing a POP operation to remove an element from the stack, the following steps are
executed:
- Step 1: Retrieve the data from the address stored in the Stack Pointer
(SP) and store it in the Data Register (DR).
DR ← M[SP]
[Fetch item from the top of the stack]
- Step 2: Decrement the Stack Pointer (SP) by 1.
SP ← SP - 1 [Decrement stack pointer]
- Step 3: Check boundary conditions.
if (SP = 0) then (EMPTY ← 1) [Check if the stack is empty]
FULL ← 0 [Mark the stack as not full]
Memory Stack
- When primary memory (RAM) is organized in the form of a stack, it is referred to as a Memory
Stack.
- The Program Counter (PC) indicates the address of the next instruction in the program.
- The Address Register (AR) points to an array of data within the memory stack.
- The Stack Pointer (SP) identifies the top of the stack.
- In the illustrated figure, the initial value of SP is 4001, and the stack grows with
decreasing addresses. Consequently, the first item stored in the stack is at address 4000,
the second item at address 3999, and the last item at address 3000.
PUSH Operation
SP ← SP - 1
M[SP] ← DR
POP Operation
DR ← M[SP]
SP ← SP + 1
- The PUSH operation involves decrementing the Stack Pointer (SP) to allocate space for a new
item and storing the value from the Data Register (DR) at the address pointed to by the
updated SP.
- The POP operation retrieves the item from the top of the stack by copying the data from the
address indicated by SP to DR. Subsequently, SP is incremented to free up space in the
stack.
Addressing Modes
An instruction format is a collection of bits that defines the type of instruction, operands, and the
type of operation. The instruction format is represented by a rectangular box, and a basic instruction
format includes the following fields: Opcode, Mode, and Address.
- Opcode: Defines the type of operation to be performed, such as add, subtract,
complement, and shift.
- Address field: Defines the address of operands.
- Mode (or addressing mode) field: Defines the method by which operands are fetched,
modifying the address field of the instruction to determine the actual address of the data.
Addressing Modes:
- 1. Implied Addressing Mode: The zero-address instruction and all instructions using
the accumulator are implied-mode instructions. For example, the "complement accumulator" instruction
is implied-mode because the operand is in the accumulator.
- 2. Immediate Addressing Mode: In this mode, the operand is specified in the
instruction itself, having an operand field instead of an address field.
For example: ADD 10, 20.
- 3. Register Addressing Mode: Used when data is stored in processor registers, and
the address part of the instruction contains the address of the processor register.
For example: SUB R1, R2.
- 4. Register Indirect Addressing Mode: The instruction has the address of a
processor register, which contains the address of the operand in memory.
- 5. Direct Address Mode: The instruction has the address of a memory cell where the
data is stored, and the effective address is the address stored in the instruction.
- 6. Indirect Address Mode: The address field of the instruction specifies a memory
location that holds the address of the operand, so the effective address must first be read from memory.
- 7. Autoincrement or Autodecrement Address Mode: Used when fetching a series of
data, and the address part of the instruction gives the starting address, which is incremented or
decremented to fetch the next data from memory.
- 8. Relative Address Mode: The content of the program counter is added to the
address part of the instruction to obtain the effective address of data.
- 9. Indexed Addressing Mode: The content of an index register is added to the
address part of the instruction to obtain the effective address, useful for accessing data arrays in
memory.
- 10. Base Register Addressing Mode: Similar to indexed addressing mode, the content
of a base register is added to the address part of the instruction to obtain the effective address.
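Several of the modes above differ only in how the effective address (EA) is computed, which a short Python sketch can make concrete (all memory contents, register values, and the PC value here are made-up numbers for illustration):

```python
# Sketch: effective-address computation for a few addressing modes.
# All addresses and contents are hypothetical illustration values.
memory = {100: 500, 500: 42}   # memory[address] -> content
registers = {"R1": 100}        # processor register file
PC, index, base = 200, 5, 1000 # program counter, index reg, base reg
addr_field = 100               # the address field of the instruction

EA_direct = addr_field                 # direct: EA is the address field itself
EA_indirect = memory[addr_field]       # indirect: memory holds the EA -> 500
EA_reg_indirect = registers["R1"]      # register indirect: register holds the EA
EA_relative = PC + addr_field          # relative: PC + address field -> 300
EA_indexed = index + addr_field        # indexed: index reg + address field -> 105
EA_based = base + addr_field           # base register + address field -> 1100

operand = memory[EA_indirect]          # indirect mode fetches the operand: 42
```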
Data Transfer Instructions
Data transfer and manipulation are fundamental aspects of computer architecture, integral to the
execution of instructions within a computing system.
- These instructions are typically categorized into
three main types:
- Data Transfer Instructions
- Data Manipulation Instructions
- Program Control Instructions
Data Transfer Instructions
Data transfer instructions facilitate the movement of data from one location to another within the
computer system. These instructions are essential for controlling the flow of information,
including:
- Data transfer between memory and processor registers
- Data transfer between processor registers and input or output devices
- Data transfer between different processor registers
The table below presents a list of eight common data transfer instructions widely utilized across
various computer architectures:
Data Manipulation Instructions
Data manipulation instructions play a critical role in performing operations on data within a
computer system. These instructions can be broadly categorized into three types:
- Arithmetic Instructions
- Logical and Bit Manipulation Instructions
- Shift Instructions
Arithmetic Instructions
Arithmetic instructions encompass fundamental operations such as addition, subtraction,
multiplication, and division. The table below provides a list of typical arithmetic
instructions:
Logical and Bit Manipulation Instructions
Logical instructions are designed to perform binary operations on data stored in registers. These
instructions consider each bit of the operand individually. Here are some common logical and bit
manipulation instructions:
Shift Instructions
Shift instructions move bits within a register either to the left or right. Logical shifts insert
a 0 into the vacated end bit position. The table below illustrates various types of shift instructions:
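Logical shifts on an 8-bit register can be sketched in Python (the 0xFF mask models the fixed register width; the sample bit pattern is made-up):

```python
# Sketch: logical shifts on an 8-bit register value.
# Vacated bit positions are filled with 0; results are masked to 8 bits.
def shl(x):
    """Logical shift left by one; the high bit is discarded."""
    return (x << 1) & 0xFF

def shr(x):
    """Logical shift right by one; a 0 enters the high bit."""
    return (x >> 1) & 0xFF

r = 0b10110011
print(f"{shl(r):08b}")  # 01100110
print(f"{shr(r):08b}")  # 01011001
```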
Program Control Instructions
In a computer system, instructions are typically stored in successive memory locations, and the
execution of a program involves fetching instructions from these consecutive memory locations. As
each instruction is fetched, the program counter is incremented to contain the address of the next
instruction in sequence. Program control instructions play a crucial role in directing the flow of a
program and managing the execution process.
General program control instructions encompass a variety of operations that dictate the execution
flow. Some of these instructions are outlined in the table below:
Parallel Processing
- In older computers, only one instruction was executed at a time, leaving the ALU idle for much of the
time and preventing full utilization of processing capabilities. To address this inefficiency,
the concept of parallel processing was introduced.
- Parallel processing involves the simultaneous execution of multiple instructions, allowing for
concurrent data processing and faster execution times. Instead of processing instructions
sequentially, parallel processing techniques enable more efficient use of computing resources. For
example:
- While an instruction is being executed in the ALU, the next instruction can be read from
memory.
- A system may have two or more ALUs, capable of executing multiple instructions
simultaneously.
- Multiple processors may operate concurrently, enhancing overall system performance.
The primary purpose of parallel processing is to accelerate computer capabilities by leveraging
increased hardware resources.
- Parallel processing can be examined at various levels of complexity:
- At a lower level, the distinction between parallel and serial operations is based on the
type of registers used.
- Shift registers operate in a serial fashion, processing one bit at a time, while registers
with parallel load operate with all bits of the word simultaneously.
- At a higher level of complexity, parallel processing can involve a multiplicity of
functional units performing identical or different operations simultaneously.
The following diagram illustrates a processor with multiple functional units, showcasing the additional
components added to increase productivity and enable parallel processing:
Parallel processing can be classified in various ways, one of which is introduced by M.J. Flynn. Flynn's
classification divides computers into four major groups based on the sequence of instructions read from
memory and the operations performed in the instruction and data streams:
- Single Instruction Stream, Single Data Stream (SISD): In SISD architecture, a
single instruction stream is executed on a single data stream. This is the traditional von Neumann
architecture where one instruction is processed at a time.
- Single Instruction Stream, Multiple Data Streams (SIMD): SIMD architecture involves
the processing of a single instruction simultaneously on multiple data streams. This is commonly
seen in vector processors, where the same operation is applied to multiple data elements
concurrently.
- Multiple Instruction Streams, Single Data Stream (MISD): MISD architecture,
although rare in practice, involves multiple instruction streams operating on a single data stream.
This concept is not widely implemented due to its complexity and limited applicability.
- Multiple Instruction Streams, Multiple Data Streams (MIMD): MIMD architecture
allows for the simultaneous execution of multiple instruction streams on multiple data streams. This
is a versatile and widely used parallel processing architecture found in modern multi-core
processors and parallel computing systems.
Pipelining
- Pipelining involves dividing a process into several suboperations, with each suboperation associated
with a segment.
- The output of each segment is stored in a register, and this register information is passed to the
next segment, facilitating a continuous flow of data.
- Each segment operates independently, allowing for concurrent execution of all segments in the
pipeline.
- The term "pipelining" is derived from the sequential transfer of information from one segment to
another.
Example: Performing Ai * Bi + Ci; for i = 1 to 7.
Segment 1: R1 ← Ai, R2 ← Bi
Segment 2: R3 ← R1 * R2, R4 ← Ci
Segment 3: R5 ← R3 + R4
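With one clock cycle per segment, the three segments above can be simulated in a short Python sketch (the A, B, and C values are made-up; the update order within the loop models each segment latching the previous cycle's register contents):

```python
# Sketch of the 3-segment pipeline computing Ai*Bi + Ci for i = 1..7.
# Each clock cycle, segments are updated from the *previous* cycle's
# registers (segment 3 first, then 2, then 1), so all three overlap.
A = [1, 2, 3, 4, 5, 6, 7]
B = [7, 6, 5, 4, 3, 2, 1]
C = [1, 1, 1, 1, 1, 1, 1]

R1 = R2 = R3 = R4 = R5 = None
results = []

for clock in range(len(A) + 2):  # 7 items + 2 extra cycles to drain
    # Segment 3: R5 <- R3 + R4
    if R3 is not None:
        R5 = R3 + R4
        results.append(R5)
    # Segment 2: R3 <- R1 * R2, R4 <- Ci (values latched last cycle)
    R3, R4 = (R1 * R2, C[clock - 1]) if R1 is not None else (None, None)
    # Segment 1: R1 <- Ai, R2 <- Bi
    R1, R2 = (A[clock], B[clock]) if clock < len(A) else (None, None)

print(results)  # [8, 13, 16, 17, 16, 13, 8]
```

After the pipeline fills (two cycles), one result emerges per clock cycle, which is where the throughput gain comes from.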
Pipelining is an efficient technique that allows for the overlap of different stages of instruction
execution, thereby improving overall throughput. Each segment operates concurrently, enabling the
processor to handle multiple instructions simultaneously. This approach significantly enhances the speed
and efficiency of data processing in modern computer architectures.