Memory
This unit delves into the fundamental aspects of computer memory within the context of Computer Architecture
and Organization. Memory, a vital component, significantly influences the efficiency and functioning of
computers. The topics covered include the Classification of Memories, exploring RAM organization, Static
RAM, and Dynamic RAM. Additionally, we will examine ROM organization, including PROM, EPROM, EEPROM, and
EAPROM. The concept of Memory Hierarchy will be discussed, shedding light on Cache Memory, Mapping
techniques, and the importance of the locality of references. Advanced topics such as Virtual Memory will
also be explored, focusing on demand paging, Page Faults, and Page Replacement strategies.
- Random Access Memory (RAM), Read-Only Memory (ROM), Cache memory, and processor registers collectively
constitute the primary memory of a computer system. Primary memory, also known as main memory, plays a
pivotal role as it is directly accessed by the Central Processing Unit (CPU). This central storage area
is characterized by its speed and serves as the primary repository for programs and data within a
computer architecture.
- RAM, being volatile memory, allows for quick read and write operations, facilitating the temporary
storage of actively used programs and data. ROM, on the other hand, is non-volatile and retains
information even when the power is turned off. It typically contains firmware and essential system
instructions that are crucial for the computer's basic functions.
- Cache memory acts as an intermediary between the main memory and the CPU, storing frequently accessed
data and instructions to expedite processing speed. Processor registers are small, high-speed storage
locations within the CPU itself, used to store and manage crucial data for immediate processing.
- The seamless interaction between these components ensures efficient data retrieval and execution,
  contributing to the overall performance and responsiveness of the computer system. As a cohesive unit,
  RAM, ROM, cache memory, and processor registers form the backbone of the main memory hierarchy and play
  a critical role in the functioning of modern computing devices.
Memory Hierarchy
Memory is a fundamental component of digital computers, serving as the repository for data and programs.
However, relying on a single type of memory poses limitations on the storage capacity and access speed.
To address this, computer systems employ a concept known as Memory Hierarchy, which encompasses various
memory levels to optimize performance.
- Using only one type of memory, such as RAM, is impractical for storing all programs and data, and
  relying solely on secondary storage (e.g., a hard disk) leads to slow access times for the CPU. The
  solution lies in employing multiple memory levels, combining two or three types of memory to form the
  memory hierarchy of a digital computer.
- Memory Hierarchy is a pyramid-like structure that visually represents the different types of
memory in a computer.
- In this hierarchy, various memories, including auxiliary, main, cache, and registers, are
positioned at different levels.
- Starting from the bottom, auxiliary memory, represented by magnetic disks and tapes, is
slower but offers high capacity.
- Main memory, located above auxiliary memory, is relatively faster and communicates directly
with the CPU. It stores data and programs required for execution.
- Cache memory, positioned above main memory, is smaller but faster. It directly supports the
CPU by holding the currently executed programs and data segments.
- Registers, located inside the CPU, are the smallest and fastest memory components directly
attached to the processor.
- Comparing these memory levels, a pattern emerges. Moving from top to bottom of the pyramid, the size of
  the memory increases (registers being the smallest) but so does the access time, indicating slower
  access speeds. Conversely, moving from bottom to top, the speed of access increases along with the cost
  per bit of the memory.
- Understanding Cache Memory:
    - The utilization of cache memory is crucial in computer systems due to the inherent speed difference
      between the main memory and the CPU registers. The primary purpose of cache memory is to bridge this
      speed gap and enhance overall system performance.
    - When the CPU needs to access data or instructions, it first checks whether the required information
      is available in the cache memory. Cache memory is smaller in size but significantly faster than the
      main memory. It stores frequently accessed data and instructions, aiming to provide the CPU with
      quick and efficient access to the most relevant information.
    - The cache operates on the principle of temporal and spatial locality of reference, meaning that it
      stores recently accessed data and anticipates future data needs based on the program's execution
      patterns. By doing so, cache memory minimizes the time the CPU spends waiting for data, ultimately
      reducing latency and enhancing the overall speed of data retrieval.
    - Cache memory acts as a buffer between the high-speed registers of the CPU and the slower main
      memory. When the CPU requires data, it first checks the cache. If the required data is found in the
      cache (a cache hit), it can be retrieved quickly. In case of a cache miss, where the required data
      is not in the cache, the CPU fetches the data from the larger and slower main memory.
    - Overall, cache memory is a pivotal component in modern computer architecture, contributing to the
      efficient and swift operation of processors by optimizing the data access process and mitigating the
      performance gap between the CPU and main memory.
RAM Organization
- RAM, which stands for Random Access Memory, is a type of volatile memory. Volatility means that the
  data stored in RAM is lost when the computer is switched off. The term "random access" implies that
  data in RAM can be accessed in any order, making it a versatile and fast form of memory.
- During the computer's startup process, the operating system is loaded from secondary storage (such as a
  hard disk drive or solid-state drive) into RAM. Similarly, when an application is launched, it is
  loaded into RAM. This is because accessing data from RAM is over a hundred times faster for the CPU
  than accessing it from secondary storage. RAM is built from semiconductor memory cells (flip-flops in
  SRAM, transistor-capacitor pairs in DRAM), allowing swift, random access by the CPU.
- RAM is categorized into two main types:
    - Static RAM (SRAM): SRAM is a type of RAM that uses flip-flop circuits to store each bit of data. It
      is faster and more expensive than Dynamic RAM (DRAM). SRAM retains its data as long as power is
      supplied, making it suitable for cache memory.
        - In static RAM (SRAM), each memory cell is typically constructed using six transistors and does
          not involve the use of capacitors. The absence of capacitors sets SRAM apart from dynamic RAM
          (DRAM). Instead of capacitors, SRAM utilizes flip-flops to store data, eliminating the need for
          constant refreshing.
        - One notable advantage of SRAM is that it does not require the frequent refreshing that is
          characteristic of DRAM. The use of flip-flops for data storage allows SRAM to retain its
          contents without continuous refreshing. However, it is essential to note that data stored in
          SRAM is still volatile and will be lost when the power supply is disrupted or turned off.
    - Dynamic RAM (DRAM): DRAM, on the other hand, requires constant refreshing of the stored data, as
      the charge tends to leak away over time. It is slower than SRAM but is more cost-effective and
      provides higher storage capacity. DRAM is commonly used as the main memory in computer systems.
        - In dynamic RAM (DRAM), each memory cell is constructed from one transistor and one capacitor.
          This design allows DRAM to store data as an electric charge within the capacitor. However, the
          capacitor tends to discharge relatively quickly.
        - Due to this inherent property, dynamic RAM requires constant refreshing to maintain the
          integrity of stored data: the capacitor is recharged thousands of times per second to
          compensate for the discharge and ensure that the data remains intact.
- RAM plays a crucial role in the overall performance of a computer, providing quick access to data that
  is actively being used by the CPU. The distinction between SRAM and DRAM highlights the trade-offs
  between speed, cost, and capacity in different computing scenarios.
RAM Block Diagram
The block diagram below illustrates the key components of a Random Access Memory (RAM).
- The RAM has a capacity of 128 words, with each word consisting of eight bits. This configuration
  necessitates a 7-bit address for memory addressing. The 8-bit bidirectional data bus facilitates the
  transfer of data to and from the memory.
- The memory's read and write operations are controlled by specific lines. The read line is used for
  fetching data from the RAM, while the write line is responsible for storing data into the memory.
- Two control lines, CS1 and CS2, are employed to activate the RAM chip. The RAM becomes active only when
  CS1 equals 1 and CS2 equals 0. These control lines play a crucial role in managing access to the
  memory.
- In summary, the RAM block diagram demonstrates the interplay of the address lines, data bus, read and
  write control lines, and chip-select lines to enable efficient memory operations.
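
The 7-bit address follows directly from the capacity: k address lines can select 2^k locations, so 128
words require log2(128) = 7 lines. Below is a minimal Python sketch of this calculation (the 512-location
case anticipates the ROM example later in this unit):

```python
import math

def address_lines(num_words):
    """Address lines needed to select one of num_words locations."""
    return math.ceil(math.log2(num_words))

print(address_lines(128))  # 7 -> the 128-word RAM above needs a 7-bit address
print(address_lines(512))  # 9 -> a 512-location memory needs 9 address lines
```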
Function Table
The chip's behavior for each combination of control inputs is summarized below (x = don't care):

| Case | CS1 | CS2 | RD | WR | Operation |
|------|-----|-----|----|----|-----------|
| 1    | 0   | 0   | x  | x  | RAM not activated; reading and writing not possible |
| 2    | 0   | 1   | x  | x  | RAM not activated; reading and writing not possible |
| 3    | 1   | 0   | 0  | 0  | RAM activated, but no operation requested |
| 4    | 1   | 0   | 0  | 1  | Write (store) data into RAM |
| 5    | 1   | 0   | 1  | 0  | Read (fetch) data from RAM |
| 6    | 1   | 1   | x  | x  | RAM not activated; reading and writing not possible |
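
The table can also be expressed as a small decode function. This is an illustrative sketch, not vendor
logic; it assumes the write input takes effect whenever WR = 1 on a selected chip, matching the cases
above:

```python
def ram_operation(cs1, cs2, rd, wr):
    """Return the operation implied by the function table above."""
    if cs1 == 1 and cs2 == 0:   # chip selected only when CS1 = 1 and CS2 = 0
        if wr == 1:
            return "write"      # case 4: store data into RAM
        if rd == 1:
            return "read"       # case 5: fetch data from RAM
        return "inhibit"        # case 3: selected, but no operation requested
    return "inhibit"            # cases 1, 2, 6: chip not selected

print(ram_operation(1, 0, 0, 1))  # write
print(ram_operation(0, 1, 1, 0))  # inhibit
```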
Read-Only Memory (ROM)
- ROM, an acronym for Read-Only Memory, is a type of non-volatile memory.
- Unlike Random Access Memory (RAM), ROM is designed for read-only operations, meaning that data
stored in ROM can only be read and not modified.
- It serves as a permanent storage medium, and the data within a basic ROM cannot be altered or
written to after manufacturing.
- Being non-volatile implies that the information stored in ROM is retained even when the power is
turned off.
- During the manufacturing process, information is permanently written into ROM, making it an ideal
choice for storing critical system instructions and data that should remain unchanged.
- ROM includes a specialized program known as a bootstrap loader, which plays a crucial role in
initiating the startup process of a computer. The bootstrap loader is essential for loading the
operating system into the computer's memory.
- While commonly associated with computers, ROM chips find application in various electronic devices
such as washing machines and microwave ovens, where the need for permanent and unalterable data
storage arises.
Block Diagram of ROM
- The block diagram illustrates the organization of a ROM chip, similar to RAM, but with the
distinctive feature that ROM is designed for read-only operations, eliminating the need for a
write line (WR).
- In the presented diagram, a 512-byte ROM is showcased, characterized by nine address lines.
These address lines collectively specify any one of the 512 locations within the ROM chip.
- Activation of the ROM occurs when the chip select inputs CS1 = 1 and CS2 = 0. These control
lines play a vital role in enabling access to the ROM, allowing the retrieval of stored data
from specific memory locations.
Types of ROM
- PROM (Programmable Read-Only Memory):
    - PROM is a type of read-only memory that allows users to program it once with the desired data.
    - Users purchase a blank PROM and input the required data. Once programmed, the data becomes
      permanent and cannot be modified.
    - Programming a PROM involves burning small fuses within the chip, making it a one-time programmable,
      non-erasable memory.
    - PROM is suitable for applications where the data is fixed and does not need frequent updates.
- EPROM (Erasable Programmable Read-Only Memory):
    - EPROM can be erased and reprogrammed using special electrical signals or ultraviolet (UV) rays.
    - EPROMs that employ UV rays for erasure are referred to as UVEPROM, while those using electrical
      signals are known as EEPROM (Electrically Erasable Programmable Read-Only Memory).
    - During programming, an electrical charge is stored in the EPROM, and this charge is retained for
      more than ten years, providing non-volatile storage.
    - EPROM offers the advantage of reusability, making it suitable for applications where data updates
      are required, though not as frequently as with EEPROM.
- EEPROM (Electrically Erasable Programmable Read-Only Memory):
- EEPROM is programmed and erased electrically, allowing for greater flexibility compared
to EPROM.
- It supports erasing and reprogramming cycles, typically up to ten thousand times, making
it more versatile for applications requiring frequent updates.
- Both erasing and programming in EEPROM are relatively quick, taking about 4 to 10
milliseconds (ms).
- EEPROM provides the capability to selectively erase and program specific memory
locations, offering a granular approach to data modification.
- Applications for EEPROM include settings storage in electronic devices, where frequent
updates or customization of data are necessary.
Cache Memory
- The main memory's speed is significantly lower compared to the speed of processors, impacting
overall system performance.
- To address this, a fast cache memory is employed to reduce memory access time, ensuring that the
processor can retrieve data quickly without spending excessive time accessing the main memory.
- The efficiency of the cache mechanism relies on the principle of "locality of reference," where a
set of data or instructions is accessed repeatedly.
- Cache memory, typically constructed using fast Static Random Access Memories (SRAMs), stores
frequently accessed data or instructions.
- When the processor needs data or instructions, it first checks the cache. If the required
information is found in the cache, a "Hit" occurs, avoiding the need to access the slower main
memory. If the information is not in the cache, the processor proceeds to access the main memory
and, if necessary, the secondary memory.
- The access time of cache memory is faster than that of the main memory, contributing to improved
system performance.
- The performance of cache memory is quantified by a metric known as the "hit ratio," which is the
ratio of the number of hits to the total number of memory accesses (hits and misses).
- Mathematically, the Hit Ratio (H) is calculated as: H = Number of Hits / (Number of Hits + Number of
Misses).
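
As a concrete illustration of the hit ratio, the toy simulation below counts hits and misses over a
stream of addresses. The cache here is just an unbounded Python dictionary, ignoring capacity limits and
eviction:

```python
def hit_ratio(accesses):
    cache, hits, misses = {}, 0, 0
    for addr in accesses:
        if addr in cache:
            hits += 1           # hit: data served from the cache
        else:
            misses += 1         # miss: fetch from main memory...
            cache[addr] = True  # ...and keep a copy in the cache
    return hits / (hits + misses)

# Repeated references to the same addresses (locality) raise the ratio:
print(hit_ratio([1, 2, 3, 1, 2, 3, 1, 2, 3]))  # 6 hits, 3 misses -> H = 2/3
```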
Types of Cache Memory
- Level 1 Cache (L1 Cache):
- L1 Cache is the primary cache located on the processor chip itself.
- It is designed to store a small amount of frequently accessed data and instructions for
quick retrieval.
- Due to its proximity to the processor, L1 Cache offers extremely fast access times,
contributing to enhanced overall system performance.
- L1 Cache is further divided into separate caches for instructions (L1i) and data (L1d).
- Level 2 Cache (L2 Cache):
    - L2 Cache is situated close to the processor; historically it was placed on a separate chip, while
      modern designs integrate it on the processor die.
- It has a larger capacity compared to L1 Cache and serves as a secondary cache layer.
- L2 Cache helps bridge the speed gap between the processor and the main memory.
- Both instructions and data are stored in the unified L2 Cache, making it a crucial
component for optimizing system performance.
- Level 3 Cache (L3 Cache):
- L3 Cache is a shared cache that may be located on the processor chip or a separate chip.
- It has an even larger capacity compared to L2 Cache and is shared among multiple
processor cores within a system.
- L3 Cache helps improve the overall efficiency of multi-core processors by providing a
larger pool of shared cache memory.
- Its larger size and shared nature make L3 Cache effective in handling diverse workloads
and improving system-level performance.
- Unified Cache:
- Unified Cache combines the storage of instructions and data in a single cache, unlike
separate L1i and L1d caches.
- It simplifies cache management and reduces the complexity of addressing both instruction
and data caches separately.
- Unified Caches are commonly found in modern processors to streamline memory access and
enhance overall efficiency.
Cache Mapping
- Cache mapping is a crucial technique employed to manage the transfer of content from the main memory
to the cache memory.
- It involves the transformation of data from the comparatively slower main memory to the faster cache
memory, optimizing the overall speed of data access for the processor.
- In the architecture, the CPU is directly connected to the cache memory, forming a high-speed link
crucial for rapid data retrieval.
- The cache memory, in turn, is connected to the main memory, which serves as an intermediary between
the fast cache memory and the slower secondary memory.
- The primary purpose of cache mapping is to ensure that the memory speed matches the CPU's processing
speed. By directly accessing the cache memory, the CPU minimizes the time spent waiting for data
retrieval.
- During the mapping process, data is transferred from the main memory to the cache memory,
facilitating quick access to frequently used instructions and data by the processor.
- Another related term is "paging," which involves the transfer of data from the secondary memory to
the main memory. This process complements cache mapping, ensuring that a hierarchy of memory levels
is efficiently utilized to meet the varying speed requirements of different memory types.
- 3 Types of Mapping:
- Associative Mapping:
- Associative mapping is a cache mapping technique where a block of main memory can be
placed in any cache location.
- There is no fixed relationship between the block's address in main memory and its
location in the cache.
- This flexibility allows for efficient utilization of cache space but comes at the
cost of increased complexity and hardware requirements.
- Associative mapping is well-suited for scenarios where the program's memory access
patterns are unpredictable, providing high flexibility in cache usage.
- Direct Mapping:
    - Direct mapping is a simpler cache mapping technique where each block of main memory is mapped to
      one specific location in the cache.
    - There is a fixed relationship between the block's address in main memory and its location in the
      cache, typically the block address modulo the number of cache lines (see the sketch after this
      list).
    - This method is efficient in terms of hardware complexity but may lead to cache conflicts, where
      multiple blocks contend for the same cache location.
    - Direct mapping is suitable for scenarios where memory access patterns are somewhat predictable and
      can benefit from a straightforward mapping approach.
- Set-Associative Mapping:
- Set-associative mapping combines features of both associative and direct mapping,
providing a compromise between flexibility and simplicity.
- The cache is divided into sets, and each set contains multiple lines. Each block
from main memory can be placed in any line within a specific set.
- This allows for a degree of flexibility in cache usage while addressing some of the
issues associated with direct mapping, such as cache conflicts.
- Set-associative mapping strikes a balance, making it suitable for a wide range of
applications with varying memory access patterns.
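
The sketch below illustrates direct mapping under assumed parameters (16 cache lines, block size of one
word): the cache index is the block address modulo the number of lines, and the remaining address bits
form the tag that identifies which block currently occupies a line.

```python
NUM_LINES = 16  # assumed cache size in lines

def direct_map(address):
    index = address % NUM_LINES  # the one cache line this address may use
    tag = address // NUM_LINES   # identifies which block occupies that line
    return index, tag

# Addresses that are NUM_LINES apart collide on the same line (a conflict):
print(direct_map(5))   # (5, 0)
print(direct_map(21))  # (5, 1) -> same index, different tag
```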
Locality of References
- Definition: Locality of reference refers to the phenomenon in which a computer program accesses the
  same set of memory locations very frequently over a particular period of time. It signifies the
  tendency of a program to access instructions whose addresses are near one another, a property often
  observed in loops.
Types of Localities:
- Temporal Locality:
- Temporal locality implies that current data being fetched may be needed again soon. To
exploit this, the data or instruction is stored in the cache memory, eliminating the
need to search the main memory for the same data repeatedly.
- When the CPU fetches data from RAM, it is also stored in the cache memory based on the
assumption that the same data or instruction may be needed in the near future. This
phenomenon is known as temporal locality.
- If certain data is referenced, there is a high probability that it will be referenced
again in the near future, making temporal locality a key optimization factor in caching.
- Spatial Locality:
- Spatial locality assumes that if a memory location has been accessed, there is a high
likelihood that a nearby or consecutive memory location will be accessed soon after. To
capitalize on this, nearby memory references are also stored in the cache memory for
faster access.
    - For instance, traversing a one-dimensional array benefits from spatial locality, as illustrated in
      the sketch after this list.
- Spatial locality enhances caching efficiency by anticipating and preloading adjacent
memory locations that are likely to be accessed shortly.
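
Both localities can be seen in one small loop. In the sketch below, the array elements are (conceptually)
consecutive memory locations, giving spatial locality, while the accumulator is touched on every
iteration, giving temporal locality:

```python
data = list(range(1000))

total = 0
for value in data:   # data[0], data[1], ...: neighbouring locations (spatial)
    total += value   # `total` is reused on every iteration (temporal)

print(total)  # 499500
```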
Virtual Memory
- Virtual memory creates the illusion of a larger memory space, providing the appearance of abundant
memory despite physical limitations.
- The technique of virtual memory allows users to utilize more memory for a program than the actual
physical memory capacity of a computer.
- It is a concept that conveys the illusion to users that their available main memory is equal to the
capacity of secondary storage media (or auxiliary memory).
- Need for Virtual Memory:
- Virtual memory is an imaginary memory concept used when a program exceeds the size of the
available main memory.
- It functions as temporary memory, complementing the RAM (or main memory) of the system.
- Virtual Memory (VM) is a memory management capability of an operating system that utilizes both
hardware and software to address physical memory shortages by temporarily transferring data between
RAM and disk storage.
- Example Scenario:
Consider a system with 4 GB of main memory and a program size of 16 GB. Since it is not feasible
to store the entire 16 GB program in the 4 GB main memory at once, virtual memory comes into
play. A portion of the 16 GB program that is currently in use is transferred to the main memory.
When that portion is no longer needed, it is moved back to the secondary memory in a process
known as swapping.
- The fundamental idea behind virtual memory is to keep only those parts of the program currently in
use in the memory, while the rest remains on the disk drive.
Implementation of Virtual Memory
- To implement virtual memory (VM), a designated portion of the hard disk (HDD) is allocated by
the system. This allocated portion can be either a file or a separate partition.
- In Windows, this allocated space is represented by a file named pagefile.sys, while in Linux, a
distinct partition is often used for virtual memory.
- When the system requires more memory (RAM) than is currently available, it transfers some of its
data from the main memory (RAM) to the hard disk drive.
- The additional memory doesn't physically exist in RAM; rather, it is a storage space on the
disk. This implementation is achieved through a process known as swapping, involving the
exchange of data between the main memory and the hard disk.
- This swapping mechanism allows the system to create an illusion of extended memory space,
efficiently managing memory demands beyond the physical limits of RAM.
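
Python's mmap module gives a hands-on analogy for this disk-backed memory. The sketch below is only an
illustration of the idea, not the operating system's actual pager: it maps a file on disk (standing in
for pagefile.sys or a swap partition) into the process's address space, so ordinary memory accesses are
transparently backed by disk storage.

```python
import mmap
import os

PAGE = 4096  # assume a 4 KiB page size

# Create a backing file on disk to stand in for the swap file/partition.
with open("backing_store.bin", "wb") as f:
    f.write(b"\x00" * 4 * PAGE)  # reserve four pages of disk space

with open("backing_store.bin", "r+b") as f:
    mem = mmap.mmap(f.fileno(), 4 * PAGE)  # map the file into memory
    mem[0:5] = b"hello"                    # a memory write that lands on disk
    print(mem[0:5])                        # b'hello'
    mem.close()

os.remove("backing_store.bin")  # clean up the demo file
```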
Address Space and Memory Space
- Virtual Address and Address Space:
    - An address used by a programmer is referred to as a virtual address, and the collective set of
      these addresses constitutes the address space.
    - The address space represents the range of virtual addresses that a program can use during its
      execution.
    - It provides an abstraction for the programmer, offering a seemingly continuous and expansive range
      of addresses, regardless of the underlying physical memory layout.
- Physical Address and Memory Space:
    - An address in the main memory, corresponding to a location where data or instructions are stored,
      is termed a physical address. The set of these addresses constitutes the memory space.
    - Memory space represents the actual locations within the physical memory hardware where data is
      stored and can be directly accessed.
    - It reflects the real and finite capacity of the physical memory modules installed in the computer
      system.
- Each address referenced by the CPU goes through an address mapping (or address translation) from the
  so-called virtual address to a physical address in main memory.
- The virtual memory system provides a mechanism for translating program-generated addresses into the
  correct main memory locations dynamically.
- The address translation or mapping is handled automatically by the hardware by means of a
mapping table.
Memory Table for Mapping a Virtual Address
- A virtual address is translated (mapped) to a physical address by passing through a memory mapping
  table. The process can be understood through the following steps:
- First, the virtual address is received in the virtual address register.
- Subsequently, the virtual address is sent to the memory mapping table, which holds crucial
information about the location of the address in the main memory and its accessibility.
- The details, including the location of the address in the main memory, are stored in the memory
table buffer register.
- This information is then transferred to the main memory address register, now represented as a
physical address, where it is stored.
- With the physical address, the data in the main memory can be accessed.
- The accessed data is stored in the main memory buffer register.
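
A minimal sketch of these lookup steps, assuming a 1 KiB page size and a small hypothetical mapping table
from virtual page numbers to physical page numbers:

```python
PAGE_SIZE = 1024                    # assumed page size in bytes
mapping_table = {0: 5, 1: 2, 2: 7}  # virtual page -> physical page (assumed)

def translate(virtual_address):
    page = virtual_address // PAGE_SIZE   # which virtual page is referenced
    offset = virtual_address % PAGE_SIZE  # position within that page
    if page not in mapping_table:
        raise LookupError("page fault: page is not in main memory")
    return mapping_table[page] * PAGE_SIZE + offset  # physical address

print(translate(1030))  # page 1, offset 6 -> 2 * 1024 + 6 = 2054
```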
There are two primary virtual memory management techniques that handle the mapping from virtual
addresses to physical addresses:
- Paging (Demand Paging)
Paging is a virtual memory management technique that divides both virtual and physical memory
into fixed-sized pages. When a process needs data from the disk, only the necessary pages are
loaded into the main memory. Demand Paging optimizes memory usage by bringing in data on-demand,
reducing the initial load time and allowing for more efficient utilization of resources.
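
The sketch below simulates demand paging with a FIFO page-replacement policy, assuming a main memory of
only three frames; a reference to a non-resident page counts as a page fault:

```python
from collections import deque

def count_page_faults(references, num_frames=3):
    frames = deque()  # pages currently resident, oldest first
    faults = 0
    for page in references:
        if page not in frames:
            faults += 1                   # page fault: load the page on demand
            if len(frames) == num_frames:
                frames.popleft()          # FIFO replacement: evict the oldest
            frames.append(page)
    return faults

print(count_page_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]))  # 9 faults
```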
- Segmentation (Demand Segmentation)
Segmentation is another virtual memory management approach that divides memory into
variable-sized segments. Each segment represents a logical unit, such as a function or a data
structure. When a process requires a specific segment, only that segment is loaded into memory.
Demand Segmentation allows for flexibility in managing memory spaces and is particularly useful
in scenarios where the size of data structures varies dynamically.
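
For contrast, here is a minimal sketch of segment-based translation, assuming a hypothetical segment
table in which each entry holds a (base, limit) pair describing where the segment sits in main memory:

```python
segment_table = {0: (1000, 400), 1: (2400, 100)}  # segment -> (base, limit)

def segment_to_physical(segment, offset):
    base, limit = segment_table[segment]
    if offset >= limit:
        raise MemoryError("offset lies outside the segment")
    return base + offset  # physical address = segment base + offset

print(segment_to_physical(0, 53))  # 1053
print(segment_to_physical(1, 99))  # 2499
```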