Cache
Question 1 |
13.5 |
Hit ratio of cache=0.94
Word size is 64 bits = 8 bytes.
Cache line size = 256 bytes = 32 words
Main memory access time=20ns(time for first word)+155ns(time for remaining 31 words, 31*5=155ns) = 175 ns
Average access time = h1*t1+(1-h1)(t1+t2) = t1+(1-h1)t2
⇒ 3+(0.06)(175) = 13.5 ns
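As a sanity check, the arithmetic can be reproduced with a short Python sketch (not part of the original solution; the 3 ns cache access time is the value used above):

hit_ratio = 0.94          # cache hit ratio
t_cache = 3               # cache access time in ns (value used in the solution)
t_memory = 20 + 31 * 5    # first word + remaining 31 words of the 32-word line, in ns

amat = t_cache + (1 - hit_ratio) * t_memory   # t1 + (1 - h1) * t2
print(t_memory, round(amat, 2))               # 175 13.5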
Question 2 |
A1 = 0x42C8A4, A2 = 0x546888, A3 = 0x6A289C, A4 = 0x5E4880
Which one of the following is TRUE?
A1 and A4 are mapped to different cache sets. | |
A1 and A3 are mapped to the same cache set. | |
A3 and A4 are mapped to the same cache set. | |
A2 and A3 are mapped to the same cache set. |
Each of the physical addresses mentioned contains 6 hexadecimal digits, so the physical address is 24 bits long.
Block size is 256 bytes, block offset = 8 bits as it is a byte addressable memory.
Cache size = 64KB
Number of blocks in the cache = 64KB/256B = 256
It is a 4-way set associative cache, so no. of sets in the cache = 256/4 = 64 = 2^6
In the physical address we need 6 bits for the SET number.
TAG bits = 24 - 6 - 8 = 10
So the 24-bit physical address is divided as (10 TAG bits + 6 SET number bits + 8 OFFSET bits)
The given physical addresses are in hexadecimal. So when we convert them into binary each hexadecimal digit will take 4 binary bits.
From the binary form of the addresses the least significant 8 bits are for the OFFSET, the next 6 bits are for the SET number. And the next 10 bits are for the TAG bits.
The SET-number bits are shown to the right of each of the given addresses below.
A1 = 0x42C8A4 = (0100 0010 1100 1000 1010 0100) ==> set bits = 00 1000
A2 = 0x546888 = (0101 0100 0110 1000 1000 1000) ==> set bits = 10 1000
A3 = 0x6A289C = (0110 1010 0010 1000 1001 1100) ==> set bits = 10 1000
A4 = 0x5E4880 = (0101 1110 0100 1000 1000 0000) ==> set bits = 00 1000
From the given options option-4 is TRUE as A2, A3 are mapped to the same cache SET.
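The set-index computation can also be checked with a small Python sketch (written for this explanation, not part of the original solution): it shifts out the 8 offset bits and masks off the 6 set bits.

addresses = {"A1": 0x42C8A4, "A2": 0x546888, "A3": 0x6A289C, "A4": 0x5E4880}
OFFSET_BITS, SET_BITS = 8, 6

for name, addr in addresses.items():
    set_index = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    print(name, format(set_index, "06b"))
# Prints 001000, 101000, 101000, 001000 -> A2 and A3 share a set.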
Question 3 |
The size of the physical address space of a processor is 2^P bytes. The word length is 2^W bytes. The capacity of cache memory is 2^N bytes. The size of each cache block is 2^M words. For a K-way set-associative cache memory, the length (in number of bits) of the tag field is
P - N - log2K | |
P - N + log2K | |
P - N - M - W - log2K | |
P - N - M - W + log2K |
Each word is of size 2^W bytes.
Number of words in physical memory = 2^(P-W)
So the physical (word) address is P-W bits.
Cache size is 2^N bytes.
Number of words in the cache = 2^(N-W)
Block size is 2^M words.
No. of blocks in the cache = 2^(N-W-M)
Since it is a K-way set associative cache, each set in the cache will have K blocks.
No. of sets = 2^(N-W-M) / K
SET bits = N-W-M-log2K
Block offset = M
TAG bits = (P-W) - (N-W-M-log2K) - M = P-W-N+W+M+log2K-M = P - N + log2K, which is option 2.
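A quick numeric spot check of the formula, with assumed values P = 32, W = 2, N = 15, M = 3, K = 4 (these numbers are illustrative, not from the question):

from math import log2

P, W, N, M, K = 32, 2, 15, 3, 4          # assumed example sizes

word_address_bits = P - W                # 2^(P-W) words in physical memory
sets = 2 ** (N - W - M) // K             # blocks in cache divided by K
set_bits = int(log2(sets))
tag_bits = word_address_bits - set_bits - M

print(tag_bits, P - N + int(log2(K)))    # both print 19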
Question 4 |
Consider a two-level cache hierarchy with L1 and L2 caches. An application incurs 1.4 memory accesses per instruction on average. For this application, the miss rate of L1 cache is 0.1; the L2 cache experiences, on average, 7 misses per 1000 instructions. The miss rate of L2 expressed correct to two decimal places is __________.
0.05 | |
0.06 | |
0.07 | |
0.08 |
For 1000 instructions total number of memory references = 1000 * 1.4 = 1400
These 1400 memory references are first accessed in the L1.
Since the miss rate of L1 is 0.1, for 1400 L1 references the number of misses = 0.1 * 1400 = 140
We know when there is a miss in L1 we next access the L2 cache.
So number of memory references to L2 = 140
It is given that there are 7 misses in the L2 cache per 1000 instructions, so out of the 140 memory references to the L2 cache there are 7 misses.
Hence the miss rate in L2 cache = 7/140 = 0.05
Question 5 |
Consider a 2-way set associative cache with 256 blocks and uses LRU replacement. Initially the cache is empty. Conflict misses are those misses which occur due to contention of multiple blocks for the same cache set. Compulsory misses occur due to first time access to the block. The following sequence of accesses to memory blocks
(0, 128, 256, 128, 0, 128, 256, 128, 1, 129, 257, 129, 1, 129, 257, 129)
is repeated 10 times. The number of conflict misses experienced by the cache is __________.
76 | |
79 | |
80 | |
81 |
If a block is accessed, then k unique blocks are accessed before its second access, and the cache has fewer than k blocks, a miss on that second access is a capacity miss: the cache simply cannot hold all k unique blocks, so the original block is no longer present when it is accessed again.
If instead the cache can hold more than k blocks and the second access still misses, it is a conflict miss: the cache could have held all k unique blocks, but the original block was evicted because too many blocks contended for its set.
Number of cache blocks = 256
It is a 2-way set associative cache, so no. of cache sets = 256/2 = 128
A block numbered x therefore maps to set (x mod 128).
[Access table not shown: blue marked the compulsory misses and red the conflict misses.]
At the end of the first round we have 6 compulsory misses (blocks 0, 128, 256, 1, 129, 257) and 4 conflict misses.
Similarly, in every round from Round 2 to the last round we get 8 conflict misses.
Total conflict misses = 4+9(8) = 4+72 = 76 (conflict misses)
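The whole count can be verified by simulating the 128-set, 2-way LRU cache in Python (a sketch written for this explanation, not part of the original answer); a miss on a block seen before is counted as a conflict miss, since the working set easily fits in the cache.

from collections import OrderedDict

NUM_SETS, WAYS = 128, 2
sets = [OrderedDict() for _ in range(NUM_SETS)]   # per-set LRU order: oldest first
seen = set()                                      # blocks referenced at least once
compulsory = conflict = 0

trace = [0, 128, 256, 128, 0, 128, 256, 128,
         1, 129, 257, 129, 1, 129, 257, 129] * 10

for block in trace:
    s = sets[block % NUM_SETS]
    if block in s:
        s.move_to_end(block)                      # hit: make it most recently used
        continue
    if block in seen:
        conflict += 1                             # re-reference that was evicted earlier
    else:
        compulsory += 1                           # first-ever reference
        seen.add(block)
    if len(s) == WAYS:
        s.popitem(last=False)                     # evict the least recently used block
    s[block] = None

print(compulsory, conflict)                       # 6 76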
Question 6 |
A cache memory unit with capacity of N words and block size of B words is to be designed. If it is designed as a direct mapped cache, the length of the TAG field is 10 bits. If the cache unit is now designed as a 16-way set-associative cache, the length of the TAG field is ___________ bits.
14 | |
15 | |
16 | |
17 |
In the direct mapped case the physical address is divided as (Tag bits + bits for block number + bits for block offset).
With block size being B words no. of bits for block offset = log (B)
Because the cache capacity is N words and each block is B words, number of blocks in cache = N / B
No. of bits for block number = log (N/B)
So, the physical address in direct mapping case
= 10 + log (N/B) + log (B)
= 10 + log (N) – log B + log B
= 10 + log (N)
If the same cache unit is designed as 16-way set associative, then the physical address becomes
(Tag bits + bits for set no. + Bits for block offset)
There are N/B blocks in the cache and in 16-way set associative cache each set contains 16 blocks.
So no. of sets = (N/B) / 16 = N / (16*B)
Then bits for set no = log (N/16B)
Bits for block offset remain the same in this case also. That is log (B).
So physical address in the set associative case
= tag bits + log (N/(16*B)) + log B
= tag bits + log (N) – log (16*B) + log B
= tag bits + log (N) – log 16 – log B + log B
= tag bits + log N – 4
The physical address length is the same in both cases.
So, 10 + log N = tag bits + log N – 4
Tag bits = 14
So, no. of tag bits in the case 16-way set associative mapping for the same cache = 14.
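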
Question 7 |
In a two-level cache system, the access times of L1 and L2 caches are 1 and 8 clock cycles, respectively. The miss penalty from the L2 cache to main memory is 18 clock cycles. The miss rate of L1 cache is twice that of L2. The average memory access time (AMAT) of this cache system is 2 cycles. The miss rates of L1 and L2 respectively are:
0.111 and 0.056 | |
0.056 and 0.111 | |
0.0892 and 0.1784 | |
0.1784 and 0.0892 |
AMAT = (L1 hit rate)*(L1 access time) + (L1 miss rate)*(L1 access time + L1 miss penalty)
= L1 hit rate * L1 access time + L1 miss rate * L1 access time + L1 miss rate * L1 miss penalty
We can write,
L1 miss rate = 1 - L1 hit rate
AMAT = L1 hit rate * L1 access time + (1 - L1 hit rate) * L1 access time + L1 miss rate * L1 miss penalty
By taking L1 access time common,
= (L1 hit rate + 1 - L1 hit rate)* L1 access time + L1 miss rate * L1 miss penalty
AMAT = L1 access time + L1 miss rate * L1 miss penalty
We access L2 only when there is a miss in L1.
So, L1 miss penalty is nothing but the effective time taken to access L2.
L1_miss_penalty = Hit_rate_of_L2* Access time of L2 + MissRate of L2 *(Access time of L2+ miss penalty L2)
= Hit_rate_of_L2* Access time of L2 + MissRate of L2 *Access time of L2 + MissRate of L2 * miss penalty L2
By taking Access time of L2 common we get,
= Access time of L2 * (Hit_rate_of_L2 + MissRate of L2 ) + MissRate of L2 * miss penalty L2
We know, MissRate of L2 = 1 - Hit_rate_of_L2 → Hit_rate_of_L2 + MissRate of L2 = 1
So, the above formula becomes,
L1_miss_penalty = Access time of L2 + (MissRate of L2 * miss penalty L2)
It is given,
access time of L1 = 1,
access time of L2 = 8,
miss penalty of L2 = 18,
AMAT = 2.
Let, miss rate of L2 = x.
Since it is given that the L1 miss rate is twice the L2 miss rate, L1 miss rate = 2 * x.
Substituting the above values,
L1_miss_penalty = Access time of L2 + (MissRate of L2 * miss penalty L2)
L1_miss_penalty = 8 + (x*18)
AMAT = L1 access time + L1 miss rate * L1 miss penalty
2 = 1 + (2x)(8 + 18x)
1 = 16x + 36x^2
36x^2 + 16x - 1 = 0
By solving the above quadratic equation we get,
x = Miss rate of L2 = 0.056
Miss rate of L1 = 2*x = 0.111
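The quadratic can be solved with a few lines of Python (a check on the arithmetic, not part of the original solution):

from math import sqrt

a, b, c = 36, 16, -1
x = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)   # positive root = L2 miss rate
print(round(x, 3), round(2 * x, 3))            # 0.056 0.111

amat = 1 + (2 * x) * (8 + 18 * x)              # plug back into the AMAT equation
print(round(amat, 2))                          # 2.0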
Question 8 |
Consider a machine with a byte addressable main memory of 2^32 bytes divided into blocks of size 32 bytes. Assume that a direct mapped cache having 512 cache lines is used with this machine. The size of the tag field in bits is _____________.
18 | |
19 | |
20 | |
21 |
So the physical address is 32 bits long.
Each block is of size 32 (= 2^5) bytes, so the block offset is 5 bits.
Also, there are 512 (= 2^9) cache lines; since it is a direct mapped cache we need 9 bits for the LINE number.
For a direct mapped cache, the physical address is divided as
(Tag bits + bits for block/LINE number + bits for block offset)
So, tag bits + 9 + 5 = 32
Tag bits = 32 - 14 = 18
Question 9 |
The read access times and the hit ratios for different caches in a memory hierarchy are as given below.
[Table: read access times and hit ratios of the I-cache, D-cache and L2 cache — see the values used in the solution below.]
The read access time of main memory is 90 nanoseconds. Assume that the caches use the referred-word-first read policy and the write back policy. Assume that all the caches are direct mapped caches. Assume that the dirty bit is always 0 for all the blocks in the caches. In execution of a program, 60% of memory reads are for instruction fetch and 40% are for memory operand fetch. The average read access time in nanoseconds (up to 2 decimal places) is ___________.
4.72 | |
4.73 | |
4.74 | |
4.75 |
Hierarchical memory (Default case):
For 2-level memory:
The formula for average memory access time = h1 t1 + (1-h1)(t1 + t2)
This can be simplified as
t1 + (1-h1)t2
For 3-level memory:
h1 t1 + (1-h1)(t1 + h2 t2 + (1-h2)(t2 + t3))
This can be simplified as
t1 + (1-h1)t2 + (1-h1)(1-h2)t3
Instruction fetch happens from I-cache whereas operand fetch happens from D-cache.
Using that we need to calculate the instruction fetch time (Using I-cache and L2-cache) and operand fetch time (Using D-cache and L2-cache) separately.
Then calculate 0.6 (instruction fetch time) + 0.4(operand fetch time) to find the average read access time.
The equation for instruction fetch time = t1 + (1-h1 ) t2 + (1-h1 )(1-h2 ) t3
= 2 + 0.2*8 + 0.2*0.1*90 = 5.4ns
Operand fetch time = t1 + (1-h1)t2 + (1-h1)(1-h2)t3 = 2 + 0.1*8 + 0.1*0.1*90 = 3.7ns
The average read access time = 0.6*5.4 + 0.4*3.7 = 4.72ns
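A short Python sketch (not part of the original solution) that reproduces the two fetch times and the weighted average, using the values read from the table:

def read_time(t1, h1, t2, h2, t3):
    # hierarchical 3-level read: t1 + (1 - h1)*t2 + (1 - h1)*(1 - h2)*t3
    return t1 + (1 - h1) * t2 + (1 - h1) * (1 - h2) * t3

ifetch = read_time(t1=2, h1=0.8, t2=8, h2=0.9, t3=90)   # I-cache -> L2 -> main memory
ofetch = read_time(t1=2, h1=0.9, t2=8, h2=0.9, t3=90)   # D-cache -> L2 -> main memory
print(round(ifetch, 1), round(ofetch, 1), round(0.6 * ifetch + 0.4 * ofetch, 2))
# 5.4 3.7 4.72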
Question 10 |
The width of the physical address on a machine is 40 bits. The width of the tag field in a 512 KB 8-way set associative cache is _________ bits.
24 | |
25 | |
26 | |
27 |
The physical address is divided as (Tag bits + bits for set no. + bits for block offset).
In the question the block size has not been given, so assume a block size of 2^x bytes.
The cache is of size 512 KB, so the number of blocks in the cache = 2^19 / 2^x = 2^(19-x)
It is an 8-way set associative cache, so there will be 8 blocks in each set.
So the number of sets = 2^(19-x) / 8 = 2^(16-x)
So the number of bits for the set index = 16-x
Let the number of bits for the Tag = T
Since we assumed the block size is 2^x bytes, the number of bits for the block offset is x.
So, T + (16−x) + x = 40
T + 16 = 40
T = 24
Question 11 |
A file system uses an in-memory cache to cache disk blocks. The miss rate of the cache is shown in the figure. The latency to read a block from the cache is 1 ms and to read a block from the disk is 10 ms. Assume that the cost of checking whether a block exists in the cache is negligible. Available cache sizes are in multiples of 10 MB.
[Figure: cache miss rate as a function of cache size]
The smallest cache size required to ensure an average read latency of less than 6 ms is _______ MB.
30 | |
31 | |
32 | |
33 |
So we consider it as hierarchical memory.
But it is given that “assume that the cost of checking whether a block exists in the cache is negligible”, which means don't consider the checking time in the cache when there is a miss.
So formula for average access time becomes h1t1 + (1-h1)(t2) which is same as for simultaneous access.
Though the memory is hierarchical because of the statement given in the question we ignored the cache access time when there is a miss and effectively the calculation became like simultaneous access.
The average access time or read latency = h1t1 + (1-h1)t2.
It is given that the average read latency has to be less than 6ms.
So, h1t1 + (1-h1)t2 < 6
From the given information t1 = 1ms, t2 = 10ms
h1*1+(1-h1)10 < 6
10-9h1 < 6
-9h1 < -4
h1 > 4/9 ≈ 0.444
Since the graph gives the miss rate, and the miss rate is 1-h1, rewrite the inequality in terms of 1-h1:
1-h1 < 1 - 4/9 = 5/9 ≈ 0.556
So for the average read latency to be less than 6 ms, the miss rate has to be less than about 55.6%.
From the given graph, the largest miss rate below 55.6% is 40%.
For 40% miss rate the corresponding cache size is 30MB.
Hence the answer is 30MB.
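The threshold on the miss rate can also be computed directly (a small Python check, not part of the original solution):

t_cache, t_disk, target = 1, 10, 6        # all in ms

# average latency = (1 - m) * t_cache + m * t_disk  must be < target
max_miss_rate = (target - t_cache) / (t_disk - t_cache)
print(round(max_miss_rate, 3))            # 0.556 -> miss rate must stay below ~55.6%
# From the figure, the smallest cache size meeting this is 30 MB (40% miss rate).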
Question 12 |
14 | |
15 | |
16 | |
17 |
Question 13 |
Consider a machine with a byte addressable main memory of 2^20 bytes, block size of 16 bytes and a direct mapped cache having 2^12 cache lines. Let the addresses of two consecutive bytes in main memory be (E201F)₁₆ and (E2020)₁₆. What are the tag and cache line address (in hex) for main memory address (E201F)₁₆?
E, 201 | |
F, 201 | |
E, E20 | |
2, 01F |
The block size is 16 bytes, so the lowest 4 bits of the address are the byte offset. The cache has 2^12 lines, which needs 12 bits, so the next 12 bits are the line index.
The remaining top 4 bits (out of 20) are the tag bits. So the tag is E and the line address is 201, and the answer is (A).
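The field split can be checked with a couple of shift-and-mask operations in Python (illustrative sketch, not part of the original solution):

addr = 0xE201F
OFFSET_BITS, LINE_BITS = 4, 12            # 16-byte blocks, 2^12 cache lines

offset = addr & ((1 << OFFSET_BITS) - 1)
line = (addr >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
tag = addr >> (OFFSET_BITS + LINE_BITS)
print(hex(tag), hex(line), hex(offset))   # 0xe 0x201 0xf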
Question 14 |
n⁄N | |
1⁄N | |
1⁄A | |
k⁄n |
As the set associativity keeps increasing, no block ever has to be replaced, so LRU is of no significance here.
Question 15 |
20 | |
21 | |
22 | |
23 |
Cache size = 16K bytes = 2^14 bytes
Block size = 8 words = 8 × 4 bytes = 32 bytes = 2^5 bytes
(where each word = 4 bytes)
No. of blocks = 2^14 / 2^5 = 2^9
Block offset = 5 bits
Because it is a 4-way set associative cache, no. of sets = 2^9 / 4 = 2^7
Set index = 7 bits
TAG = 32 - (7 + 5) = 20 bits

Question 16 |
A queue cannot be implemented using this stack. | |
A queue can be implemented where ENQUEUE takes a single instruction and DEQUEUE takes a sequence of two instructions. | |
A queue can be implemented where ENQUEUE takes a sequence of three instructions and DEQUEUE takes a single instruction. | |
A queue can be implemented where both ENQUEUE and DEQUEUE take a single instruction each. |
Suppose:

Dequeue:

If we want to delete (dequeue) an element, we first need to remove 1, the element that was inserted first.
Enqueue:

Question 17 |
A smaller block size implies better spatial locality | |
A smaller block size implies a smaller cache tag and hence lower cache tag overhead | |
A smaller block size implies a larger cache tag and hence lower cache hit time | |
A smaller block size incurs a lower cache miss penalty |
Question 18 |
Width of tag comparator | |
Width of set index decoder | |
Width of way selection multiplexor | |
Width of processor to main memory data bus |
The width of the set index decoder is also affected: doubling the associativity (with cache size and block size unchanged) halves the number of sets, so the set index loses one bit.
A k-way set associative cache needs a k-to-1 way selection multiplexer; if the associativity is doubled, the width of the way selection multiplexer also doubles.
The width of the processor to main memory data bus is guaranteed NOT to be affected, as it does not depend on the cache associativity.
Question 19 |
(j mod v) * k to (j mod v) * k + (k-1) | |
(j mod v) to (j mod v) + (k-1) | |
(j mod k) to (j mod k) + (v-1) | |
(j mod k) * v to (j mod k) * v + (v-1) |
Question 20 |
4 | |
5 | |
6 | |
7 |
Capacity of the available chips = 1K
No. of 1K chips needed to build the 16K RAM = 16K / 1K = 16
To select one of these 16 chips we would need a 4 × 16 decoder, but we are given only 2 × 4 decoders.
So 4 decoders are required at the inner level, since one 2 × 4 decoder provides only 4 output lines whereas we need 16.
To select among these 4 inner decoders, another 2 × 4 decoder is required at the outer level.
Hence the no. of 2 × 4 decoders needed to realize this RAM = 1 + 4 = 5
Question 21 |
11 | |
14 | |
16 | |
27 |
Cache block size = 32 Bytes
So, number of blocks in the cache = 256K / 32 = 8 K
It is a 4-way set associative cache. Each set has 4 blocks.
So, number of sets in cache = 8 K / 4 = 2 K = 2^11.
So, 11 bits are needed for accessing a set. Inside a set we need to identify the cache block.
Since cache block size is 32 bytes, block offset needs 5 bits.
Out of 32 bit address, no. of TAG bits = 32 - 11 - 5 = 32-16 = 16
So, we need 16 tag bits.
Question 22 |
160 Kbits | |
136 Kbits | |
40 Kbits | |
32 Kbits |
Cache block size = 32 Bytes
So, number of blocks in the cache = 256K / 32 = 8 K
It is a 4-way set associative cache. Each set has 4 blocks.
So, number of sets in cache = 8 K / 4 = 2 K = 2^11.
So, 11 bits are needed for accessing a set. Inside a set we need to identify the cache block.
Since cache block size is 32 bytes, block offset needs 5 bits.
Out of 32 bit address, no. of TAG bits = 32 - 11 - 5 = 32-16 = 16
So, we need 16 tag bits.
It is given that in addition to address tag there are 2 valid bits, 1 modified bit and 1 replacement bit.
So size of each tag entry = 16 tag bits + 2 valid bits + 1 modified bit + 1 replacement bit = 20 bits
Size of cache tag directory = Size of tag entry × Number of tag entries = 20 × 8K = 160 Kbits
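The same numbers fall out of a short Python sketch (a recomputation of the solution above, not part of the original answer):

cache_size, block_size, ways, address_bits = 256 * 1024, 32, 4, 32

blocks = cache_size // block_size             # 8192
sets = blocks // ways                         # 2048
set_bits = (sets - 1).bit_length()            # 11
offset_bits = (block_size - 1).bit_length()   # 5
tag_bits = address_bits - set_bits - offset_bits

entry_bits = tag_bits + 2 + 1 + 1             # tag + 2 valid + 1 modified + 1 replacement
print(tag_bits, entry_bits * blocks // 1024)  # 16 160  (i.e., 160 Kbits)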
Question 23 |
4864 bits | |
6144 bits | |
6656 bits | |
5376 bits |
So we need 8 bits for indexing the 256 blocks in the cache. And since a block is 32 bytes, we need 5 offset bits to address each byte within a block.
So out of the 32-bit address, the remaining 32 - 8 - 5 = 19 bits are tag bits.
So tag entry size = 19 + 1 (valid bit) + 1 (modified bit) = 21 bits
∴ Total size of metadata = 21 × Number blocks = 21 × 256 = 5376 bits
Question 24 |
3 | |
8 | |
129 | |
216 |
So number of sets = 16 / 4 = 4
Given main memory blocks will be mapped to one of the 4 sets.
The given blocks are 0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155, and they will be mapped to the following sets (since there are 4 sets, the set number is the block number mod 4):
0, 3, 1, 0, 3, 0, 1, 3, 0, 1, 3, 0, 0, 0, 1, 0, 3
The cache mapping and the replacement using LRU can be seen from the below diagram.

We can see that, after mapping the given block pattern, block number 216 is no longer in the cache.
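The diagram can be reproduced with a small LRU simulation in Python (written for this explanation, not part of the original solution):

from collections import OrderedDict

blocks = [0, 255, 1, 4, 3, 8, 133, 159, 216, 129, 63, 8, 48, 32, 73, 92, 155]
NUM_SETS, WAYS = 4, 4
cache = [OrderedDict() for _ in range(NUM_SETS)]   # per-set LRU order: oldest first

for b in blocks:
    s = cache[b % NUM_SETS]
    if b in s:
        s.move_to_end(b)              # hit: make it most recently used
        continue
    if len(s) == WAYS:
        s.popitem(last=False)         # evict the least recently used block
    s[b] = None

print([list(s) for s in cache])
# Set 0 ends up holding 8, 48, 32, 92 -- block 216 has been evicted.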
Question 25 |
IV only | |
I and IV only | |
I, III and IV only | |
I, II, III and IV |
Statement I is not always true: the data in L1 and L2 need not be identical at every point of time, since a write-back policy (rather than write-through) can be used between L1 and L2.
Statement II is not required when considering only L1 and L2: whether L2 is write-through affects the levels below L2, not L1.
For statement III, the associativities can also be equal, so it need not be true.
For statement IV, the L2 cache must be at least as large as the L1 cache: L2 is generally larger than L1, and we never have an L2 cache smaller than L1. So only statement IV is necessarily true; hence option A is the answer.
Question 26 |
32Kbits | |
34Kbits | |
64Kbits | |
68Kbits |
Block size = 16 bytes = 2^4 bytes, so block offset = 4 bits
Number of blocks in the cache = 64KB / 16B = 4K
It is 2-way set associative cache, so each set contains 2 blocks.
So, number of sets = 4K / 2 = 2K = 2^11
So, we need 11-bits for set indexing.
Since the address is 32 bits long, number of tag bits = 32−11−4 = 17
Total size of the tag directory = No. of tag bits ×Number of cache blocks =17×4K
=68 Kbits
Question 27 |
ARR [0] [4] | |
ARR [4] [0] | |
ARR [0] [5] | |
ARR [5] [0] |
Block size = 16 Bytes
Number of blocks in the cache = 64KB / 16B = 4K
It is 2-way set associative cache, so each set contains 2 blocks.
So, number of sets = 4K / 2 = 2K = 2^11
Each element size = 8 B and the size of each block = 16 B.
No. of elements in one block = 16/8 = 2. Each row of the array has 1024 elements, so the no. of blocks for one row of the array = 1024/2 = 512.
We know there are 211 sets, and each set has two blocks.
For any element to share the same cache index as ARR[0][0], it has to belong to the same set.
ARR[0][0] belongs to block 0, which maps to set 0. The next block that maps to set 0 is block 2048, because 2048 mod 2^11 = 0.
Block number 2048 will be from row number 2048/512 = 4, because each row has 512 blocks we divide the block number with 512 to get the row number.
From the given options ARR[4][0] will have the same cache index as ARR[0][0].
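The mapping can be confirmed with a small Python sketch (assuming, as the solution does, that the array starts at a block-aligned address):

BLOCK, ELEM, SETS, COLS = 16, 8, 2048, 1024    # bytes per block, bytes per element

def cache_set(i, j):
    byte_offset = (i * COLS + j) * ELEM        # offset of ARR[i][j] from ARR[0][0]
    return (byte_offset // BLOCK) % SETS

base = cache_set(0, 0)
for i, j in [(0, 4), (4, 0), (0, 5), (5, 0)]:
    print(i, j, cache_set(i, j) == base)       # only (4, 0) prints True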
Question 28 |
0% | |
25% | |
50% | |
75% |
So in one block 2 elements will be stored.
To store 1024×1024 elements no. of blocks needed = 1024×1024/2 = 2^20/2 = 2^19.
In each block the first element will be a miss and the second element will be a hit because on every miss the entire block is brought into the cache. So, there will be 2^19 hits and 2^19 misses. Total no. of references = no. of elements in the array = 2^20
⇒ hit ratio = No. of hits / Total no. of references
= 2^19/2^20 = 1/2 = 0.5
=0.5×100=50%
Question 29 |
7, 6, 7 | |
8, 5, 7 | |
8, 6, 6 | |
9, 4, 7 |

Question 30 |
000011000 | |
110001111 | |
00011000 | |
110010101 |
So let's first convert the given hexadecimal number into binary:

Question 31 |
9, 6, 5 | |
7, 7, 6 | |
7, 5, 8 | |
9, 5, 6 |
Each line is 64 words, so the no. of bits for the WORD field = 6.
Because it is a 4-way set associative cache, the no. of sets in the cache = 128/4 = 32 = 2^5
No. of bits for the set number = 5
Because the address is 20-bits long, no. of TAG bits = 20-5-6 = 9
Question 32 |
40 | |
50 | |
56 | |
59 |
No. of bits in the physical address = log2(2^16) = 16 bits
The cache line size is 64 bytes, so the no. of offset bits required is
log2(2^6) = 6 bits
The no. of lines in the cache is 32, so the no. of line bits required is
log2(2^5) = 5 bits
So, tag bits = 16 - 6 - 5 = 5 bits
No. of elements in array is
50 × 50 = 2500
and each element is of 1 byte.
So, total size of array is 2500 bytes.
So, total no. of lines it will require to get into the cache,
⌈2500/64⌉ = 40
Now, given starting address of array,

Now we need 40 cache lines to hold all array elements, but we have only 32 cache lines.
Now let's divide the 2500 array elements into 40 lines, each of which will contain 64 of its elements.
Now,

So, if complete array is accessed twice, total no. of cache misses is,

Question 33 |
line 4 to line 11 | |
line 4 to line 12 | |
line 0 to line 7 | |
line 0 to line 8 |
Question 34 |
3 | |
18 | |
20 | |
30 |

So memory block 18 is not in the cache.
Question 35 |
2.4 ns | |
2.3 ns | |
1.8 ns | |
1.7 ns |
Block size = 32 bytes
No. of blocks per set = 2 (2-way set associative)
No. of sets = Cache size / (Block size × No. of blocks per set) = (32 × 2^10 B) / (32 × 2) = 2^9
Set index bits = 9
No. of block offset bits = 5 (∵ cache block size is 32 bytes = 2^5 bytes)
No. of Tag bits = 32 - 9 - 5 = 18
Hit latency = multiplexer latency + tag comparator latency
= 0.6 + (18/10)
= 0.6 + 1.8
= 2.4 ns
Question 36 |
for (i=0; i<512; i++) { for (j=0; j<512; j++) { x += A[i][j]; } } |
for (i=0; i<512; i++) { for (j=0; j<512; j++) { x += A[j][i]; } } |
0 | |
1/16 | |
1/8 | |
16 |
(For P2, which accesses A in column-major order, every element access results in a miss.)
M1/M2 = 16384/262144 = 1/16
Question 37 |
for (i=0; i<512; i++) { for (j=0; j<512; j++) { x += A[i][j]; } } |
for (i=0; i<512; i++) { for (j=0; j<512; j++) { x += A[j][i]; } } |
0 | |
2048 | |
16384 | |
262144 |
P1 accesses A in row-major order and P2 accesses A in column-major order.
No. of cache blocks = Cache size / Block size = 32KB / 128B = 32 × 2^10 B / 128 B = 2^15 / 2^7 = 256
No. of array elements in each block = Block size / Element size = 128B / 8B = 16
Total misses for P1 = (No. of array elements) / (No. of elements per block) = (512 × 512) / 16 = 16384
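Both miss counts can be reproduced by a direct simulation in Python (a sketch assuming a block-aligned array and a direct-mapped cache; the counts 16384 and 262144 come out the same for the usual small associativities):

N, ELEM, BLOCK, LINES = 512, 8, 128, 256       # 32 KB cache of 128-byte blocks

def misses(order):
    cache = [None] * LINES                     # line -> block number currently held
    count = 0
    for r, c in order:
        block = (r * N + c) * ELEM // BLOCK
        line = block % LINES
        if cache[line] != block:
            count += 1
            cache[line] = block
    return count

row_major = ((r, c) for r in range(N) for c in range(N))   # P1: x += A[i][j]
col_major = ((r, c) for c in range(N) for r in range(N))   # P2: x += A[j][i]
print(misses(row_major), misses(col_major))                # 16384 262144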
Question 38 |
2.4 ns | |
2.3 ns | |
1.8 ns | |
1.7 ns |
For k (the number of tag bits):

∴ Hit latency = k/10 = 17/10 = 1.7 ns
Question 39 |
92 ns | |
104 ns | |
172 ns | |
184 ns |
Latency of one bank access = 80 ns
k = 24 banks
Decoding the bank numbers for one parallel access takes k/2 ns = 24/2 = 12 ns.
Decoding time = 12 ns
Width of each bank c = 2 bytes
Each bank is 2 bytes wide and there are 24 banks, so one iteration of parallel accesses fetches 2 × 24 = 48 bytes.
Fetching a 64-byte cache block therefore requires 2 iterations.
Time per iteration T = decoding time + latency = 12 + 80 = 92 ns
For 2 iterations = 92 × 2 = 184 ns
Question 40 |
A cache line is 64 bytes. The main memory has latency 32 ns and bandwidth 1 GBytes/s. The time required to fetch the entire cache line from the main memory is
32 ns | |
64 ns | |
96 ns | |
128 ns |
→ The bandwidth is 1 GByte/s, so transferring 64 bytes takes 64 / 10^9 s = 64 ns.
Main memory latency = 32 ns
Total time required to fetch the entire cache line
= 64 + 32 = 96 ns
Question 41 |

1K × 18-bit, 1K × 19-bit, 4K × 16-bit | |
1K × 16-bit, 1K × 19-bit, 4K × 18-bit | |
1K × 16-bit, 512 × 18-bit, 1K × 16-bit | |
1K × 18-bit, 512 × 18-bit, 1K × 18-bit |
Bits to represent blocks = m
No. of words in a block = 2^n
Bits to represent a word = n
Tag bits = (length of physical address of a word) - (bits to represent blocks) - (bits to represent a word)
⇒ Each block will have its own tag bits.
So, total tag bits = no. of blocks × tag bits.
Question 42 |
10, 17 | |
10, 22 | |
15, 17 | |
5, 17 |
So indexing requires 10 bits.
No. of offset bit required to access 32 byte block = 5
So, No. of TAG bits = 32 - 10 - 5 = 17
Question 43 |
10 | |
6.4 | |
1 | |
.64 |
There are 100 refreshes in every 10^6 ns (1 ms).
Each refresh takes 100 ns.
Memory cycle time = 64 ns
Refresh time per 1 ms, i.e., per 10^6 ns = 100 × 100 = 10^4 ns
Refresh time per 1 ns = 10^4 / 10^6 ns
Refresh time per memory cycle = (10^4 × 64) / 10^6 = 0.64 ns
Percentage of the memory cycle time used for refreshing = (0.64 / 64) × 100 = 1%
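In Python (a check on the arithmetic, not part of the original solution):

refreshes_per_ms = 100
refresh_time_ns = 100
cycle_time_ns = 64

refresh_ns_per_ms = refreshes_per_ms * refresh_time_ns   # 10^4 ns spent refreshing per 10^6 ns
fraction = refresh_ns_per_ms / 10**6
print(fraction * cycle_time_ns, fraction * 100)          # 0.64 ns per cycle, 1.0 percent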
Question 44 |
2 | |
3 | |
4 | |
5 |
The given sequence is 8, 12, 0, 12, 8.

So there are 4 misses in total.
Question 45 |
13.0 ns | |
12.8 ns | |
12.6 ns | |
12.4 ns |
H1 = 0.8, (1 - H1) = 0.2
H2 = 0.9, (1 - H2) = 0.1
T1 = Access time for level 1 cache = 1ns
T2 = Access time for level 2 cache = 10ns
Hm = Hit rate of main memory = 1
Tm = Access time for main memory = 500ns
Average access time = [(0.8 * 1) + (0.2 * 0.9 * 10) + (0.2)(0.1) * 1 * 500]
= 0.8 + 1.8 + 10
= 12.6ns
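The same figure from a short Python check (not part of the original solution):

h1, t1 = 0.8, 1      # L1 hit rate and access time (ns)
h2, t2 = 0.9, 10     # L2 hit rate and access time (ns)
tm = 500             # main memory access time (ns)

avg = h1 * t1 + (1 - h1) * h2 * t2 + (1 - h1) * (1 - h2) * tm
print(round(avg, 1))   # 12.6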
Question 47 |
(k mod m) of the cache | |
(k mod c) of the cache | |
(k mod 2c) of the cache | |
(k mod 2 cm) of the cache |
Number of sets in the cache = (no. of cache blocks) / (blocks per set)
= 2c / 2
= c
∴ The cache set no. to which block k of main memory maps
= (main memory block no.) mod (total sets in the cache)
= k mod c
Question 48 |
A computer system has a 4K word cache organized in block-set-associative manner with 4 blocks per set, 64 words per block. The number of bits in the SET and WORD fields of the main memory address format is:
15, 40 | |
6, 4 | |
7, 2 | |
4, 6 |
The cache has 4K words and each set holds 4 blocks × 64 words = 256 words, so the no. of sets = 4K / 256 = 16, which needs 4 SET bits.
And,
64 words per block means the WORD field is 6 bits.
So, answer is option (D).
Question 49 |
exploit the temporal locality of reference in a program | |
exploit the spatial locality of reference in a program | |
reduce the miss penalty | |
none of the above |
Spatial locality refers to accesses to data elements stored close together, while temporal locality refers to the reuse of the same data elements within small time intervals; putting more than one word in a cache block exploits spatial locality.
Question 51 |
True | |
False |