Chapter 5: Memory Management
5.1 Memory management
Instructions must be in main memory in order to be
fetched and executed by a processor. Any data
operand referenced by an instruction must also be in
main memory for the instruction to execute
successfully.
main memory is volatile => on a power failure the
information stored in it is lost
disk storage is non-volatile; much system design
effort is concerned with writing sufficient
information safely out to disk to guard against loss
of main memory contents on a crash.
5.2 The memory hierarchy
figure: CPU registers are the fastest but smallest store;
a large proportion of a machine's instructions will
access data in CPU registers
the cache is small and fast compared with main memory
and acts as a buffer between the CPU and main memory;
it contains copies of the most recently used memory
locations
most software exhibits temporal locality of access: it
is likely that the same location will be used again
soon, and if so, the address will be found in cache
=> cache hit
spatial locality: future accesses are often to addresses near to previous ones
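
As a small illustration of locality (a hypothetical C sketch, not taken from the text): summing a two-dimensional array row by row touches consecutive addresses and so benefits from spatial locality, while traversing the same array column by column jumps through memory and causes many more cache misses.

#include <stdio.h>

#define ROWS 1024
#define COLS 1024

static int a[ROWS][COLS];          /* stored row-major: a[i][j] and a[i][j+1] are adjacent */

int main(void)
{
    long sum = 0;

    /* Row-major traversal: consecutive accesses touch neighbouring
     * addresses (spatial locality), so most of them hit in the cache. */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += a[i][j];

    /* Column-major traversal of the same data: each access jumps
     * COLS * sizeof(int) bytes ahead, defeating spatial locality and
     * causing far more cache misses. */
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            sum += a[i][j];

    printf("sum = %ld\n", sum);
    return 0;
}
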
cache coherent multiprocessor systems: cache remains transparent to software layer, but may
present non-uniform memory access times if particular sections of physical memory are associated
with separate processors
memory that is subject to a DMA transfer must not be simultaneously held in any cache.
5.3 The address space of a process
5.3.1 Address binding
address used in an instruction can be anywhere within the virtual address space => it must be bound
to a physical memory address if the operation is to be carried out successfully
translators output code for each module as though it would start from address zero in memory. A
linker can take a sequence of such modules and create a single composite module by adjusting the
(relative) addresses in all but the first module
system structure is extended to
include one way of performing
address translation. The PC will
contain a virtual address which must
be translated to a real address before
an instruction is fetched
1 – an instruction always operates on CPU registers
2 – the code assumes that execution starts at address zero, but that is not always
the case (these are virtual addresses); other processes are also running in
virtual memory
3 – the loader must convert virtual addresses into concrete,
physical addresses (15 => 415)
5.3.2 Static binding
OS will give loader a base address from which to load the module, having identified a region of
physical memory that is sufficiently large to contain the program
should the loader adjust all relative addresses in the module, converting them to absolute physical
addresses before loading it? => static binding
if this were done, then once a program was loaded into memory the addresses in it would be fixed,
and the code or data at these addresses could not be moved in memory without further relocation
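
A minimal sketch of what static binding by a loader could look like, assuming a hypothetical module format in which the linker records which code words hold zero-based addresses; the loader simply adds the base address chosen by the OS (the 15 => 415 example above), after which the image cannot be moved without redoing the work.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical module image: code words plus a list of offsets into the
 * code that hold addresses which were generated relative to zero. */
struct module {
    uint32_t code[6];
    size_t   reloc[2];      /* indices of code words that contain addresses */
    size_t   nreloc;
};

/* Static binding: patch every relative address by adding the base at
 * which the OS decided to load the module. */
static void relocate(struct module *m, uint32_t base)
{
    for (size_t i = 0; i < m->nreloc; i++)
        m->code[m->reloc[i]] += base;
}

int main(void)
{
    /* code word 2 holds relative address 15; after loading at base 400
     * it must refer to physical address 415 (the 15 => 415 example). */
    struct module m = { {0, 0, 15, 0, 52, 0}, {2, 4}, 2 };

    relocate(&m, 400);
    printf("patched addresses: %u, %u\n", m.code[2], m.code[4]);
    return 0;
}
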
5.3.3 Dynamic binding
a given program can run anywhere in physical
memory and can be moved around by the OS; all
of the addresses that it is using are relative to its
own virtual address space and so it is unaware of
the physical locations at which it happens to have
been placed;
it might be possible to protect processes from
each other and the OS from app processes by
whatever mechanism we employ for isolating the
addresses seen by processes
bind the virtual addresses to physical addresses when the instructions are executed
figure: the process sees an environment in which it uses addresses starting from zero. The real
machine is shared by a number of processes, and we see the virtual memory of the process
occupying a portion of real physical memory. The way in which the virtual address space is mapped
to the physical memory must therefore be changed each time the OS switches from one process to
another
5.3.4 Hardware-assisted relocation and protection
physical address of this first location is base
of process. Suppose an instruction is
fetched and decoded and contains address
reference => reference is relative to base,
so the value of the base must be added to it
in order to obtain the correct physical
address. Simplest form of dynamic
relocation hardware is a single base
register and a memory mgmt. unit (MMU)
to perform the translation. The OS must
load the base register as part of setting up the state of a process before passing control to it.
=> this does not provide any protection between processes. It is natural to combine relocation and
protection in a single set of registers by introducing a second register, the limit register, that delimits
the upper bound of the program => figure: the typical instruction execution cycle augmented with
protection checks and address relocation
(base & limit: all work takes place between these bounds; the instruction must lie within them; the
check must be done in hardware)
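
A minimal sketch, in C, of the base/limit check described above (register widths and the error handling are illustrative assumptions): every virtual address is compared against the limit and, if valid, relocated by adding the base; the OS reloads both registers on each context switch.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Relocation/protection registers loaded by the OS on each context switch. */
struct mmu {
    uint32_t base;    /* start of the process's region in physical memory */
    uint32_t limit;   /* size of the region (upper bound of valid virtual addresses) */
};

/* Translate a virtual address; addresses outside [0, limit) raise a
 * protection error instead of being relocated. */
static bool translate(const struct mmu *mmu, uint32_t vaddr, uint32_t *paddr)
{
    if (vaddr >= mmu->limit)
        return false;              /* addressing exception: outside the program */
    *paddr = mmu->base + vaddr;
    return true;
}

int main(void)
{
    struct mmu mmu = { .base = 4000, .limit = 1000 };
    uint32_t pa;

    if (translate(&mmu, 15, &pa))
        printf("virtual 15 -> physical %u\n", pa);       /* 4015 */
    if (!translate(&mmu, 1500, &pa))
        printf("virtual 1500 -> protection exception\n");
    return 0;
}
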
5.4 Segmented virtual memory
(the virtual memory is divided into pieces, so finer-grained protection becomes possible => e.g. data cannot be executed)
figure: 1 – two processes sharing a compiler; 2 – relocation hardware to realize scheme 1
virtual address space of the processes: the most significant bit of an address is now taken as a
segment identifier with 0 indicating the data segment (segment 0) and 1 indicating the code segment
(segment 1). If this bit is 0 then base register 0 is used for relocation, and likewise base register 1 when the bit is 1
the system might support separate read, write and execute rights
(2 – 1 bit is used to identify the segment => 2 segments => 2 * 2^31 = 2^32 = 4 GB; with 4
bits: 16 segments of 256 MB each => the more bits used to identify the segment,
the smaller the segments become, but the total virtual address space stays the same. A segment cannot be
larger than physical memory.)
not all segments of a process need to be in memory. If a process then refers to a segment through a
virtual address that is not yet in physical memory, the process cannot continue =>
the segment must first be loaded in, after which the process becomes running again (addressing exception)
separate areas for code and data, for example,
code, stack and heap. Language systems have
conventions on how the virtual address space is
arranged and a common scheme is shown in the
figure. We see a code segment, which will not
grow in size, followed by a data area which may
well grow. At the top of the virtual address space
we see the stack growing downwards in memory.
The segment is the unit of protection and sharing,
and the more we have, the more flexible we can
be.
virtual address is split into a segment
number and a byte offset within the
segment.
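
The translation step can be sketched as follows (a hypothetical layout assuming 4 segment bits, i.e. 16 segments as in the note above; the rights encoding is an illustrative assumption): the segment number indexes a per-process segment table whose entry supplies base, limit, presence and access rights.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum { RIGHT_R = 1, RIGHT_W = 2, RIGHT_X = 4 };

/* One entry per segment: where it sits in physical memory, how big it
 * is, and what the process may do with it. */
struct seg_entry {
    uint32_t base;
    uint32_t limit;
    uint8_t  rights;
    bool     present;     /* false => segment not in main memory (addressing exception) */
};

/* Virtual address layout assumed here: top 4 bits select one of 16
 * segments, the remaining 28 bits are the byte offset in the segment. */
static bool translate(const struct seg_entry *table, uint32_t vaddr,
                      uint8_t access, uint32_t *paddr)
{
    uint32_t seg = vaddr >> 28;
    uint32_t off = vaddr & 0x0FFFFFFF;
    const struct seg_entry *e = &table[seg];

    if (!e->present)                    return false;   /* fetch segment from disk first */
    if (off >= e->limit)                return false;   /* beyond the end of the segment */
    if ((e->rights & access) != access) return false;   /* e.g. trying to execute data   */

    *paddr = e->base + off;
    return true;
}

int main(void)
{
    struct seg_entry table[16] = {
        [0] = { .base = 100000, .limit = 8192, .rights = RIGHT_R | RIGHT_W, .present = true }, /* data */
        [1] = { .base = 200000, .limit = 4096, .rights = RIGHT_R | RIGHT_X, .present = true }, /* code */
    };
    uint32_t pa;

    if (translate(table, (1u << 28) | 0x10, RIGHT_X, &pa))
        printf("fetch instruction at physical %u\n", pa);
    if (!translate(table, 0x10, RIGHT_X, &pa))
        printf("executing the data segment is refused\n");
    return 0;
}
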
Advantages of segmented virtual memory:
* the virtual address space of a process is divided into logically distinct units which correspond to
the constituent parts of a process
* segments are the natural units of access control, that is, the process may have different
access rights for different segments
* segments are the natural units for sharing code and data objects with other processes
Disadvantages:
* inconvenient for the OS to manage storage allocation for variable-sized segments
* after the system has been running for a while the free memory can become fragmented =>
external fragmentation – it might be that there is no single area large enough to hold some
segment (see the sketch after this list)
* limited segment size: restricted by physical memory and by the number of bits available for the within-segment offset
* hardware support needed
* swapping may result in severe performance problems
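
To make the external fragmentation point concrete, here is a toy first-fit allocator over a free-hole list (the sizes are entirely hypothetical): the total free memory is larger than the request, yet no single hole can hold the new segment.

#include <stdio.h>

/* Toy model of variable-sized segment allocation: memory is described
 * by a list of free holes (start, size). */
struct hole { int start, size; };

/* First fit: take the first hole big enough for the request. */
static int alloc_first_fit(struct hole *holes, int nholes, int size)
{
    for (int i = 0; i < nholes; i++) {
        if (holes[i].size >= size) {
            int start = holes[i].start;
            holes[i].start += size;
            holes[i].size  -= size;
            return start;
        }
    }
    return -1;   /* no single hole is large enough */
}

int main(void)
{
    /* After the system has run for a while, the free store consists of
     * scattered holes: 40 units free in total, but no hole bigger than 20. */
    struct hole holes[] = { {10, 15}, {40, 20}, {90, 5} };

    if (alloc_first_fit(holes, 3, 30) < 0)
        printf("30-unit segment refused despite 40 free units: external fragmentation\n");
    return 0;
}
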
5.5 Paged virtual memory
paging is a solution to fragmentation
blocks of a fixed size are used for
memory allocation so that if there is
any free store, it is of the right size.
Memory is divided into page frames
and the user’s program is divided into
pages of the same size.
Typically each page is a reasonably
small size, such as 4 KB. Figure: the
process has still organized its virtual
address space so that it uses a
contiguous block of addresses starting
at 0. All the pages in this example are shown in main memory.
a portion of the disk storage can be used as an extension to main memory and the pages of a
process may be in main memory and/or in this backing store. The OS must manage the two levels
of storage and the transfer of pages between them. It must keep a page table for each process to
record this and other information.
5.5.1 Address translation
figure: virtual to physical address
translation by an MMU. Before a page
can be addressed it must have an entry
set up by the OS in the hardware table
shown. This table, the translation
lookaside buffer (TLB) is searched
associatively as part of every address
reference. The virtual page number is
extracted from the virtual address and
an associative lookup is initiated. If an address reference is made to a page which is present in
main memory but does not have an entry in the TLB: address translation fails. Same if an address
reference is made to a page which is not present in main memory => no match will be found in
the table, and the addressing hardware will raise an exception: page fault
no need for an entire program to be loaded in main memory. Demand paging: When a new page
is addressed, to fetch an instruction or access a data operand, it is brought into main memory on
demand.
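
A minimal sketch of the translation path just described, assuming 4 KB pages (20-bit page number, 12-bit offset) and a 64-entry TLB; for simplicity the OS's TLB-refill and page-fault handling are folded into one function, and the replacement choice for TLB slots is arbitrary.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                 /* 4 KB pages: 12-bit offset, 20-bit page number */
#define TLB_SIZE   64

struct tlb_entry { uint32_t vpn; uint32_t frame; bool valid; };
struct pte       { uint32_t frame; bool present; };

static struct tlb_entry tlb[TLB_SIZE];
static struct pte page_table[1 << 20];     /* one entry per virtual page of the process */

/* Translate one virtual address. Returns false on a page fault, i.e. the
 * page is not in main memory and must be brought in on demand. */
static bool translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t off = vaddr & ((1u << PAGE_SHIFT) - 1);

    /* 1. Associative search of the TLB (here just a loop over 64 slots). */
    for (int i = 0; i < TLB_SIZE; i++)
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *paddr = (tlb[i].frame << PAGE_SHIFT) | off;
            return true;                       /* TLB hit */
        }

    /* 2. TLB miss: consult the page table kept by the OS. */
    if (!page_table[vpn].present)
        return false;                          /* page fault: load the page from disk */

    /* 3. Refill one TLB slot (trivial replacement: slot vpn % TLB_SIZE). */
    tlb[vpn % TLB_SIZE] = (struct tlb_entry){ vpn, page_table[vpn].frame, true };
    *paddr = (page_table[vpn].frame << PAGE_SHIFT) | off;
    return true;
}

int main(void)
{
    page_table[3] = (struct pte){ .frame = 97, .present = true };
    uint32_t pa;

    printf("0x3004 -> %s\n", translate(0x3004, &pa) ? "mapped" : "page fault");
    printf("0x8000 -> %s\n", translate(0x8000, &pa) ? "mapped" : "page fault");
    return 0;
}
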
5.5.2 Copy-on-write paging
it is often the case that a copy of a writeable data area of one process is required by another.
the data is initially the same, but any updates made by a process are not visible to the other
processes that have copies of the data. With straightforward copying, new page table entries are set
up for these pages for the process receiving the copy; the two processes then have separate physical
copies of the pages and separate page table entries for them.
copy-on-write: the process which is to receive the copy of the data is instead given new page table
entries which point to the original physical pages. Only when either process writes to a page of the
data area is a new copy actually made.
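
A sketch of how copy-on-write could be handled (frame allocation, reference counting and the fault entry point are simplified assumptions): shared entries start read-only, and the first write fault triggers the actual copy.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NFRAMES   8

static uint8_t frames[NFRAMES][PAGE_SIZE];
static int     refcount[NFRAMES];            /* how many page table entries point at a frame */
static int     next_free = 2;                /* frames 0 and 1 are in use in this example */

struct pte { int frame; bool writable; };

/* Called when a process writes to a page whose entry is read-only
 * because it still shares the frame with another process. */
static void cow_fault(struct pte *pte)
{
    if (refcount[pte->frame] == 1) {
        pte->writable = true;                /* last user: just make it writable again */
        return;
    }
    int copy = next_free++;                  /* allocate a fresh frame ...              */
    memcpy(frames[copy], frames[pte->frame], PAGE_SIZE);   /* ... and copy only now     */
    refcount[pte->frame]--;
    refcount[copy] = 1;
    pte->frame = copy;
    pte->writable = true;
}

int main(void)
{
    /* Both processes' entries point at the same physical frame, read-only. */
    refcount[0] = 2;
    struct pte parent = { .frame = 0, .writable = false };
    struct pte child  = { .frame = 0, .writable = false };

    cow_fault(&child);                       /* child writes: a private copy is made */
    printf("parent frame %d, child frame %d\n", parent.frame, child.frame);
    return 0;
}
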
Three main disadvantages of paging:
* page size is a choice made by the CPU or OS designer: it may not fit the size of program data
structures => this can lead to internal fragmentation
* there may be no correspondence between page protection settings and app data structures. If
two processes are to share a data structure then they must do so at the level of complete pages
* supporting per-process page tables is likely to require more storage space for the OS data
structures
(paging: the principle is similar to segmentation, but the blocks have a fixed size
=> both the virtual address space and physical memory are divided so that a piece of the virtual
address space can be loaded into a page frame of main memory => no external fragmentation
BUT on average the last page is half empty => internal fragmentation
12-bit offset: 12 bits to number bytes within a page; 20 bits left to number the pages
=> about 1 million pages (per process)
how can this be supported in hardware? new technology was needed
page fault: a reference to a page that is not yet in main memory (only pages that are actually needed
are loaded)
TLB: a kind of cache for recently used page numbers => it can be searched associatively
(content addressable)
the offset remains unchanged => it is carried over directly from the virtual to the physical address
figure: translation must happen in hardware; if it were done in software, a program would have to
run for every instruction, but such a program is itself a series of instructions that reference addresses
=> as long as this could not be done in hardware, paging was not possible
the notion of a limit register no longer plays a role because all pages have the same size (the offset
can never exceed 4 KB)
64-bit addresses: 2^52 pages remain: the table becomes several petabytes in size => it does not even fit in main memory
LRU: least recently used: a page that has not been used recently is unlikely to be needed again soon => often
approximated by an NUR (not used recently) policy, because true LRU gives too much overhead => every reference
to a page changes the ordering of the pages by when they were last used => an entire list would have
to be kept and constantly updated => NUR is more practical)
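
One common way to realise an NUR-style policy is the clock (second-chance) algorithm; the sketch below assumes a single referenced bit per resident page, set on every access (by the MMU in a real system), and a hand that clears bits until it finds a page that has not been used recently.

#include <stdbool.h>
#include <stdio.h>

#define NFRAMES 4

/* One record per resident page: the 'referenced' bit is the only history
 * kept, set (by hardware in a real MMU) on every access to the page. */
struct frame { int page; bool referenced; };

static struct frame frames[NFRAMES] = {
    {10, true}, {11, false}, {12, true}, {13, true}
};
static int hand = 0;                 /* clock hand sweeps over the frames */

/* Pick a victim that has not been used recently: skip pages whose bit is
 * set, clearing it as we pass, and evict the first page whose bit is
 * already clear. */
static int choose_victim(void)
{
    for (;;) {
        if (!frames[hand].referenced) {
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        frames[hand].referenced = false;    /* give the page a second chance */
        hand = (hand + 1) % NFRAMES;
    }
}

int main(void)
{
    int v = choose_victim();
    printf("evict page %d from frame %d\n", frames[v].page, v);   /* page 11 */
    return 0;
}
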
5.6 Paged segmentation
each segment of a process is divided into pages. This means that the virtual address space of a
process is logically divided into segments, but it is no longer necessary for each segment to be
contiguous in physical memory.
Figure 1: idea of a simple case of
two segments per process
Figure 2: how a virtual address
might be interpreted when both
segmentation and paging hardware are used.
(a combination of both: the segments are paged (the logical structure is again divided into blocks)
no more base and limit registers
one register per processor
segments are no longer limited by the size of main memory
thanks to paging, a whole segment does not have to fit in main memory
the disadvantage is the large number of tables)
5.7 Memory management data structures
5.7.1 Multi-level page tables
a non-segmented 32-bit virtual address space is
mapped to a 32-bit physical address using a two-level page table and pages of size 4 KB. The
physical address of the first-level page table is
given by a page table base register (PTBR). The
first-level mapping is then made using the most
significant 10 bits of the virtual address as an
index into that table, yielding the physical
address of the second-level table to use. The
next 10 bits index that table, yielding the physical frame number which is combined with the last 12
bits. If there are no mappings within a particular second-level table then that table can be removed
and an ‘invalid’ bit set at the top level.
(each process has a hierarchical table with levels; each level has about 1000 entries (10 bits
per level) => here there are 2 levels)
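
A sketch of the two-level lookup described above, using the 10/10/12 split; representing an absent second-level table by a null pointer plays the role of the 'invalid' bit at the top level (the data structures are illustrative, not a real MMU format).

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Second-level table: 1024 frame numbers (or invalid). */
struct l2_table { uint32_t frame[1024]; bool valid[1024]; };

/* First-level table: 1024 pointers to second-level tables; a NULL pointer
 * plays the role of the 'invalid' bit, so empty tables need no storage. */
struct l1_table { struct l2_table *l2[1024]; };

static struct l1_table *ptbr;        /* page table base register of the current process */

static bool walk(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t i1  = vaddr >> 22;              /* most significant 10 bits   */
    uint32_t i2  = (vaddr >> 12) & 0x3FF;    /* next 10 bits               */
    uint32_t off = vaddr & 0xFFF;            /* 12-bit offset within page  */

    struct l2_table *l2 = ptbr->l2[i1];
    if (l2 == NULL || !l2->valid[i2])
        return false;                        /* no mapping: page fault */

    *paddr = (l2->frame[i2] << 12) | off;
    return true;
}

int main(void)
{
    ptbr = calloc(1, sizeof *ptbr);
    ptbr->l2[1] = calloc(1, sizeof *ptbr->l2[1]);
    ptbr->l2[1]->frame[2] = 500;             /* map virtual page (1,2) to frame 500 */
    ptbr->l2[1]->valid[2] = true;

    uint32_t pa;
    uint32_t va = (1u << 22) | (2u << 12) | 0x34;
    if (walk(va, &pa))
        printf("virtual 0x%08x -> physical 0x%08x\n", va, pa);
    return 0;
}
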
5.8 An example of an MMU
1 – basic design of the MIPS R2000 chip: as well as a RISC processor there is also a system control
coprocessor on the chip (TLB: the references it holds have just been used => the chance of a hit is high
(a level-1 cache of 64 entries need only cover 256 KB => 64 * 4 KB); in modern architectures two TLBs are
used: one for instructions and one for data)
2 – the MMU contains a simple TLB with space for 64 entries. Each non-empty entry in the TLB
indicates a page in virtual memory of a process and its corresponding location in physical memory
the MMU contains 4 registers: entry-hi, entry-lo, index and random. The OS uses them to insert or
replace entries in the TLB; the addressing hardware uses them to perform address translation.
32-bit address is divided into a 20-bit page number and a 12-bit offset. Every time a virtual address is
to be translated, the process number and virtual page number are entered in the entry-hi register.
Matched TLB entry is then read and the physical address of the base of that page is then available in
entry-lo.
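
The associative search can be modelled roughly as follows (the field widths and layout are illustrative and do not reproduce the real R2000 register encoding): the process number and virtual page number written to entry-hi are matched against all 64 entries, and on a hit the frame number comes back as the entry-lo part.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TLB_ENTRIES 64

/* Each TLB entry pairs an entry-hi part (process number + virtual page
 * number) with an entry-lo part (physical frame number). The exact bit
 * layout of the real R2000 registers is not reproduced here. */
struct tlb_entry {
    uint32_t asid;       /* process number             */
    uint32_t vpn;        /* 20-bit virtual page number */
    uint32_t pfn;        /* physical frame number      */
    bool     valid;
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Associative match on the (process, page) pair written to entry-hi;
 * on a hit the physical base of the page comes back as entry-lo. */
static bool tlb_probe(uint32_t asid, uint32_t vpn, uint32_t *pfn)
{
    for (int i = 0; i < TLB_ENTRIES; i++)
        if (tlb[i].valid && tlb[i].asid == asid && tlb[i].vpn == vpn) {
            *pfn = tlb[i].pfn;
            return true;
        }
    return false;        /* miss: the OS must insert a mapping and retry */
}

int main(void)
{
    tlb[7] = (struct tlb_entry){ .asid = 3, .vpn = 0x00123, .pfn = 0x00456, .valid = true };

    uint32_t vaddr = (0x00123 << 12) | 0xABC, pfn;
    if (tlb_probe(3, vaddr >> 12, &pfn))
        printf("physical 0x%08x\n", (pfn << 12) | (vaddr & 0xFFF));
    return 0;
}
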
arrangement of virtual address space:
the OS occupies half of every process address space. OS
code is in a kernel segment which is cached but not
mapped; device registers and buffers are in a kernel
segment which is neither cached nor mapped. OS data,
such as process page tables and data areas, is both
cached and mapped, as is the user segment.
(the corresponding address space is
divided in hardware into 4 segments
=> if an address starts with 1: kernel (OS); with 0: user
resident: cannot be swapped out
try to cache as much as possible => available sooner
– typically as much as possible of the user segment
kernel segment 1 cannot be cached:
device registers change independently of what the CPU is doing => caching them makes no sense, because they are
changed continuously by the devices => the CPU would then see stale values in the cache
there are devices that do not need virtual
memory
real-time systems have problems with
virtual memory (the two often conflict) => if
paging is used, for example, the response time is no
longer deterministic, and that is exactly what it
must be in a real-time system
main memory is shared => e.g. 100
processes of 4 GB each all want to use main
memory at the same time; moreover, we are moving
towards processes with a virtual memory larger than 4
GB => such a process cannot even fit entirely in main
memory;
the disk is the bottleneck => there are better swapping devices (the larger main memory is, the less
swapping is needed – swapping can be reduced but not avoided)
faster memory: solid state (SSD), flash (RAM ~100 ns, disk ~10 ms, solid state in between RAM
& disk, ~1 ms))