OS复习

期末复习前的知识整理，个人使用。

01 Introduction

OS是资源分配者，是控制程序。总是在运行的程序指kernel。
bootstrap program：在power-up或者reboot的时候加载，是固件。初始化系统，load kernel开始执行。
CPU和device controller通过bus连到shared memory；竞争memory cycle
IO可以和CPU并发执行，每个device controller有一个local buffer。CPU通过buffer和memory互传数据；IO的数据从device到buffer。device controller通过interrupt通知CPU它完成了操作。
interrupt通过interrupt vector传输到相应的interrupt服务程序（service routine）。interrupt vector含有所有服务程序的地址。当别的interrupt正在处理的时候，incoming interrupts会被disabled。Interrupt architecture must save the address of the interrupted instruction. Incoming interrupts are *disabled* while another interrupt is being processed to prevent a lost interrupt.
trap是由error或user request引起的（system call）软件interrupt。
操作系统的运行是一种中断驱动机制。
CPU通过存储regsiter和program counter（指向要执行的指令地址）来保存状态；interrupt来源有轮询（polling，看看有没有interrupt）或者vectored interrupt system（设备自己通知有interrupt），可以以此推断interrupt类型；每种interrupt应该怎么处理的代码是分开的code片段。
IO有同步IO和异步IO。
- 同步IO：IO开始之后，等到IO完成，控制才会返回到用户程序；wait inst会idle掉CPU，直到下一个interrupt；同一时间最多有一个IO激活，不会有很多IO同步运行。
异步IO：不需要等到IO完成，控制就返回用户程序，不用等待。System call给OS请求，允许user等待IO完成；device-status table存储了每个IO device的type，addr，state；OS intexes into这个table确定设备状态，修改以包括interrupt。
DMA：direct memory access。给高速IO设备使用，以memory速度传输数据；从buffer直接传输到主存，不需要CPU参与。每个block一个interrupt而不是每byte一个。
存储结构：主存，CPU能直接访问的；二级存储，提供非易失容量；磁盘。
Multitasking：recent value使用要谨慎；

Multiprocessor：提供cache coherence，使得所有的CPU都有recent value在cache里
Multiprogramming: 不让CPU空闲，每次载入一组process一起做，提高CPU利用率，比如有一个IO要wait的时候就切换到另一个。

Timesharing（multitasking）：是Multiprogramming的自然延申，CPU快速切换任务，用户的交互更好。每一个process都会给user回复。
Dual-mode：OS保护模式；User mode，kernel mode（mode bit区分）。user mode到kernel mode：call system call ->trap, mode bit = 0 -> execute system call -> return, mode bit = 1 -> return from system call。
VM
SYSTEM BOOT

02 os structures

System calls：OS提供的服务接口，用户的调用一般是通过更高层面的API
- system call number；存有一个table
- 类型：process control，file manage，device manage，info maintenance，communication
OS design：separate policy from mechanism
Monolithic structure: everything is done in kernel!：UNIX

Microkernel System Structure: 尽量把kernel的移到user的空间。好处：比较稳定。坏处：performance overhead比较大。

Solaris Modular Approach: 彼此调用的时候不需要message passing，所以会比microkernel性能更好。

03 Processes

进程：program in execution。包括text section（code），program counter，stack（函数参数，local vars，return addr），data section（全局变量），heap（动态分配内存）
进程状态：new，running，waiting，ready，terminated
PCB：进程状态，PC，CPU reg content，CPU调度信息（优先级，指向调度队列的指针），memory管理信息（比如指向mem开始和结束的指针），accounting信息，IO状态信息。
进程调度队列：Job queue（可以找到所有的process），Ready queue（ready 状态的process，相当于CPU的device queue），device queues。进程在queue之间的转换由kernel控制。
scheduler：a piece of process。long-term：选出加载到内存的，现在一般不需要了。short-term：选出在CPU跑的process。

medium term scheduler：当内存挤不下那么多process。借助一个磁盘设备存储进程。这个地方称为swapped-out area.
process可以分成IO bound的和CPU bound的。
Context Switch: 保存旧的进程信息，载入新的；时间比较短，一般是milliseconds。加速：Register Ring，每个扇形一组reg，被一个process所使用。context switch的时候，只要把指针指到另一个reg组就可以了。
Process Creation: UNIX为例，fork()创建新进程，exec()把mem space替换成一个新的program，执行一个新的程序。调用folk之后，child会复制parent的内存映像，parent得到的return value是child的id；child得到的return value是0。所以可以通过这个value判断是parent process还是child process.
Process Termination：exit() ask the OS to delete it. 非正常退出abort：父进程终止掉子进程，或者用abort信号。如果parent terminate，child可能会全部terminate，也有可能被赋给init process(1)，此时这个进程被称为孤儿进程。给parent加一个wait，就会等待child进行完毕之后再退出。
Cooperating Process: 可以影响其他进程的执行，或者被影响。有利于信息共享，计算加速（多CPU）等。
Producer-Consumer Problem：
- unbounded-buffer: the buffer space is not limited. producer不需要等待，consumer需要等待数据。
  
  bounded-buffer: fixed buffer size. 两个可能都需要等待。
- shared mem solution：in == out是空的，in+1（要取模）==out是满的。个buffer space当成一个ring来用，利用一个in指针和out指针。避免size只有size-1的话，加一个counter记一下现在有多少个就好了。但是counter是共享变量，会带来很多麻烦。后面会讨论。
IPC（Interprocess Communication）：Two models for IPC: message passing and shared memory。IPC提供send和receive两种操作，如果要在P和Q之间communicate，首先建立communication link，然后使用send/receive。link的实现有physical的（比如共享memory，bus），也有逻辑性的。
- Shared Memory不需要经过kernel（建立shared memory的时候还是需要的），更加高效，但也需要更加小心。
- direct communication：直接通讯，需要严格指明是谁，send(P, message)指明目标, receive(Q, message)指明从谁发过来。
- Indirect communication: 通过mailboxes (port)进行进程之间的间接通讯。只有在两个进程共享同一个mailbox的时候才能建立link。每个mailbox有唯一的ID。间接通讯不需要指定过程的ID，每个link可以给很多进程用，比直接通讯更加常用。send(A, message) and receive(A, message).
Message passing有阻塞的和非阻塞的。阻塞的是同步的：Blocking send的sender被block起来，直到信息被收到；Blocking receive的receiver被block起来，直到信息可用。非阻塞是异步的，发出去就发出去了，receive也不管是不是available。

04 Threads

同一进程的不同线程之间共享code，data，files，但动态分配的reg和stack不同。
多线程有利于实现交互、资源共享，相比建立进程更节约资源，（多核时）提高了并行能力。
User Threads: thread libraries. lib支持，如果用的是旧系统，kernel没有多线程，可以用library在上层提供多线程（POSIX Pthread，Win32 thread，Java thread）。Kernel Threads: kernel已经提供了多线程（几乎现在所有的OS都有）。
Multithreading models:
- Many to one：本质上没有用到kernel thread（不需要到kernel mode，也不用系统参与），所以一个“kernel thread”对应多个user thread。好处是线程管理更高效，不需要系统参与。缺点是如果有一个线程做了system call的话，整个进程就block掉了；同时对于多核OS来说它不能并行运行。对kernel来说，它的多线程是不可见的，一次block会block掉整个process而不是单个的线程，相当于在retrive data被block的时候，display等等其他功能都会一并block。
- one to one：每一个user thread对应一个kernel thread。好处是并行解决了，也不会在某个线程做system call的时候，整个进程都block掉，唯一的坏处就是每创建一个user thread都需要一个kernel thread。
- many to many, to level, …
fork()的实现有的会dup所有的线程，有的只dup一个；exec()会替换整个进程，调用之后，原来的进程空间就不存在了。
Thread Cancellation（线程完成之前终止）。分为Asynchronous终止（立即终止）may be cancelled in the middle of updating data shared with other threads，may not free a system-wide resource 和Deferred中止（周期性通过flag检查是否需要终止）。at so called cancellation points in Pthread
UNIX用signal来通知一个进程特殊事件的发生。有一个signal handler。信号发给哪些线程取决于信号类型。
Thread pool：如果存在短时间内创建很多个thread的需求：在server启动的时候先创建好大量的thread，存在一个pool(array)里面。当有需求的时候，从pool里面拿一个空闲的，用完释放，这样就不会影响后面响应请求的速度（否则就需要不停的createThread）avoid creation and termination overhead, so that it is faster to service a request；put bound on number of threads, thus limit CPU and memory usage
linux里面，把线程叫tasks。通过clone()创建。允许子task和父task（process）共享地址空间。
同步信号和异步信号？synchronous signals - 1. an illegal memory access, a division by zero 2. delivered to the same process that cause the signal

asynchronous signals
1. a user keystroke (Ctrl-C), a timer expiration
2. typically sent to another process

05 CPU Scheduling

通过multiprogramming提高CPU利用率。
调整CPU burst & I/O burst: 一个CPU一个I/O；每个CPU burst末尾有一个I/O operation。
CPU scheduler会从ready queue上面选择一个进程，给它分配CPU。这里的是short time scheduler。
CPU scheduler可能会发生在：
- 从running state到waiting state（比如遇到I/O）
- 从running state到ready state（interrupt）
- 从waiting state到ready state（I/O完成了，等等）
- terminate
- 有的系统只做第1个和第4个，称为nonpreemptive（非抢占式OS）。只有process主动释放出来才会执行schedule，所以如果有一个程序一直在占用资源，就会一直占用着。
- interrupt就是进入ready state。比如timer可以在某个程序跑太久的时候给它执行interrupt。
Dispatcher: 把CPU控制给scheduler选定的进程。其实是context switch的过程。Dispatch latency: Dispatcher 在停止一个进程并开始另一个的花费时间
- switching context
- switch to user mode
- jump to proper location in user program to restart the program
Scheduling Criteria调度指标：
- CPU utilization (CPU 利用率)，我们希望它尽可能高。
- Throughput (吞吐率) 单位时间完成进程的个数。
- Turnaround time (周转时间) ，执行时间，即提交任务到任务完成的时间。
- Waiting time (等待时间) 在ready queue里的时间，不是waiting state的时间。
- Response time (响应时间) submit request到第一次有response的时间。
优化指标：增大CPU利用率和吞吐率，减少各项时间
调度算法：画gantt chart，计算waiting time和awt。
- FCFS；FIFO?
- SJF：短的先去。有非抢占式和非抢占式两种。抢占式：如果来了一个时间比当前剩下时间少的，就换。理论最优，问题在于，我们无法知道下一个CPU Burst的length。有一个平均的算法。
- Priority：根据Priority Number来判断，先做Priority最高的进程（small integer => highest priority）同样有非抢占式和非抢占式两种。Problem：Starvation，低优先级可能一直不会被执行。Solution：Aging，随着时间提高优先级
- RR round robin：每个进程都有一小段CPU time (time quantum)。假定ready queue里面有n个进程，假定time quantum有q time unit，那么没有进程的waiting time超过 (n-1)q.
- multilevel queue：ready queue拆开，foreground - interactive. 用RR策略。background - batch. 用FCFS策略。time slice。
- multilevel feedback queue：进程可以移动到别的queue。例子：有三个queue：
  
  $Q_0$ - RR with time quantum = 8 ms
  
  $Q_1$ - RR with time quantum = 16ms
  
  $Q_2$ - FCFS
  
  对于一个进程，首先进入 $Q_0$ ，如果还没有执行完，进入 $Q_1$ ，如果还没有结束进入 $Q_3$ 。这个算法是比较合理的，我们可以设想到，一个CPU burst比较长的进程，一般也不需要太好的交互性。
多处理器调度：
- 非对称多处理：有一个processor负责管理system data structure，负责调度，别的processor只做执行任务
- 对称多处理（SMP）：每个Processor自己解决调度问题。比较常见
Real-Time Scheduling：能够保证一个task可以在real time指定时间内完成。
- Hard real-time systems: 在指定的时间里必须完成。
- Soft real-time computing: 把priority给这些没完成的。
Thread Scheduling:
- Local Scheduling: thread library (user mode) decides。many to one只能做这个。
- Global Scheduling: kernel decides, 会把管理的线程一起排。

06 Process Synchronization

Process Synchronization: Concurrent access to shared data may result in data inconsistency. 保证数据的一致性需要保证按照正确顺序来执行进程。
producer-consumer problem: 我们有一个空间是用不到的。所以可以添加一个计数器记录item数量，这样可以利用buffer内的所有空间。这个计数器是共享变量，可能造成race condition，count++实际上是一个三条汇编指令构成的过程，也就有可能被interrupt。于是就有可能出现类似count++和count–的指令交织的过程。
critical section：entry section - critical section - exit section - remainder section
解决需要满足：
- Mutual Exclusion: 互斥性。如果有一个process在critical section里执行，则其他processes都不能在它们的critical sections里执行。
- Progress. 如果没有process在critical section，但是已经有process请求进入，那一定要执行对进入critical section的progress的选择。
- Bounded Waiting. 如果有process请求进入critical section，那它等待其他process进去的次数得是有上界的。假设每个process执行的速度非0，对系统里的N个process，我们没有对其相对速度的假设。
Peterson’s solution：2进程solution，假定LOAD和STORE指令是原子性的，不能被interrupted。share 2 var: int turn（轮到谁进CS）, boolean flag[2]（表示一个进程想进CS）。bound=1
Synchronization Hardware：对于单处理器的OS来说，可以在entry section，disable interrupt；在exit section，enable interrupt。但这样是相对低效的，对多处理器来说就更糟了。所以提供了一些原子的硬件指令（test and set，swap……）。这些指令不能被打断。
- TestAndSet Instruction: set true and return val
- Swap Instruction: swap the 2 values
Semaphore: Semaphore S是一个integer变量。wait()或signal()可以修改S的值。这两个操作是原子性的，不能被打断。wait: 等到S为正的时候执行decrement。signal: 执行increment。
Counting semaphore: 没有限制范围。
binary semaphore: S只能取0或1。也被称为mutex locks。
有一个问题：wait()里面的while过程是busy waiting，这是我们不希望的，这些指令应该都是atomic的。而且我们要保证没有进程的wait()和signal()不能同时执行，它本身就是一个critical section的内容。
without busy waiting：重新实现wait和signal。对于每个Semaphore，相应的有一个waiting queue。提供两个操作：block（把process放到waiting queue；wakeup：把一个process从wait queue放到ready queue）。
- wait：value–，如果小于0，block
- signal：value++，如果小于等于0，wakeup一个进程。
deadlock：互相等。
starvation：一直被block。
monitor：Only one process may be active within the monitor at a time；the other processes may be sleeping within the monitor；high-level abstraction

07 Deadlocks

各自享有一定资源的blocked进程，等待获得对方的资源。（waiting for each other）
System model：Resource type: $R_1, \dots, R_m$ ; 每种有 $W_i$ 个实例。

Protocol: request - use - release
Deadlock Characterization，死锁条件：
- Mutual exclusion: 该资源同一时间仅有一个进程可以使用。e.g. single-core CPU, a byte in Memory, printer.
- Hold & wait: 进程自己拥有一定资源，并且在等待获得其他进程的资源。
- No preemption: 无法抢占其他进程的资源，只能享有者自行放弃。
- Circular wait: 存在成环等待链。
死锁预防
- 对于request for resource做出一定限制。
  - Hold & Wait: 资源只能在进程启动之前分配，或者在进程没有资源的时候分配。（但是效率太低了）
  - No Preemption: 允许Preemption? mmm
  - Circular Wait: 给 Resource type 编号，每个进程从小到大发出request。
死锁避免
- 需要系统预先知道一些信息。（priori information）：进程最多使用多少种 Resource Type；动态检测算法，保证不会有deadlock；Resource-allocation state
- 系统处于safe state：存在一个包含所有进程的序列，每个进程只需要请求它前面的进程的资源或当前可用的资源。按照这个序列进程可以顺利执行完毕。
  - safe state -> no deadlock
  - unsafe state -> possibility of deadlock
  - avoidance: 保证系统永远不会处于unsafe的状态。
- 分配算法：
  - Single instance: 用图
    - Claim edge $P_i \rightarrow R_j$ : 以后可能会请求 $R_j$ 。（虚线）
    - Request edge $P_i \rightarrow R_j$ ：确实请求了 $R_j$ 。（实线）
    - Assignment edge $R_j \rightarrow P_i$ : 现在有 $R_j$ 。（实线）
    这个图如果出现了环路，就不太安全，要防止某个虚线变成实线然后成环。所以可以在分配的时候（request => assignment）防止成环，拒绝对应的request。
  - Multiple instance: Banker’s Algorithm
    - 前提：Multiple instances; 每个进程事先声明最多用多少；request资源之后，进程可能需要等待；进程得到所有它需要的资源之后必须有限时间内释放。
    - 必要的数据结构：n个进程，m种资源。
      - Avaliable[] of length m. Available[j] = k，则 $R_j$ 有 $k$ 个可用instance。
      - Max of n * m. Max[i][j] = k，则 $P_i$ 最多申请 $k$ 个 $R_j$ 类型的资源。
      - Allocation of n * m. Allocation[i][j] = k，则 $P_i$ 已经被分配 $k$ 个 $R_j$ 类型的资源。
      - Need of n * m. Max[i][j] = k，则 $P_i$ 还需要 $k$ 个 $R_j$ 类型的资源才能完成。
      - Need[i][j] = Max[i][j] - Allocation[i][j]。
    - Safety Algorithm: 找到上面safe state说到的安全序列。第一层算法。
      
      Work of length m（表示资源可用的数量，init available）, Finish of length n（表示进程是否可以完成，init false）
      1. 找到 $i$ 满足 Finish[i] = false, Need[i] <= Work。如果不存在，去第三步。
      2. Work = Work + Allocation[i]，Finish[i] = true. 去第一步。
      3. 如果对所有的 $i$ ，Finish[i] = true，那么系统处于安全状态。
    - Resource-Request Algorithm: 第二层算法。
      
      Request[i][j] = k，则 $P_i$ 想要k个 $R_j$ 。
      1. 如果Request[i] <= Need[i]，去下一步。否则报错。（合法性检查）
      2. 如果Request[i] <= Available[i]，说明当前可满足，去下一步。否则 $P_i$ 需要等待，因为资源尚未满足。
      3. 假定现在把资源分配给 $P_i$ 了，修改状态如下：
        
        Available = Available - Request;
        Allocation[i] = Allocation[i] + Request[i];
        Need[i] = Need[i] - Request[i];
        
        调用safety algorithm，检查新的状态是否安全。如果安全，就把资源分配给 $P_i$ ；否则， $P_i$ 还是要等待，状态恢复到原样。
死锁检测：允许系统进入死锁状态，通过算法检测死锁，并建立Recovery scheme。
- Single instance: wait-for graph
  
  $P_i => P_j$ 表示 $P_i$ 在等 $P_j$ 。
  - 周期性调用找环路的算法，找到了就说明有死锁。
  - 找环算法复杂度是 $n^2$ ， $n$ 是顶点数。
- Multiple instance: 类似Banker algorithm。
  
  仍然有Available，Allocation向量，Request向量表示现在在向系统请求多少资源。
  1. 找到一个 $i$ 满足 Finish[i] = false，Request[i] <= Work。找不到去第三步。
  2. Work = Work + Allocation[i], Finish[i] = true。去第一步。
  3. 如果存在某个Finish[i]还是false，说明系统在死锁状态， $P_i$ 被锁住了。
  与前面的区别：只要找到了一个能够满足要求的进程就把它满足。
从死锁恢复：进程回滚：每个disjoint cycle需要回滚一个。

08 Main Memory

CPU可以直接访问的存储。L1 Cache - L2 Cache - Main Memory 存储层级结构。
Base & limit regsters: Base register 存储地址空间的起始地址；limit regster 存储地址空间的大小。这两个寄存器定义了逻辑地址空间。
Binding：编译时刻，装入时刻，执行时刻
前两个方法：逻辑地址和物理地址一样
Logical address和physical address在compile和load的地址绑定阶段相同，在执行时是不同的。logical addr由CPU generate，physical addr是存储单元看到的addr。
**Memory-Management Unit（MMU）。**Maps virtual to physical address。动态重定位：logical address（虚拟地址） + relocation base regster得到物理地址。
**Dynamic Loading：动态装入。**在一个进程启动后，将一个可执行文件或库映射到进程内存空间。一般由program支持，也比较难实现。
Dynamic Linking：动态链接。不对那些组成程序的目标文件进行链接，等到程序要运行时才进行链接，不需要library code出现在process内部，节省了运行时的main memory空间也节省了存储空间。需要os support。
Contiguous Memory Allocation连续内存分配：主存一般分成两个部分，os system with interrupt vector（low addr）和user processes（high addr）。每个process都有一个base regsiter（物理地址）和一个limit register（虚拟地址范围，其实就是地址空间的大小），MMU对虚拟地址进行动态映射。
- 地址保护：>=base && < base + limit，则可以访问memory。
Multiple-partition allocation：os保存已经分配的patition和free patition（hole）的信息，当一个进程arrive的时候，从hole分配一个足够大的memory给进程。
- external fragmentation：有很多个比较小的不相邻的hole。可以通过compaction解决，但也不是所有的进程都可以随便移动，比如不是运行时binding的进程；它也会花很大的代价。
- internal fragmentation：系统分配memory的时候一般会多分配一点。
Paging: 逻辑地址空间可以不连续。physical memory的block称为frame，logical memory的block称为page。两个size相同。map page to frame有一个表，这个表就是page table。
page table实现：Page-table base register (PTBR) 指向page table，Page-table length register (PTLR) 表示page table的size。但如果每次访问memory都要去访问page table，其实相当于每次都要做两次memory access，这样代价很大。所以我们需要一个cache（TLB，translation look-aside buffers，快表）来存。如下图所示，page number输入到TLB会进行并行搜索frame number，如果TLB hit，直接得到f；如果miss还是去page table访问。TLB相比于内存访问的时间几乎忽略不计。计算Effective access time。
Some TLBs store address-space identifiers (ASIDs) in each TLB entry – uniquely identifies each process to provide address-space protection for that process
Valid-invalid bit - mem protection
共享页：
- shared code：copy of a read only code；must appear in same location in the logical address space of all processes
- private code & data：Each process keeps a separate copy of the code and data；The pages for the private code and data can appear anywhere in the logical address space
page table结构
- Page table太长了的时候，要给page table也分页，比如two level page table。
- Hashed Page Table：page table有a chain of elements，在链表里面找
- Inverted Page Table：整个系统只有一个page table。每个frame都有一个entry，对应virtual addr和process信息。（找pid，p，排在表的i，i就是frame number。原来是排在表的p，对应的f就是frame number）
Swapping：进程可以被暂时的swap出memory到backing store。
- roll-out, roll-in
- System maintains a ready queue of ready-to-run processes which have memory images on disk
- UNIX, LINUX, WINDOWS
Segmentation: 从user的角度来看memory，即逻辑地址空间下。不一定是连续的。维护信息： <segment number, offset>。分段和分页是会同时存在的。段页式。
- Segment table(limit, base); STBR; STLR; valid bit
The Intel Pentium
- 段页式。逻辑地址空间分成local和global两块；有logical address，linear address，global address。offset前面的部分叫selector，就是Segment+LDT/GDT+Protection。
liunx global dir middle dir page table offset

09 Virtual Memory

Virtual Memory: physical memory与user logical memory的分离。程序只需要部分在memory中，在没改变response time或者turnaround time的情况下，提高了CPU的使用率和吞吐量。logical address space可以比physical address space大得多；address space可以被不同进程共享；进程创建可以更高效。
实现方式：Demand paging或者Demand segmentation
其他好处：system libraries can be shared；shared memory is enabled；process creation can be speeded up（在用fork()进程创建的时候可以share pages，所以加速）
demand paging：只在需要的时候bring a page into main memory。需要的I/O和memory都更少，response更快，允许的user也更多。
Lazy swapper: 直到page需要的时候才把page swap到memory。
page table做一个valid bit，访问一个标记了invalid的page就产生一个page fault，对OS造成一个trap。标记成i的时候可能是在memory中但无效，也有可能是不在memory中，需要区分：
- 就是invalid reference => abort程序
- not in memory => bring to memory
遇到page fault，如果只是不在memory：get empty frame and swap page into frame，reset tables，set validation bit = v，restart the instruction。
$Page\ fault\ rate = p$ , $EAT = (1 - p) * memory\ access \\ + p * (page\ fault\ overhead + swap\ page\ out + swap\ page\ in + restart\ overhead)$
回到原来的程序执行状态：预先试图access，page fault会发生在什么都没修改的时候；或者用temporary register。
virtual memory允许了其他进程创建时候的好处：copy on write，memory-mapped files
copy on write COW：允许父子进程一开始的时候共享一样的page，copy的是page table而不是page本身。只有当其中一个process修改了page的时候，才真正copy出来。read only的时候一直用share的方式。
- 加速了进程创建的过程。free page是从一个zeroed-out page池分配的。
page replacement：如果没有free frame的话，就要进行page replacement。找到memory里一个没有在使用的page并把它换出去。我们希望有一种算法，使得page fault数量最小。
- 用dirty bit来减少page transfer，只有修改过的page才写到disk。page replacement完成了logical和physical memory的分离，因为large virtual memory可以被一个小一些的physical memory提供。
- 找到一个free frame：
  - 有free frame，直接用
  - 没有free frame，用page replacement algorithm选一个替换掉
  - 把它写到secondary storage，改变page和frame table
Page replacement algorithm
- number of page fault是不会到0的，启动的时候就会产生一部分page fault。
- FIFO first in first out：FIFO算法在某些情况下，frame增加page fault反而更多，这不是我们所期望的。Belady’s Anomaly
- Optimal Algorithm：把之后最不会被用到的page替换掉（平手则退化到FIFO）。这当然是最优的，但是我们没法预测哪个是最不会被用到的。
- LRU：合理的假设：最近最少用到的，在near future也不太可能用的到，把它换掉。
  - counter implementation：每次访问把时间copy到counter；需要搜索哪个对应的时间是最早的。
  - stack implementation：当有一个page被refer就移动到栈顶，栈底被选择替换。需要移动很多链表元素，但不需要搜索。
  - 所以match的多的话应该选前者（后者要移动元素），经常有fault的话选后者。
- LRU近似算法：没必要用counter记录下整个时间，为了简化，可以只用一两个bit。
  - reference bit：每一页关联一个bit，init = 0，referenced，set to 1。把bit=0的换掉（如果存在）
  - second chance，clock replacement：如果clock order下要被替换的页面refer bit=1，就set为0，并尝试替换下一个页面。circular queue。
- counting算法：用一个counter记录被reference的次数
  - LFU Algorithm：用的最少的以后用的也最少，所以换掉
  - MFU Algorithm：用的最少的肯定也是最新的，以后应该会被用到
Allocation of Frames：每个process都需要一定的minimum number of frames。两种主要的分配方式：fixed allocation，priority allocation。fixed分配有equal的，也有根据进程size按比例proportional的。priority分配是按照priority number按比例分配。
如果有page fault，找替换的时候有两种方式：替换自己的frame；替换一个比较low priority的frame。
- global replacement：可以选择别的process的frame替换。更常用。缺点：page fault rate很难预测，无法控制自己的page fault rate。
- local replacement：只能选择process自己的frame替换。缺点：free frame很难利用，容易造成low throughput。
Thrashing（颠簸）：某个process忙于swapping pages in and out的现象。
- 如果有个进程pages不够，page fault rate很高，就会导致Thrashing，进而造成CPU利用率很低，进程queue在paging device那边所以ready queue变空，OS/user就会觉得process可以更多（提高multiprogramming的degree），加更多的process，造成恶性循环。
- demand paging：基于locality model，在一段时间里访问这一块pages，下一段时间访问另一些pages……Process migrates from one locality to another。locality overlap
  - 如果size of locality > total memory size，就会发生thrashing
Working-set model
- $\Delta$ ：working set window: 一个固定的page reference次数
- $WSS_i$ $W S S_{i}$ ：working set size of Process $P_i$ $P_{i}$ 。最近 $\Delta$ $Δ$ 访问过的page数目。
  - $\Delta$ 太小则没办法涵盖一个locality，太大了会涵盖好几个
- 怎么尽量避免Thrashing：尽量不让 $D = \Sigma WSS_i > m$ 。 $D$ 是所有进程的demand frames。如果出现这种情况，suspend one of the processes。
Page-Fault Frequency Scheme：每个进程设定一个可接受的page fault rate，如果太低了可以拿出一些frame，太高了就给一些frame。
Memory-Mapped Files：allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory
Buddy system: Allocates memory from fixed-size segment consisting of physically-contiguous pages；Memory allocated using power-of-2 allocator
slab one or more physically contiguous pages

10 FS interface

文件目录的结构和表示，定义

file：连续的逻辑地址空间；一段数据的sequence
文件信息保存在directory structure，在disk保存
open()：返回指向一个open-file table entry的指针
open file需要：file pointer（上次读写的位置），file open count，disk location of the file，access rights
open file lock：mandatory（强制的），advisory（只是提醒有问题）
MS-DOS，MAC-OS-X：每个file记录creator；UNIX-Magic number
文件access方式：sequential，dirent（random），index
文件目录结构：A collection of nodes containing (management) information about all files。都在disk上。
two-level directory，tree-structured
rm（delete file），mkdir（新建文件夹）
efficient，naming，grouping
acyclic-graph directories：可以有不一样的名字。
- general graph directory：允不允许有环？有环的问题：repeat search，file deletion。怎么保证没有环？link只能指向文件不能到子文件夹，garbage collect，cycle detection
mount：A file system must be mounted before it can be accessed。An un-mounted file system is mounted at a mount point。mount /dev/dsk /users
文件共享：protection scheme；On distributed systems, files may be shared across a network；nNetwork File System (NFS) is a common distributed file-sharing method
- remote file system：client sever模型。NFS-UNIX, CIFS-WIN.
- Consistency semantics ：specify how multiple users are to access a shared file simultaneously。AFS-Andrew，UFS-Unix。lAFS has session semantics，Writes only visible to sessions starting after the file is closed

11 FS implementation

文件系统实现，分配方式，题目

文件结构：逻辑存储单元。File control block - inode in UFS，master file table entry in NTFS
层次文件系统
实现FS：Disk structure，in memory structure
文件目录实现：linear list，hash table（fixed size - chained overflow）
分配方式：contiguous，linked，indexed
- 连续分配：有点：简单，只需要开始位置和长度；支持random access；缺点：浪费空间，file cannot grow。map：Q+start addr，R=offset
- extent-based：linked allocation。disk block的链表。简单，只需要起始地址；没有浪费空间；但不支持random access，reliability很差。map：Qth block，offset = R+1。FAT - MS-DOS, OS/2
- indexed：需要index table，支持random access，没有浪费空间但是需要index block的overhead。Q=displacement into index table，R=offset。可以多层index。
free space管理：bit vector，为1则free。bit map需要空间，但是容易get连续file；linked list很难get连续空间，但是可以work（FAT），没有浪费空间；grouping：存储开始的free block的地址，最后一个block存储下面n个free block的地址。counting：第一个free block的addr和连续free block的数量。
保护：指向free list的指针，bit map（存在disk上，memory和disk要一致，所以先把disk的set 1，再deallocate，最后把memory的set 1）
page cache：caches pages rather than disk blocks。memory-mapped IO使用，routine IO用buffer cache。unified buffer cache：cache both
recovery，log structure，NFS

12 Mass Storage System

寻道算法，不同调度方法，交换分区，RAID

disk结构：track：一个盘片上的一个圈；platter：盘片；sector：一个盘片上的一个扇形区域；track sector：一个track上的一个弧；cluster：连在一起的几个track sector；arm, cylinder
seek time：把arm移动到position disk head的时间
rotational delay：block再head下面rotate的时间
transfer time：move data from/to disk surface的时间
disk scheduling：
- FCFS
- SSTF(shortest seek time first)，可能造成饥饿。common。
- SCAN（elevator）朝一个end移动，再反方向end移动
- C-SCAN：到一个end之后，立刻到另一个end，继续SCAN
- C-LOOK：不移动到end，只到request的end
SCAN和CSCAN在disk load比较heavy的系统表现更好。
swap space management：Virtual memory uses disk space as an extension of main memory.
- Kernel uses swap maps to track swap-space use.
RAID：multiple disk drives provides reliability via redundancy
- RAID1：多一组一样的disk
- RAID2：memory-style Hamming code parity
- RAID3：bit interleaved parity
- RAID4：block interleaved parity
- RAID5：block interleaved distributed parity
- RAID6：P+Q redundancy
RAID 0+1: stripe then mirror；1+0：mirror then stripe

13 IO system

DMA

IO硬件：port，bus，controller……
port reg：data in，data out，status，control
polling流程
interrupt流程
DMA direct memory access流程

复习课

chap1
- 操作系统定义；os定义；interrupt概念；interrupt分类（其中有一个system call），同步异步IO
- device status table；存储层次结构；cache和一致性协议；
- multiprogramming和multitasking；user-mode和kernel-mode
chap2
- os服务：资源分配
chap3
- 进程状态，状态transition diagram，waiting state called blocked state
- PCB contents，process scheduling queue，context switch，producer-consumer
- IPC（Interprocess Communication）model
chap4
- 线程概念，user/kernel thread，multi-threading model（many to one，two level……）
- thread specific data structure
chap5
- CPU调度指标criteria：utilization，throughput，……time*3
- 调度算法；SJF not practical
chap6（今年难度中等）
- 解决CS problem的三个要求；抢占式/非抢占式kernel；同步hardware
- semaphore - counting semaphore（P是wait，V是signal）
- producer-consumer（bounded buffer problem）
- readers-writers problem
chap7
- 死锁的四个条件；死锁预防/避免；banker algorithm
chap8
- binding time，逻辑/物理地址，swapping，动态存储分配（FF/BF/WF）
- external/internal fragmentation，内部外部区分
- paging，paging table，TLB，层次page table，inverted page table，注意计算
- EAT，Segmentation：segment table
chap9
- VM的概念，demand paging，page fault，EAT
- Page replacement algorithm - LRU，OPT，NRU，CLOCK，MFU
- thrashing；working-set model
chap10 & 11
- file system，file operation，open file table，access method
- directory structure，tree structure，绝对相对路径
- FCB，虚拟FS，file分配方法：连续/linked/FAT/indexed，extents，Unix file allocation
- Free space mgmt，log structure FS；linux是混合式的
- 给一个文件的逻辑地址换算成物理地址涉及磁盘块的访问（不同分配方式不一样）
chap12
- time to access a disk block
- disk scheduling algorithm
- RAID concept
chap13
- IO操作访问的device，memory mapped IO，blocking/non blocking IO，device status table
- polling，interrupt，DMA
- caching，buffering，spooling（概念了解一下就可以）
- io request的life cycle
OS lab
- ls，cp，ln，mv，echo，gcc，find，grep……以及basic bash/shell command
- how to write a kernel module
- lab project guide要看一下
- building a kernel；boot process
- PCB（task_struct field）
- process state
- create child processes（fork）
- 进程调度：了解一下
- system call：processing，调用，return from system call，实验内容看一下
- kernel的DS：不考
- 中断：中断向量……实现……跳过了
- kernel memory，addr space是怎么管理的，了解一下，VMA，MM
  - page fault处理的流程，buddy算法，slab分配（大小不一的数据结构统一管理）
- linux FS：linux FCB（混合式的存储结构）：关注VFS，内存里有与FCB对应的VFS，为了得到inode，如果VFS已经mount，要从内存访问相应的磁盘里的FCB，然后去找文件；open，read，实验三guide