6.S081 - Lecture 1 Introduciton

本文最后更新于：2024年4月23日下午

感觉学校的OS课程写的过于草率，加上最近似乎没有什么事情做，那就开一个新坑吧！

Lecture 1: Introduction

操作系统的工作是在多个程序之间共享计算机，并提供比硬件单独支持的更有用的服务集。

管理和抽象低级硬件
在多个程序之间共享硬件
玮程序交互提供受控方式

操作系统通过接口项用户程序提供服务。

简单而狭窄、易于正确实现
提供更多复杂功能
tradeoff：设计依赖于一些机制的界面，组合提供更多的通用性。

操作系统应该提供的功能：

多进程支持
进程间隔离
受控制的进程间通信

xv6：一种在本课程中使用的类UNIX的教学操作系统，运行在RISC-V指令集处理器上，本课程中将使用QEMU模拟器代替
kernel(内核)：为运行的程序提供服务的一种特殊程序。每个运行着的程序叫做进程，每个进程的内存中存储指令、数据和堆栈。一个计算机可以拥有多个进程，但是只能有一个内核

每当进程需要调用内核时，它会触发一个system call（系统调用），system call进入内核执行相应的服务然后返回。

shell：一个普通的程序，其功能是让用户输入命令并执行它们，shell不是内核的一部分，这意味着外壳易于更换

进程与内存

一个 xv6 进程由两部分组成，一部分是用户内存空间（指令，数据，栈），另一部分是仅对内核可见的进程状态。每个进程拥有自己的用户空间内存以及内核空间状态，当进程不再执行时，xv6将存储和这些进程相关的CPU寄存器直到下一次运行这些进程。kernel将每一个进程用一个PID(process identifier)指代。

常用syscall

fork：形式：int fork()。其作用是让一个进程生成另外一个和这个进程的内存内容相同的子进程。在父进程中，fork的返回值是这个子进程的PID，在子进程中，返回值是0
exit：形式：int exit(int status)。让调用它的进程停止执行并且将内存等占用的资源全部释放。需要一个整数形式的状态参数，0代表以正常状态退出，1代表以非正常状态退出

wait：形式：int wait(int *status)。等待子进程退出，返回子进程PID，子进程的退出状态存储到int *status这个地址中。如果调用者没有子进程，wait将返回-1

int pid = fork();
if (pid > 0) {
    printf("parent: child=%d\n", pid);
    pid = wait((int *) 0);
    printf("child %d is done\n", pid);
} else if (pid == 0) {
    printf("child: exiting\n");
    exit(0);
} else {
    printf("fork error\n");
}

前两行输出可能是

1 2	`parent: child=1234 child: exiting`

也可能是

1 2	`child: exiting parent: child=1234`

这是因为在fork了之后，父进程和子进程将同时开始判断PID的值，在父进程中，PID为1234，而在子进程中，PID为0。看哪个进程先判断好PID的值，以上输出顺序才会被决定。

最后一行输出为：

1	`parent: child 1234 is done`

子进程在判断完pid == 0之后将exit，父进程发现子进程exit之后，wait执行完毕，打印输出。

尽管fork了之后子进程和父进程有相同的内存内容，但是内存地址和寄存器是不一样的，也就是说在一个进程中改变变量并不会影响另一个进程。

exec：形式：int exec(char *file, char *argv[])。加载一个文件，获取执行它的参数，执行。如果执行错误返回-1，执行成功则不会返回，而是开始从文件入口位置开始执行命令。文件必须是ELF格式。

xv6 shell使用以上四个system call来为用户执行程序。在shell进程的main中主循环先通过getcmd来从用户获取命令，然后调用fork来运行一个和当前shell进程完全相同的子进程。父进程调用wait等待子进程exec执行完（在runcmd中调用exec）。

/* sh.c */
int
main(void)
{
  static char buf[100];
  int fd;

  // Ensure that three file descriptors are open.
  while((fd = open("console", O_RDWR)) >= 0){
    if(fd >= 3){
      close(fd);
      break;
    }
  }

  // Read and run input commands.
  while(getcmd(buf, sizeof(buf)) >= 0){
    if(buf[0] == 'c' && buf[1] == 'd' && buf[2] == ' '){
      // Chdir must be called by the parent, not the child.
      buf[strlen(buf)-1] = 0;  // chop \n
      if(chdir(buf+3) < 0)
        fprintf(2, "cannot cd %s\n", buf+3);
      continue;
    }
    if(fork1() == 0)
      runcmd(parsecmd(buf));
    wait(0);
  }
  exit(0);
}

I/O 和文件描述符

file descriptor：文件描述符，用来表示一个被内核管理的、可以被进程读/写的对象的一个整数，表现形式类似于字节流，通过打开文件、目录、设备等方式获得。一个文件被打开得越早，文件描述符就越小。

每个进程都拥有自己独立的文件描述符列表，其中0是标准输入，1是标准输出，2是标准错误。shell将保证总是有3个文件描述符是可用的。
1
2
3
4
5
6
while((fd = open("console", O_RDWR)) >= 0){ if(fd >= 3){ close(fd); break; } }

read和write：形式int write(int fd, char *buf, int n)和int read(int fd, char *bf, int n)。从/向文件描述符fd读/写n字节bf的内容，返回值是成功读取/写入的字节数。每个文件描述符有一个offset，read会从这个offset开始读取内容，读完n个字节之后将这个offset后移n个字节，下一个read将从新的offset开始读取字节。write也有类似的offset。

/* essence of cat program */
char buf[512];
int n;

for (;;) {
    n = read(0, buf, sizeof buf);
    if (n == 0)
        break;
    if (n < 0){
        fprintf(2, "read errot\n");
        exit(1);
    }
    if (write(1, buf, n) != n){
        fprintf(2, "write error\n");
        exit(1);
    }
}

close。形式是int close(int fd)，将打开的文件fd释放，使该文件描述符可以被后面的open、pipe等其他system call使用。

使用close来修改file descriptor table能够实现I/O重定向。

/* implementation of I/O redirection,
 * more specifically, cat < input.txt
 */
char *argv[2];
argv[0] = "cat";
argv[1] = 0;

if (fork() == 0) {
    // in the child process
    close(0);  // this step is to release the stdin file descriptor
    open("input.txt", O_RDONLY); // the newly allocated fd for input.txt is 0, since the previous fd 0 is released
    exec("cat", argv); // execute the cat program, by default takes in the fd 0 as input, which is input.txt
}

父进程的fd table将不会被子进程fd table的变化影响，但是文件中的offset将被共享。考虑这个例子

if(fork() == 0) {
    write(1, "hello ", 6);
    exit();
} else {
    wait();
    write(1, "world\n", 6);
}

绑定在文件描述符1上的文件有数据”hello world”，父进程的write会从子进程write结束的地方继续写 (因为wait,父进程只在子进程结束之后才运行write)。这种行为有利于顺序执行的 shell 命令的顺序输出，例如(echo hello; echo world)>output.txt。

dup。形式是int dup(int fd)，复制一个新的fd指向的I/O对象，返回这个新fd值，两个I/O对象(文件)的offset相同。
1
2
3
4
fd = dup(1); write(1, "hello ", 6); write(fd, "world\n", 6); // outputs hello world
除了dup和fork之外，其他方式不能使两个I/O对象的offset相同，比如同时open相同的文件。

管道

pipe：管道，暴露给进程的一对文件描述符，一个文件描述符用来读，另一个文件描述符用来写，将数据从管道的一端写入，将使其能够被从管道的另一端读出

pipe是一个system call，形式为int pipe(int p[])，p[0]为读取的文件描述符，p[1]为写入的文件描述符。

/* run the program wc with stdin connected to the read end of pipe, parent process able to communicate with child process */
int p[2];
char *argv[2];

argv[0] = "wc";
argv[1] = 0;

pipe(p); // read fd put into p[0], write fd put into p[1]
if (fork() == 0) {
    close(0);
    dup(p[0]); // make the fd 0 refer to the read end of pipe
    close(p[0]); // original read end of pipe is closed
    close(p[1]); // fd p[1] is closed in child process, but not closed in the parent process. 注意这里关闭p[1]非常重要，因为如果不关闭p[1]，管道的读取端会一直等待读取，wc就永远也无法等到EOF
    exec("/bin/wc", argv); // by default wc will take fd 0 as the input, which is the read end of pipe in this case
} else {
    close(p[0]); // close the read end of pipe in parent process will not affect child process
    write(p[1], "hello world\n", 12); 
    close(p[1]); // write end of pipe closed, the pipe shuts down
}

这段程序调用pipe，创建一个新的管道并且将读写描述符记录在数组p中。

在fork之后，父进程和子进程都有了指向管道的文件描述符。子进程将管道的读端口拷贝在描述符0上，关闭p中的描述符，然后执行wc。当wc从标准输入读取时，它实际上是从管道读取的。父进程向管道的写端口写入然后关闭它的两个文件描述符。

xv6中的实现和上述的类似：

case PIPE:
pcmd = (struct pipecmd*)cmd;
if(pipe(p) < 0)
    panic("pipe");
if(fork1() == 0){
    // in child process
    close(1); // close stdout
    dup(p[1]); // make the fd 1 as the write end of pipe
    close(p[0]);
    close(p[1]);
    runcmd(pcmd->left); // run command in the left side of pipe |, output redirected to the write end of pipe
}
if(fork1() == 0){
    // in child process
    close(0); // close stdin
    dup(p[0]); // make the fd 0 as the read end of pipe
    close(p[0]);
    close(p[1]);
    runcmd(pcmd->right); //  run command in the right side of pipe |, input redirected to the read end of pipe
}
close(p[0]);
close(p[1]);
wait(0); // wait for child process to finish
wait(0); // wait for child process to finish
break;

这里我有点没搞懂为什么要在写入/读取之前执行一些close操作（包括wc和Lab中的pingpong），在网上查了资料之后，我大概理解了。先来看看管道是如何实现进程之间的通信的：

父进程创建管道，得到两个文件描述符指向管道的两端
父进程fork出子进程，子进程也有两个文件描述符指向同⼀管道
父进程关闭fd[0],子进程关闭fd[1]，即父进程关闭管道读端,子进程关闭管道写端（因为管道只支持单向通信）。父进程可以往管道⾥写，子进程可以从管道里读,管道是⽤环形队列实现的,数据从写端流⼊从读端流出,这样就实现了进程间通信。

所以两个进程要相互通信，可以创建两个pipe分别关闭读写端，两个单向就变成了双向了。

文件系统

xv6文件系统包含了文件(byte arrays)和目录(对其他文件和目录的引用)。目录生成了一个树，树从根目录/开始。对于不以/开头的路径，认为是是相对路径

mknod：创建设备文件，一个设备文件有一个major device #和一个minor device #用来唯一确定这个设备。当一个进程打开了这个设备文件时，内核会将read和write的system call重新定向到设备上。
一个文件的名称和文件本身是不一样的，文件本身，也叫inode，可以有多个名字，也叫link，每个link包括了一个文件名和一个对inode的引用。一个inode存储了文件的元数据，包括该文件的类型(file, directory or device)、大小、文件在硬盘中的存储位置以及指向这个inode的link的个数
fstat。一个system call，形式为int fstat(int fd, struct stat *st)，将inode中的相关信息存储到st中。
link。一个system call，将创建一个指向同一个inode的文件名。unlink则是将一个文件名从文件系统中移除，只有当指向这个inode的文件名的数量为0时这个inode以及其存储的文件内容才会被从硬盘上移除

注意：Unix提供了许多在用户层面的程序来执行文件系统相关的操作，比如mkdir、ln、rm等，而不是将其放在shell或kernel内，这样可以使用户比较方便地在这些程序上进行扩展。但是cd是一个例外，它是在shell程序内构建的，因为它必须要改变这个calling shell本身指向的路径位置，如果是一个和shell平行的程序，那么它必须要调用一个子进程，在子进程里起一个新的shell，再进行cd，这是不符合常理的。

Lab：Xv6 and Unix utilities

Boot xv6

$ git clone git://g.csail.mit.edu/xv6-labs-2021
Cloning into 'xv6-labs-2021'...
...
$ cd xv6-labs-2021
$ git checkout util
Branch 'util' set up to track remote branch 'util' from 'origin'.
Switched to a new branch 'util'

$ make qemu
riscv64-unknown-elf-gcc    -c -o kernel/entry.o kernel/entry.S
riscv64-unknown-elf-gcc -Wall -Werror -O -fno-omit-frame-pointer -ggdb -DSOL_UTIL -MD -mcmodel=medany -ffreestanding -fno-common -nostdlib -mno-relax -I. -fno-stack-protector -fno-pie -no-pie   -c -o kernel/start.o kernel/start.c
...  
riscv64-unknown-elf-ld -z max-page-size=4096 -N -e main -Ttext 0 -o user/_zombie user/zombie.o user/ulib.o user/usys.o user/printf.o user/umalloc.o
riscv64-unknown-elf-objdump -S user/_zombie > user/zombie.asm
riscv64-unknown-elf-objdump -t user/_zombie | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$/d' > user/zombie.sym
mkfs/mkfs fs.img README  user/xargstest.sh user/_cat user/_echo user/_forktest user/_grep user/_init user/_kill user/_ln user/_ls user/_mkdir user/_rm user/_sh user/_stressfs user/_usertests user/_grind user/_wc user/_zombie 
nmeta 46 (boot, super, log blocks 30 inode blocks 13, bitmap blocks 1) blocks 954 total 1000
balloc: first 591 blocks have been allocated
balloc: write bitmap block at sector 45
qemu-system-riscv64 -machine virt -bios none -kernel kernel/kernel -m 128M -smp 3 -nographic -drive file=fs.img,if=none,format=raw,id=x0 -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0

xv6 kernel is booting

hart 2 starting
hart 1 starting
init: starting sh
$ 

$ ls
.              1 1 1024
..             1 1 1024
README         2 2 2059
xargstest.sh   2 3 93
cat            2 4 24256
echo           2 5 23080
forktest       2 6 13272
grep           2 7 27560
init           2 8 23816
kill           2 9 23024
ln             2 10 22880
ls             2 11 26448
mkdir          2 12 23176
rm             2 13 23160
sh             2 14 41976
stressfs       2 15 24016
usertests      2 16 148456
grind          2 17 38144
wc             2 18 25344
zombie         2 19 22408
console        3 20 0

sleep（easy）

简单包装系统调用sys_sleep()。

注意：如果没有传入参数，需要打印错误信息。

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int main(int argc, char *argv[])
{
    if (argc <= 1) {
        fprintf(2, "sleep: time argument required\n");
        exit(1);
    }
    int time;
    time = atoi(argv[1]);
    sleep(time);
    exit(0);
}

pingpong（easy）

开两个pipe，一个pipe负责子进程写父进程读，另一个pipe负责父进程写子进程读。

注意最后要把所有的pipe fd关闭掉。

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int main(int argc, char *argv[])
{
    int p1[2];
    int p2[2];
    int pid;
    char buf;

    pipe(p1);
    pipe(p2);

    pid = fork();
    if (pid < 0)
    {
        exit(1);
    } 
    else if (pid == 0)
    {
        // children
        close(p1[1]);
        close(p2[0]);
        read(p1[0], &buf, 1);
        printf("%d: received ping\n", pid);
        write(p2[1], " ", 1);
        close(p1[0]);
        close(p2[1]);
        exit(0);
    }
    else
    {
        // parent
        close(p1[0]);
        close(p2[1]);
        write(p1[1], " ", 1);
        read(p2[0], &buf, 1);
        printf("%d: received pong\n", pid);
        close(p1[1]);
        close(p2[0]);
        exit(0);
    }
}

prime（moderate/hard）

素数筛法：将一组数feed到一个进程里，先print出最小的一个数，这是一个素数，然后用其他剩下的数依次尝试整除这个素数，如果可以整除，则将其drop，不能整除则将其feed到下一个进程中，直到最后打印出所有的素数。

采用递归，每次先尝试从左pipe中读取一个数，如果读不到说明已经到达终点，exit，否则再创建一个右pipe并fork一个子进程，将筛选后的数feed进这个右pipe。

注意最开始的父进程要等待所有子进程exit才能exit。

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

#define MAX 35
#define READ 0
#define WRITE 1

void child(int *);

int main(int argc, char *argv[])
{
    int p[2];
    pipe(p);
    if (fork() == 0)
    {
        child(p);
    }
    else
    {
        close(p[READ]);
        for (int i = 2; i <= MAX; i++)
        {
            write(p[WRITE], &i, sizeof(int));
        }
        close(p[WRITE]);
        wait(0);
    }
    exit(0);
}

void child(int *pp)
{
    int q[2];
    int n;
    int prime;
    close(pp[WRITE]);
    int res = read(pp[READ], &n, sizeof(int));
    if (res == 0)
        exit(0);
    
    pipe(q);
    if (fork() == 0)
    {
        child(q);
    }
    else
    {
        close(q[READ]);
        printf("prime %d\n", n);
        prime = n;
        while (read(pp[READ], &n, sizeof(int)) != 0)
        {
            if (n % prime)
            {
                write(q[WRITE], &n, sizeof(int));
            }
        }
        close(q[WRITE]);
        wait(0);
        exit(0);
    }
}

find（moderate）

参照ls的实现即可。注意递归查找时忽略.和..。

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
#include "kernel/fs.h"

void find(char *, char *);

int main(int argc, char *argv[])
{
    if (argc != 3)
    {
        fprintf(2, "find: wrong arguments count");
        exit(1);
    }
    char *path = argv[1];
    char *file = argv[2];
    find(path, file);
    exit(0);
}

void find(char *path, char *target)
{
    char buf[512], *p;
    int fd, fd1;
    struct dirent de;
    struct stat st, st1;

    if ((fd = open(path, 0)) < 0)
    {
        fprintf(2, "path error\n");
        return;
    }
    if (fstat(fd, &st) < 0)
    {
        fprintf(2, "path stat failed\n");
        close(fd);
        return;
    }

    switch (st.type)
    {
        case T_FILE:
            fprintf(2, "path_error\n");
            return;
        case T_DIR:
            strcpy(buf, path);
            p = buf + strlen(buf);
            *(p++) = '/';
            while (read(fd, &de, sizeof(de)) == sizeof(de))
            {
                if (de.inum == 0)
                    continue;
                if (!strcmp(de.name, ".") || !strcmp(de.name, ".."))
                    continue;
                memmove(p, de.name, DIRSIZ);
                if ((fd1 = open(buf, 0)) >= 0)
                {
                    if (fstat(fd1, &st1) >= 0)
                    {
                        switch (st1.type) {
                            case T_FILE:
                                if (!strcmp(de.name, target))
                                printf("%s\n", buf);
                                close(fd1);
                                break;
                            case T_DIR:
                                close(fd1);
                                find(buf, target);
                                break;
                            case T_DEVICE:
                                close(fd1);
                                break;
                        }
                    }
                }
            }
            break;
    }
}

xargs（moderate）

xargs指令的含义可以参照这里。

使用fork起一个子进程，在子进程中用exec执行相应的命令。父进程wait。对标准输入每次读一个char，若读到\n需要执行命令。

#include "kernel/types.h"
#include "user/user.h"
#include "kernel/param.h"
#define MAX_LEN 100

int main(int argc, char *argv[])
{
	char *command = argv[1];
	char bf;
	char paramv[MAXARG][MAX_LEN];
	char *m[MAXARG];

	while (1)
    {
		int count = argc - 1;
		memset(paramv, 0, MAXARG * MAX_LEN);
		for (int i=1; i<argc; i++)
        {
			strcpy(paramv[i-1], argv[i]);
		}

		int cursor = 0;
		int flag = 0;
		int read_result;

		while (((read_result = read(0, &bf, 1))) > 0 && bf != '\n')
        {
			if (bf == ' ' && flag == 1)
            {
				count++;
				cursor = 0;
				flag = 0;
			}
			else if (bf != ' ')
            {
				paramv[count][cursor++] = bf;
				flag = 1;
			}
		}
		if (read_result <= 0)
        {
			break;
		}
		for (int i = 0; i < MAXARG - 1; i++)
        {
			m[i] = paramv[i];
		}
		m[MAXARG - 1] = 0;
		if (fork() == 0)
        {
			exec(command, m);
			exit(0);
		} 
        else
        {
			wait((int *) 0);
		}
	}
	exit(0);
}

课外拓展

#操作系统 #cs自学指南

6.S081 - Lecture 1 Introduciton

https://galaxy-jewxw.github.io/2024/04/23/6-S081-Lecture-1-Introduciton/

作者

Traumtänzer aka 'Jew1!5!'

发布于

2024年4月23日

许可协议

6.S081 - Lecture 2 Operating System Organization 上一篇

OS实验——Lab0串讲下一篇