参考论文:CARR01.pdf

竞争条件和信号量

如何准确确定程序中出现的竞争条件有一定难度:

问题陈述

假设两个进程 A 和 B,各自都由若干并发执行的线程组成,每个线程都包含一个无限循环,每个循环中有一个要与另一进程中某个线程交换的消息。每个消息由放在共享全局缓冲区的一个整数构成。

由此产生两个需求:

  1. 进程 A 中线程 A1 的消息对进程 B 中线程 B1 可用后,A1 只有接收到 B1 的消息后才能进行。类似的,B1 消息对 A1 可用后,也要接收到 A1 的消息才能进行下去;
  2. 一旦线程 A1 的消息可用,就必须保证 B 中线程重新获得消息之前,A 中其它线程不能覆盖全局缓冲区。

err_algo 1:简单握手

semaphore a=0,b=0;
int buf_a,buf_b;

thread_A(){
	int var_a;
	...;
	while(1){
		...
		var_a=...;
		semSignal(b);
		semWait(a);
		buf_a=var_a;
		var_a=buf_b;
		...;
	}
}

thread_B(){
	int var_b;
	...
	while(1){
		...
		var_b=...;
		semSignal(a);
		semWait(b);
		buf_b=var_b;
		var_b=buf_a;
		...;
	}
}

一种简单的握手协议,当 A 的线程 A1 准备好交换信息时,向 B 的线程发送信号,等待 B 中线程 B1 准备好;一旦 A 执行 semWait(a)得到信号从 B1 返回,就假定 B1 做好了交换准备。不论哪个线程先准备好,交换都会发生。

但是该算法会导致竞争条件,思考如下事件序列:

Thread A1Thread B1
semSignal (b)
semWait (a)
semSignal (a)
semWait (b)
buf_a=var_a
var_a=buf_b
buf_b=var_b
A1 到达 semWait(a) 时阻塞,B1 到达 semWait(b) 未阻塞,但是在 B1 更新 buf_b 之前就会被交换出去,然而此时 A1 想要读取的 buf_b 还没有被 B1 更新,这是一个竞争条件。

若 A 和 B 中的两个线程都活跃,那么:

Thread A1Thread A2Thread B1Thread B2
semSignal (b)
semWait (a)
semSignal (a)
semWait (b)
semSignal (b)
semWait (a)
buf_b=var_b1
semSignal (a)
buf_a=val_a1
buf_a=var_a2
A1 和 B1 尝试交换消息并完成合适的信号量发信号指令。但是随着两个 semWait 的执行,A2 和 B2 获得信号分别开始执行 semSignal(b),semWait(a)semSignal(a)。这时 A1 或 A2 都可能更新 buf_a,这就产生了竞争条件。

经验:多线程共享一个变量时,除非使用合适的互斥保护,否则可能发生竞争条件。

err_algo 2:信号量 mutex 保护临界区

semaphore a=0,b=0; mutex=1;
int buf_a,buf_b;

thread_A(){
	int var_a;
	...;
	while(1){
		...
		var_a=...;
		semSignal(b);
		semWait(a);
			semWait(mutex);
				buf_a=var_a;
			semSignal(mutex);
		semSignal(b);
		semWait(a);
			semWait(mutex);
				var_a=buf_b;
			semSignal(mutex);
		...;
	}
}

thread_B(){
	int var_b;
	...
	while(1){
		...
		var_b=...;
		semSignal(a);
		semWait(b);
			semWait(mutex);
				buf_b=var_b;
			semSignal(mutex);
		semSignal(a);
		semWait(b);
			semWait(mutex);
				var_b=buf_a;
			semSignal(mutex);
		...;
	}
}

信号量 mutex 保护 buf_a 和 buf_b,试图保证更新以前的数据。但这种保护并不充分,一旦两个线程完成了第一次握手,信号量 a 和 b 的值都是 1,则会发生三种以下情形,这三种情形都会引发竞争条件:

  1. A1 和 B1 完成第一次握手后,继续进行第二个阶段的消息交换;
  2. 另一线程开始它们的第一阶段;
  3. 当前线程中的一个线程和新来的另一线程对中的一个线程进行交换消息
Thread A1Thread A2Thread B1
semSignal (b)
semWait (a)
semSignal (a)
semWait (b)
buf_a=var_a1
buf_b=var_b1
semSignal (b)
semWait (a)
semSignal (a)
semWait (b)
buf_a=var_a2
A1 和 B1 完成第一次握手后,都更新了相应缓冲区;然后 A2 开始第一次握手,紧跟着 B1 开始第二次握手,此时 A2 在 B1 重新获取 A1 存放在 buf_a 中的值之前,更新了 buf_a 中的值,导致原值丢失。这是一个竞争条件。

经验:若变量是一个较长执行序列的一部分,则只保护信号变量可能不够,要保护整个执行序列。

err_algo 3

将临界区扩展为包含整个信息交换——两个线程中每个线程都会更新两缓冲区之一,并从另一个缓冲区读取数据——双缓冲策略:

semaphore aready=1,adone=0,bready=1,bdone=0;
int buf_a,buf_b;

thread_A(){
	int var_a;
	...;
	while(1){
		...
		var_a=...;
		semWait(aready);
			buf_a=var_a;
			semSignal(adone);
			semWait(bdone);
			var_a=buf_b;
		semSignal(aready);
		...;
	}
}

thread_B(){
	int var_b;
	...
	while(1){
		...
		var_b=...;
		semWait(bready);
			buf_b=var_b;
			semSignal(bdone);
			semWait(adone);
			var_b=buf_a;
		semSignal(bready);
		...;
	}
}
  • 信号量 aready 确保 A 中没有其他线程能更新 buf_a,同时 A 中一个线程进入其临界区;adone 确保 B 中没有线程会读取 buf_a,直到已经更新 buf_a。bready 和 bdone 同理;
Thread A1Thread B1
buf_a=var_a;
semSignal (adone);
semWait (bdone);
buf_b=var_b
semSignal (bdone)
semWait (adone)
var_a=buf_b
semSignal (aready)
… loop back
semWait (aready)
buf_a=var_a
var_b=buf_a
这个序列中,A1 和 B1 都进入了各自临界区存储消息,并到达第二个等待,然后 A1 复制来自 B1 的消息并离开 buf_a。接下来有两种可能,都会产生竞争条件、丢失消息:
  • A1 可能返回它的程序,产生新的消息,并存储在 buf_a 中;
  • 另一种可能是,A2 可能产生一个消息放到 buf_a,这样会丢失消息;

经验:若有许多协同运行的线程组,则保证一组线程的互斥可能不会阻止另一组线程的冲突;此外若一个线程可以重复进入临界区,则线程间的协作时间也要适当管理。

err_algo 4

强制保留一个线程在临界区,直到另一个线程重新获取消息:

semaphore aready=1,adone=0,bready=1,bdone=0;
int buf_a,buf_b;

thread_A(){
	int var_a;
	...;
	while(1){
		...
		var_a=...;
		semWait(bready);
			buf_a=var_a;
			semSignal(adone);
			semWait(bdone);
			var_a=buf_b;
		semSignal(aready);
		...;
	}
}

thread_B(){
	int var_b;
	...
	while(1){
		...
		var_b=...;
		semWait(aready);
			buf_b=var_b;
			semSignal(bdone);
			semWait(adone);
			var_b=buf_a;
		semSignal(bready);
		...;
	}
}
  • A1 线程进入临界区,此时 bready 为 0,A 中没有后续线程交换消息,直到 B 中一个线程完成消息交换并将 bready 置为 1.

仍会产生竞争条件:

Thread A1Thread A2Thread B1
semWait (bready)
buf_a=var_a1
semSignal (adone)
semWait (aready)
buf_b=var_b1
semSignal (bdone)
semWait (adone)
var_b=buf_a
semSignal (bready)
semWait (bready)
semWait (bdone)
var_a2=buf_b
  • A1 和 B1 为了交换消息而进入相应的临界区。线程 B1 重新获取其消息并发信号 bready,使得 A 中另一线程 A2 进入临界区,若 A2 比 A1 执行得快,则 A2 会获取发给 A1 的消息;

经验:若实现互斥的信号量不能被其所有者释放,则会产生竞争条件。algo4 中信号量首先被 A 中一个线程锁定,然后由 B 中一个线程解锁,是一种危险的编程实践。

proper_algo

关于缓冲区边界变化的问题,最直接的解决办法是双缓冲策略,一个用于 AB,一个用于 BA,每个缓冲区大小为 1(并发场景中线程释放顺序不确定,若缓冲区大小不为 1,则不能保证消息的正确匹配)

semaphore notFull_A=1,notFull_B=1;
semaphore notEmpty_A=0,notEmpty_B=0;
int buf_a,buf_b;

thread_A(){
	int var_a;
	...;
	while(1){
		...
		var_a=...;
		semWait(notFull_A);
			buf_a=var_a;
			semSignal(notEmpty_A);
		semWait(notEmpty_B);
			var_a=buf_b;
			semSignal(notFull_B);
		...;
	}
}

thread_B(){
	int var_b;
	...
	while(1){
		...
		var_b=...;
		semWait(notFull_B);
			buf_b=var_b;
			semSignal(notEmpty_B);
		semWait(notEmpty_A);
			var_b=buf_a;
			semSignal(notFull_A);
		...;
	}
}
  • 线程组内的消息交换区必须是互斥的。notFull_A 初值为 1,A 中只有一个线程能通过 semWait (notFull_A),直到 B 中线程完成交换时执行 semSignal (notFull_A)发出信号;对 B 中线程同理。
  • 一旦两个线程进入它们的临界区,他们之间交换消息就不会受其他任何线程干扰。A 中其它线程直到 B 中线程完成消息交换才能进入临界区;对 B 中线程同理。
  • 一个线程离开临界区后,同一组中就没有线程能够立即销毁存在的消息。这个条件之所以满足,是因为每个方向用的都是一个插槽的缓冲区,一旦 A 中一个线程执行 semWait (notFull_A)进入临界区,A 中就没有其它线程能够更新 buf_a, 直到 B 中相关线程获取了 buf_a 并发出信号 semSignal (notFull_A)

Barbershop problem

  • 理发店有 3 把椅子、3 名理发师、可供 4 顾客的等候区沙发、其它可供顾客站立的空间。理发店里总共能容纳的人数最多为 20,而理发店要接待 50 名顾客。
  • 若理发店人数已满,则顾客不会进入;一旦进入,顾客首选坐在沙发上,若沙发满了就站着。
  • 理发师空闲时向沙发上等待时间最长的顾客提供服务,同样沙发有空位时由等待最久的站着的顾客填入。
  • 顾客理发结束后,任何一名理发师都可以收钱,但只有一台收款机,因此一次只有一个顾客可以付款。理发师因此将自己的时间划分为理发、收款、在椅子上休息。

unfair solution

/* program barbershop1 */
semaphore max_capacity = 20;
semaphore sofa = 4;
semaphore barber_chair = 3;
semaphore coord = 3;
semaphore cust_ready = 0, finished = 0, leave_b_chair = 0, payment= 0, receipt
= 0;

void customer ()
{
	semWait(max_capacity);
	enter_shop();
	semWait(sofa);
	sit_on_sofa();
	semWait(barber_chair);
	get_up_from_sofa();
	semSignal(sofa);
	sit_in_barber_chair();
	semSignal(cust_ready);
	semWait(finished);
	leave_barber_chair();
	semSignal(leave_b_chair);
	pay();
	semSignal(payment);
	semWait(receipt);
	exit_shop();
	semSignal(max_capacity)
}

void barber()
{
	while (true)
	{
		semWait(cust_ready);
		semWait(coord);
		cut_hair();
		semSignal(coord);
		semSignal(finished);
		semWait(leave_b_chair);
		semSignal(barber_chair);
	}
}

void cashier()
{
	while (true)
	{ 
		semWait(payment);
		semWait(coord);
		accept_pay();
		semSignal(coord);
		semSignal(receipt);
	}
}

void main()
{
	parbegin (customer, . . . 50 times, . . . customer, barber, barber, barber,
	cashier);
}
  • Shop and sofa capacity: The capacity of the shop and the capacity of the sofa are governed by the semaphores max_capacity and sofa, respectively.
    • Every time a customerattempts to enter the shop, the max_capacity semaphore isdecremented by 1; every time a customer leaves, the semaphoreis incremented.
    • If a customer finds the shop full, then that customer’s process is blocked on max_capacity by the semWait function.
    • Similarly, the semWait and semSignal operations surround the actions of sitting on and getting up from the sofa.
  • Barber chair capacity: There are three barber chairs, and care must be taken that they are used properly. The semaphore barber_chair assures that no more than three customers attempt to obtain service at a time, trying to avoid the undignified occurrence of one customer sitting on the lap of another.
    • A customer will not get up from the sofa until at least one chair is free [semWait(barber_chair)], and each barber signals when a customer has left that barber’s chair [semSignal(barber_chair)].
    • Fair access to the barber chairs is guaranteed by the semaphore queue organization: The first customer to be blocked is the first one allowed into an available chair. Note that, in the customer procedure, if semWait(barber_chair) occurred after semSignal(sofa) , each customer would only briefly sit on the sofa then stand in line at the barber chairs, creating congestion and leaving the barbers with little elbow room.
  • Ensuring customers are in barber chair: The semaphore cust_ready provides a wakeup signal for a sleeping barber, indicating that a customer has just taken a chair.
    • Without this semaphore, a barber would never sleep but would begin cutting hair as soon as a customer left the chair; if no new customer had grabbed the seat, the barber would be cutting air.
  • Holding customers in barber chair: Once seated, a customer remains in the chair until the barber gives the signal that the haircut is complete, using the semaphore finished.
  • Limiting one customer to a barber chair: The semaphore barber_chair is intended to limit the number of customers in barber chairs to three. However, by itself, barber_chair does not succeed in doing this.
    • A customer that fails to get the processor immediately after his barber executes semSignal(finished) (i.e., one who falls into a trance or stops to chat with a neighbor) may still be in the chair when the next customer is given the go ahead to be seated.
    • The semaphore leave_b_chair is intended to correct this problem by restraining the barber from inviting a new customer into the chair until the lingering one has announced his departure from it. (In the problems at the end of this chapter, we will find that even this precaution fails to stop the mettlesome customer lap sittings.)
  • Paying and receiving: Naturally, we want to be careful when dealing with money. The cashier wants to be assured that each customer pays before leaving the shop, and the customer wants verification that payment was received (a receipt). This is accomplished, in effect, by a face-to-face transfer of the money.
    • Each customer, upon arising from a barber chair, pays, alerts the cashier that money has been passed over [semSignal(payment)],then waits for a receipt [semWait(receipt)].
    • The cashier process repeatedly takes payments: It waits for a payment to be signaled, accepts the money, then signals acceptance of the money.
    • Several programming errors need to be avoided here. If semSignal(payment) occurred just before the action pay, then a customer could be interrupted after signaling; this would leave the cashier free to accept payment even though none had been offered. An even more serious error would be to reverse the positions of the semSignal(payment) and semWait(receipt) lines. This would lead to deadlock because that would cause all customers and the cashier to block at their respective semWait operators.
  • Coordinating barber and cashier functions: To save money, this barbershop does not employ a separate cashier. Each barber is required to perform that task when not cutting hair. The semaphore coord ensures that barbers perform only one task at a time.
SemaphoreWait OperationSignal Operation
max_capacity顾客等待进入理发店离开的顾客向等待进入理发店的顾客发信号
sofa顾客等待沙发座位离开沙发的顾客向等待沙发的顾客发信号
barber_chair顾客等待空闲理发椅理发椅空出时,理发师向等待理发的顾客发信号
cust_ready理发师等待顾客坐在理发椅上顾客坐在理发椅上后,向理发师发送信号
finished顾客等待理完发理发师理完发后向顾客发送信号
leave_b_chair理发师等待顾客离开理发椅顾客离开理发椅时向理发师发信号
payment收银员等待顾客付款顾客向收银员发已付款信号
receipt顾客等待支付收据收银员发已接收支付信号
coord等待理发师资源空闲,执行理发或收银功能发理发师资源空闲信号

fair solution

unfair solution 中有一个时间问题,会导致不公平地对待顾客:

  • 假设三名顾客同时在理发椅上就座,在这种情况下,顾客会被阻塞在 semWait (finished)上,而由于队列的组织方式,它们会按照进入理发椅的顺序从理发椅离开;
  • 然而,若某名理发师速度很快或某名顾客头发很少时,最先进入椅子的顾客离开时会导致一名顾客被同时撵走,理发只有一半却收取全部费用,而另一名顾客即使理完发也被限制在椅子上。

可以通过设置更多信号量解决:

  • 为每个顾客指定唯一的顾客号码,
  • 信号量 mutex1 保护对全局变量 count 的访问,使每名顾客受到唯一的号码。
  • 信号量 finished 重新定义为 50 个信号量的数组,顾客坐上理发椅后,执行 semWait (finished[custnr])等待自己唯一的信号量;理发师执行 semSignal (finished[b_cust])释放争确的顾客。
  • 顾客通过信号量 cust_ready 发信号给理发师,将号码放在队列 enqueue1 中。理发师准备理发时,dequeue1 (b_cust)从 queue1 删除最顶的顾客号码,并存放在理发师局部变量 b_cust 中:
/* program barbershop2 */
semaphore max_capacity = 20;
semaphore sofa = 4;
semaphore barber_chair = 3, coord = 3;
semaphore mutex1 = 1, mutex2 = 1;
semaphore cust_ready = 0, leave_b_chair = 0, payment = 0, receipt = 0;
semaphore finished [50] = {0};
int count;

void customer()
{
	int custnr;
	semWait(max_capacity);
	enter_shop();
	semWait(mutex1);
	custnr = count;
	count++;
	semSignal(mutex1);
	semWait(sofa);
	sit_on_sofa();
	semWait(barber_chair);
	get_up_from_sofa();
	semSignal(sofa);
	sit_in_barber_chair();
	semWait(mutex2);
	enqueue1(custnr);
	semSignal(cust_ready);
	semSignal(mutex2);
	semWait(finished[custnr]);
	leave_barber_chair();
	semSignal(leave_b_chair);
	pay();
	semSignal(payment);
	semWait(receipt);
	exit_shop();
	semSignal(max_capacity)
}

void barber()
{
	int b_cust;
	while (true)
	{
		semWait(cust_ready);
		semWait(mutex2);
		dequeue1(b_cust);
		semSignal(mutex2);
		semWait(coord);
		cut_hair();
		semSignal(coord);
		semSignal(finished[b_cust]);
		semWait(leave_b_chair);
		semSignal(barber_chair);
	}
}

void cashier()
{
	while (true)
	{
		semWait(payment);
		semWait(coord);
		accept_pay();
		semSignal(coord);
		semSignal(receipt);
	}
}

void main()
{ 
	count := 0;
	parbegin (customer, . . . 50 times, . . . customer, barber, barber, barber, cashier);
}

improve question

  1. fair solution 中,要求理发师完成理发后向顾客收款吗?理发师总是使用同一把理发椅吗?
  2. 改进 fair solution 以修复如下问题:
    • 两名或多名顾客等待付款时,收银员可能接收来自一名顾客的付款而释放另一名顾客;
    • 预想中 leave_b_chair 会阻止对同一把理发椅的多个访问。但实际上并不能在所有情形下成功:加入所有三名理发师都完成了理发并被阻塞在 semWait (leave_b_chair)上,哪位理发师被先释放?由于 leave_b_chair 队列是先进先出的,因此被阻塞的第一位理发师率先释放,但是这名理发师发送的信号吗?若不是,则新顾客就会与旧顾客重叠;
    • 程序要求一名顾客先坐在沙发上,即使理发椅是空闲的。