[libcamera-devel] [PATCH 13/13] gstreamer: Fix race conditions in task pause/resume

Nicolas Dufresne nicolas.dufresne at collabora.com
Thu Jun 30 21:32:05 CEST 2022


Le jeudi 30 juin 2022 à 02:50 +0300, Laurent Pinchart a écrit :
> > This dance is difficult to follow. Perhaps we could start the run function
> > by
> > "pausing" the task? Then we will resume the task if state->processRequest()
> > ==
> > 0, or concurrently if a request completed. Using a doResume bool instead of
> > doPause.
> 
> I've given it a try, and the result is as follows:
> 
> diff --git a/src/gstreamer/gstlibcamerasrc.cpp
> b/src/gstreamer/gstlibcamerasrc.cpp
> index 6ee315cf5efe..324ea457eb4f 100644
> --- a/src/gstreamer/gstlibcamerasrc.cpp
> +++ b/src/gstreamer/gstlibcamerasrc.cpp
> @@ -411,7 +411,9 @@ gst_libcamera_src_task_run(gpointer user_data)
>  		state->wakeup_ = false;
>  	}
>  
> -	bool doPause = true;
> +	bool doResume = false;
> +
> +	gst_task_pause(self->task);
>  
>  	/*
>  	 * Create and queue one request. If no buffers are available the
> @@ -426,7 +428,7 @@ gst_libcamera_src_task_run(gpointer user_data)
>  		 * buffers to create a new one. Don't pause the task to give
> it
>  		 * another try.
>  		 */
> -		doPause = false;
> +		doResume = true;
>  		break;
>  
>  	case -ENOMEM:
> @@ -449,11 +451,8 @@ gst_libcamera_src_task_run(gpointer user_data)
>  	ret = state->processRequest();
>  	switch (ret) {
>  	case 0:
> -		/*
> -		 * Another completed request is available, don't pause the
> -		 * task.
> -		 */
> -		doPause = false;
> +		/* Another completed request is available, resume the task.
> */
> +		doResume = true;
>  		break;
>  
>  	case -EPIPE:
> @@ -466,13 +466,13 @@ gst_libcamera_src_task_run(gpointer user_data)
>  	}
>  
>  	/*
> -	 * Here we need to decide if we want to pause. This needs to happen
> in
> +	 * Here we need to decide if we want to resume. This needs to happen
> in
>  	 * lock step with the requestCompleted callback and the buffer-notify
>  	 * signal handler that resume the task.
>  	 */
> -	if (doPause) {
> +	{
>  		MutexLocker locker(state->lock_);
> -		if (!state->wakeup_)
> -			gst_task_pause(self->task);
> +		if (doResume || state->wakeup_)
> +			gst_task_resume(self->task);
>  	}
>  }
> 
> I'm not sure that's what you meant, as it doesn't seem much easier to
> follow.
> 
> With the mechanism from v1 (and the two fixes mentioned above), the task
> runs ~80 times / second, is resumed ~55 times / second, and paused ~26
> times / second. With the opposite logic, we get ~80 pause/second (that's
> expected, as it's paused at every run) and ~107 resume/second. That
> seems less efficient to me.
> 
> I'll use the original mechanism with the bug fixes in v2, if you meant
> something else than the above, you can explain it in a reply to the new
> patch.

You don't need state->wakeup_ if you use the suggested resume technique, making
the fix less invasive. The reason is run() function is the only function that
may pause the task. Getting an external caller to resume will just lead to run()
to pause the task again. So you don't have to track what is happening in the
outside world. Other threads just resume the task, and the run() function
assumes it will be paused, and then resume() itself as needed.

Nicolas



More information about the libcamera-devel mailing list