[libcamera-devel] Issue allocating many frame buffers on RasPi

Tue Jul 5 10:07:03 CEST 2022

Hi Alan

Some of the Picamera2 examples might interest you. For example this one:
https://github.com/raspberrypi/picamera2/blob/main/examples/capture_circular_stream.py

It's probably a bit more complicated than you need. It sends the video
stream over the network all the time (for monitoring), but will
simultaneously record the h.264 stream to a file whenever it detects motion.
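
In case it helps, here is a minimal C++ sketch of the circular-buffer
idea on its own. It is not taken from the Picamera2 example, and the
names FrameRef, FrameRing and yuv420Bytes are made up for illustration.
A real version would hold FrameBuffer pointers or dmabuf fds rather
than sequence numbers, and the size helper assumes tightly packed
YUV420 with no stride padding, which may not match what the ISP
actually produces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a captured frame. Real code would keep a
// FrameBuffer pointer or dmabuf fd here instead of a sequence number.
struct FrameRef {
    uint64_t sequence;
};

// Bytes for one tightly packed YUV420 frame (assumes no stride padding).
constexpr std::size_t yuv420Bytes(std::size_t width, std::size_t height)
{
    return width * height * 3 / 2;
}

// Fixed-capacity ring that keeps only the most recent frames.
class FrameRing {
public:
    explicit FrameRing(std::size_t capacity) : slots_(capacity)
    {
        assert(capacity > 0);
    }

    // Store a frame, overwriting the oldest one once the ring is full.
    void push(FrameRef frame)
    {
        slots_[head_] = frame;
        head_ = (head_ + 1) % slots_.size();
        if (count_ < slots_.size())
            ++count_;
    }

    // Return the buffered frames oldest-first, e.g. to hand them to an
    // encoder once motion has been detected.
    std::vector<FrameRef> snapshot() const
    {
        std::vector<FrameRef> out;
        out.reserve(count_);
        std::size_t start = (head_ + slots_.size() - count_) % slots_.size();
        for (std::size_t i = 0; i < count_; ++i)
            out.push_back(slots_[(start + i) % slots_.size()]);
        return out;
    }

    std::size_t size() const { return count_; }

private:
    std::vector<FrameRef> slots_;
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

With those assumptions, yuv420Bytes(1640, 922) is 2268120 bytes, so a
60-frame ring (two seconds at 30 fps) of uncompressed frames would need
roughly 136 MB. That is one reason the Picamera2 example buffers the
encoded h.264 stream rather than raw frames.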

David

On Tue, 5 Jul 2022 at 08:56, Jacopo Mondi via libcamera-devel <
libcamera-devel at lists.libcamera.org> wrote:

> Hi Alan,
>
> On Mon, Jul 04, 2022 at 03:16:22PM -0400, Alan W Szlosek Jr wrote:
> > Thanks Naush and Jacopo,
> >
> > I'm trying to allocate lots of buffers to avoid unnecessary data
> > copying. And I'm hoping to support high frame rates of 30 fps as well
> > so I'll ultimately need to allocate around 60 frame buffers. Basically
> > I plan to have a circular buffer of frame buffers, spanning at least a
> > second or two. On slower machines (like Pi Zeros) I'll likely run a
> > motion detection algorithm every second. On RasPis with more cores
> > I'll probably run it 3 times a second. When motion is detected I'll
> > transcode the frames to h264 using the hardware encoder and save to a
> > file. Hopefully that explains why I want to allocate so many buffers
> > ahead of time.
>
> So basically you want to buffer 1-2 seconds of video streaming to
> periodically run an algorithm on it.
>
> I would start wondering if your algorithm and your use case requires
> a picture every 33ms (or 16ms for the 60FPS use case), but I have no
> idea what kind of motion you're tracking, so this might be perfectly legit.
>
> Second, at the risk of saying something imprecise as I don't know
> the platform so well, you should consider whether allocating so much
> memory in the video device is a good idea.
>
> Buffers are generally allocated by V4L2 drivers using the videobuf2
> contiguous allocator, which can use the CMA allocator as backend. CMA
> has a configurable number of zones which are reserved by the kernel
> for the purpose of allocating chunks of contiguous memory from there.
> The CMA area size is configurable via a Kconfig option or a kernel
> parameter but it's generally limited. I have no idea what the size is on
> RPi to be honest, nor how much it can be increased.
>
> I would explore instead the idea of allocating buffers via a different
> allocator and importing them into the video device, specifically by using
> the dmabuf-heaps allocator which, as far as I know, allows you to reserve
> a chunk of physical memory for the purpose. One could argue that CMA
> does the same, but by doing this you would have full control over the
> memory area you're using for your buffer pool and will not be
> contending for it with other system components.
>
> >
> > Jacopo, to answer your question about dmesg output .... No, there's
> > nothing in dmesg, see the following:
>
> Enabling CONFIG_CMA_DEBUG in your kernel config might help, even
> if I would expect relevant messages like "you're out of memory" to be
> there. Can you double check with 'dmesg | grep -i cma' ?
>
> >
> > [   17.904193] NET: Registered PF_BLUETOOTH protocol family
> > [   17.904202] Bluetooth: HCI device and connection manager initialized
> > [   17.904227] Bluetooth: HCI socket layer initialized
> > [   17.904240] Bluetooth: L2CAP socket layer initialized
> > [   17.904263] Bluetooth: SCO socket layer initialized
> > [   17.923327] Bluetooth: HCI UART driver ver 2.3
> > [   17.923355] Bluetooth: HCI UART protocol H4 registered
> > [   17.923440] Bluetooth: HCI UART protocol Three-wire (H5) registered
> > [   17.923694] Bluetooth: HCI UART protocol Broadcom registered
> > [   18.541155] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> > [   18.541179] Bluetooth: BNEP filters: protocol multicast
> > [   18.541200] Bluetooth: BNEP socket layer initialized
> > [   18.561456] NET: Registered PF_ALG protocol family
> > [   20.449306] vc4-drm soc:gpu: [drm] Cannot find any crtc or sizes
> > [   33.761159] cam-dummy-reg: disabling
> >
> > And Naush, when you say CMA do you mean the Contiguous Memory
> > Allocator? Does this mean that when I ask for 20 buffers it's trying
> > to allocate 1 contiguous block of memory behind the scenes, resulting
> > in 1 dmabuf file descriptor? If so, it sounds like I should somehow
> > ask for smaller, separate dmabuf blocks to cover what I need. What do
> > you think? Is that easily doable?
>
> I presume the C in CMA means that the area from which memory pages are
> allocated is contiguous, but if you ask for 20 buffers you should
> get 20 dmabuf identifiers.
>
> >
> > Thanks to you both for your help.
> >
> > On Mon, Jul 4, 2022 at 4:40 AM Naushir Patuck <naush at raspberrypi.com>
> > wrote:
> > >
> > > Hi Alan,
> > >
> > > On Mon, 4 Jul 2022 at 09:31, Jacopo Mondi via libcamera-devel <
> > > libcamera-devel at lists.libcamera.org> wrote:
> > >>
> > >> Hi Alan,
> > >>
> > >> On Sat, Jul 02, 2022 at 07:48:48AM -0400, Alan W Szlosek Jr via
> > >> libcamera-devel wrote:
> > >> > Hi libcamera, I'm creating a security camera app for RasPis and I'm
> > >> > having trouble allocating 20 frame buffers (would like to alloc even
> > >> > more). Do you know why? Do you have suggestions? I'm currently
> > >> > testing on a Raspberry Pi 3B+.
> > >> >
> > >>
> > >> Can I ask why you need to allocate that many buffers in the video
> > >> device ?
> > >
> > >
> > > Snap.  I was going to ask the same question.  All frame buffers are
> > > allocated out of CMA space.  20 x 2MP YUV420 buffers is approx
> > > 60 MBytes only for a single set of buffers.  Typically, you ought to
> > > get away with < 10 buffers for most video use cases.
> > >
> > > Naush
> > >
> > >
> > >>
> > >>
> > >> > This is the output I'm getting. The return value from allocate()
> > >> > seems to imply that everything is fine ("Allocated 20 buffers for
> > >> > stream") when it's not fine behind the scenes.
> > >> >
> > >> > [1:23:50.594602178] [1217]  INFO Camera camera_manager.cpp:293
> > >> > libcamera v0.0.0+3544-22656360
> > >> > [1:23:50.657034054] [1218]  WARN RPI raspberrypi.cpp:1241 Mismatch
> > >> > between Unicam and CamHelper for embedded data usage!
> > >> > [1:23:50.659149325] [1218]  INFO RPI raspberrypi.cpp:1356 Registered
> > >> > camera /base/soc/i2c0mux/i2c@1/imx219@10 to Unicam device /dev/media3
> > >> > and ISP device /dev/media0
> > >> > [1:23:50.660510009] [1217]  INFO Camera camera.cpp:1029 configuring
> > >> > streams: (0) 1640x922-YUV420
> > >> > [1:23:50.661246471] [1218]  INFO RPI raspberrypi.cpp:760 Sensor:
> > >> > /base/soc/i2c0mux/i2c@1/imx219@10 - Selected sensor format:
> > >> > 1920x1080-SBGGR10_1X10 - Selected unicam format: 1920x1080-pBAA
> > >> > Allocated 20 buffers for stream
> > >> > [1:23:50.733980221] [1218] ERROR V4L2 v4l2_videodevice.cpp:1218
> > >> > /dev/video14[14:cap]: Not enough buffers provided by V4L2VideoDevice
> > >> > [1:23:50.734467203] [1218] ERROR RPI raspberrypi.cpp:1008 Failed to
> > >> > allocate buffers
> > >>
> > >> This seems to happen when the pipeline starts and tries to allocate
> > >> buffers for its internal usage. Might it be you simply run out of
> > >> available memory ?
> > >>
> > >> Is there anything on your dmesg output that might suggest that, like a
> > >> message from your CMA allocator ?
> > >>
> > >> Can you try allocating an increasing number of buffers until you
> > >> reach the failure limit ?
> > >>
> > >> > [1:23:50.739962387] [1217] ERROR Camera camera.cpp:528 Camera in
> > >> > Configured state trying queueRequest() requiring state Running
> > >> > [1:23:50.740078898] [1217] ERROR Camera camera.cpp:528 Camera in
> > >> > Configured state trying queueRequest() requiring state Running
> > >> >
> > >> > Here's how I'm compiling:
> > >> >
> > >> > clang++ -g -std=c++17 -o scaffold \
> > >> >     -I /usr/include/libcamera \
> > >> >     -L /usr/lib/aarch64-linux-gnu \
> > >> >     -l camera -l camera-base \
> > >> >     scaffold.cpp
> > >> >
> > >> > And here's the code I'm using. Thank you!
> > >> >
> > >> > #include <iomanip>
> > >> > #include <iostream>
> > >> > #include <memory>
> > >> > #include <thread>
> > >> >
> > >> > #include <libcamera/libcamera.h>
> > >> >
> > >> > using namespace libcamera;
> > >> >
> > >> > static std::shared_ptr<Camera> camera;
> > >> >
> > >> > time_t previousSeconds = 0;
> > >> > int frames = 0;
> > >> > static void requestComplete(Request *request)
> > >> > {
> > >> >     std::unique_ptr<Request> request2;
> > >> >     if (request->status() == Request::RequestCancelled)
> > >> >         return;
> > >> >     const std::map<const Stream *, FrameBuffer *> &buffers =
> > >> >         request->buffers();
> > >> >
> > >> >     request->reuse(Request::ReuseBuffers);
> > >> >     camera->queueRequest(request);
> > >> >
> > >> >     struct timespec delta;
> > >> >     clock_gettime(CLOCK_REALTIME, &delta);
> > >> >     if (previousSeconds == delta.tv_sec) {
> > >> >         frames++;
> > >> >     } else {
> > >> >         fprintf(stdout, "Frames: %d\n", frames);
> > >> >         frames = 1;
> > >> >         previousSeconds = delta.tv_sec;
> > >> >     }
> > >> > }
> > >> >
> > >> > int main()
> > >> > {
> > >> >     std::unique_ptr<CameraManager> cm =
> > >> >         std::make_unique<CameraManager>();
> > >> >     cm->start();
> > >> >
> > >> >     if (cm->cameras().empty()) {
> > >> >        std::cout << "No cameras were identified on the system."
> > >> >                  << std::endl;
> > >> >        cm->stop();
> > >> >        return EXIT_FAILURE;
> > >> >     }
> > >> >
> > >> >     std::string cameraId = cm->cameras()[0]->id();
> > >> >     camera = cm->get(cameraId);
> > >> >
> > >> >     camera->acquire();
> > >> >
> > >> >     // VideoRecording
> > >> >     std::unique_ptr<CameraConfiguration> config =
> > >> > camera->generateConfiguration( { StreamRole::VideoRecording } );
> > >> >     StreamConfiguration &streamConfig = config->at(0);
> > >> >     streamConfig.size.width = 1640; //640;
> > >> >     streamConfig.size.height = 922; //480;
> > >> >     // This seems to default to 4, but we want to queue buffers for
> > >> >     // post processing, so we need to raise it.
> > >> >     // 10 works ... oddly, but 20 fails behind the scenes. doesn't
> > >> >     // appear to be an error we can catch
> > >> >     streamConfig.bufferCount = 20;
> > >> >
> > >> >     // TODO: check return value of this
> > >> >     CameraConfiguration::Status status = config->validate();
> > >> >     if (status == CameraConfiguration::Invalid) {
> > >> >         fprintf(stderr, "Camera Configuration is invalid\n");
> > >> >     } else if (status == CameraConfiguration::Adjusted) {
> > >> >         fprintf(stderr, "Camera Configuration was invalid and has "
> > >> >                         "been adjusted\n");
> > >> >     }
> > >> >
> > >> >     camera->configure(config.get());
> > >> >
> > >> >     FrameBufferAllocator *allocator =
> > >> >         new FrameBufferAllocator(camera);
> > >> >
> > >> >     for (StreamConfiguration &cfg : *config) {
> > >> >         // TODO: it's possible we'll need our own allocator for raspi,
> > >> >         // so we can enqueue many frames for processing
> > >> >         int ret = allocator->allocate(cfg.stream());
> > >> >         // This error handling doesn't catch a failure to allocate
> > >> >         // 20 buffers
> > >> >         if (ret < 0) {
> > >> >             std::cerr << "Can't allocate buffers" << std::endl;
> > >> >             return -ENOMEM;
> > >> >         }
> > >> >
> > >> >         size_t allocated = allocator->buffers(cfg.stream()).size();
> > >> >         std::cout << "Allocated " << allocated
> > >> >                   << " buffers for stream" << std::endl;
> > >> >     }
> > >> >
> > >> >
> > >> >     Stream *stream = streamConfig.stream();
> > >> >     const std::vector<std::unique_ptr<FrameBuffer>> &buffers =
> > >> > allocator->buffers(stream);
> > >> >     std::vector<std::unique_ptr<Request>> requests;
> > >> >
> > >> >     for (unsigned int i = 0; i < buffers.size(); ++i) {
> > >> >         std::unique_ptr<Request> request = camera->createRequest();
> > >> >         if (!request)
> > >> >         {
> > >> >             std::cerr << "Can't create request" << std::endl;
> > >> >             return -ENOMEM;
> > >> >         }
> > >> >
> > >> >         const std::unique_ptr<FrameBuffer> &buffer = buffers[i];
> > >> >         int ret = request->addBuffer(stream, buffer.get());
> > >> >         if (ret < 0)
> > >> >         {
> > >> >             std::cerr << "Can't set buffer for request"
> > >> >                     << std::endl;
> > >> >             return ret;
> > >> >         }
> > >> >
> > >> >         requests.push_back(std::move(request));
> > >> >     }
> > >> >
> > >> >     camera->requestCompleted.connect(requestComplete);
> > >> >
> > >> >     // sets fps (via frame duration limits)
> > >> >     // TODO: create ControlList and move to global var
> > >> >     // TODO: is there a raspi-specific implementation of this?
> > >> >     libcamera::ControlList controls(libcamera::controls::controls);
> > >> >     int framerate = 30;
> > >> >     int64_t frame_time = 1000000 / framerate; // in microseconds
> > >> >     controls.set(libcamera::controls::FrameDurationLimits, {
> > >> > frame_time, frame_time });
> > >> >
> > >> >     camera->start(&controls);
> > >> >     for (auto &request : requests)
> > >> >        camera->queueRequest(request.get());
> > >> >
> > >> >     //60 * 60 * 24 * 7; // days
> > >> >     int duration = 10;
> > >> >
> > >> >     for (int i = 0; i < duration; i++) {
> > >> >         std::cout << "Sleeping" << std::endl;
> > >> >         std::this_thread::sleep_for(std::chrono::milliseconds(1000));
> > >> >     }
> > >> >
> > >> >
> > >> >     return 0;
> > >> > }
> > >> >
> > >> > --
> > >> > Alan Szlosek
> >
> >
> >
> > --
> > Alan Szlosek
>