Tag Archive: gstreamer


During the past month I’ve been working on a new GStreamer element called qtvideosink. The purpose of this element is to allow painting video frames from GStreamer on any kind of Qt surface and on any platform supported by Qt. A “Qt surface” can be a QWidget, a QGraphicsItem in a QGraphicsView, a QDeclarativeItem in a QDeclarativeView, and even off-screen surfaces like QImage, QPixmap, QGLPixelBuffer, etc… The initial reason for working on this new element was to support GStreamer video in QML, which is something that many people have asked me about in the past. Until now, only QtMultimedia supported this, with some code in Phonon in progress as well. But of course, the main disadvantage of both QtMultimedia and Phonon is that although they support this feature with GStreamer as the backend, they don’t allow you to mix pure GStreamer code with their QML video item, so they are of no use if you need to do something more advanced using the GStreamer API directly. Hence the need for something new.

My idea with qtvideosink was to implement something that would be a standalone GStreamer element, one that would not require the developer to use a specific high-level API in order to paint video on QML. In the past I had also written another similar element, qwidgetvideosink, which is basically the same idea, but for QWidgets. After looking at the problem a bit more carefully, I realized that qwidgetvideosink and qtvideosink would in fact share a lot of their internal logic, and therefore I could probably make one element generic enough to paint both on QWidgets and on QML, and perhaps more surfaces. And so I did.

I started by taking the code of qtgst-qmlsink, a project that was started by a colleague here at Collabora last year with basically the same intention, but which was never finished properly. This project was initially based on QtMultimedia’s GStreamer backend. As a first step, I did some major refactoring to clean it up from its QtMultimedia dependencies and to make it an independent GStreamer plugin (it used to be a library). Then I merged it with qwidgetvideosink, so that they can share the common parts of the code, and also wrote a unit test for it. Sadly, the unit test proved something that I had been suspecting already: the original QtMultimedia code was quite buggy. But I must say I enjoyed fixing it. It was a good opportunity for me to learn a lot about video formats and OpenGL.

How does it work?

First of all, you can create the sink with the standard gst_element_factory_make method (or its equivalent in the various bindings). You will notice that this sink provides two signals, an action signal (a slot in Qt terminology) called “paint” and a normal signal called “update”. “update” is emitted every time the sink needs the surface to be repainted. This is meant to be connected directly to QWidget::update() or QGraphicsItem::update() or something similar. The “paint” slot takes a QPainter pointer and a rectangle (x, y, width, height as qreals) as its arguments and paints the video inside the given rectangle using the given painter. This is meant to be called from the widget’s paint event or the graphics item’s paint() function. So, all you need to do is to take care of those two signals and qtvideosink will do everything else.
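For concreteness, here is a minimal sketch of how this wiring could look from C++, assuming the signal signatures described above (the VideoSurface class name is mine, and a real implementation would need error handling):

```cpp
#include <QtGui/QWidget>
#include <QtGui/QPainter>
#include <QtCore/QMetaObject>
#include <gst/gst.h>

class VideoSurface : public QWidget
{
public:
    explicit VideoSurface(GstElement *sink, QWidget *parent = 0)
        : QWidget(parent), m_sink(sink)
    {
        // Repaint whenever the sink has a new frame to show.
        g_signal_connect(m_sink, "update", G_CALLBACK(onUpdate), this);
    }

protected:
    virtual void paintEvent(QPaintEvent *)
    {
        QPainter painter(this);
        // Ask the sink to paint the current frame inside our rectangle.
        g_signal_emit_by_name(m_sink, "paint", &painter,
                              (qreal) 0, (qreal) 0,
                              (qreal) width(), (qreal) height());
    }

private:
    static void onUpdate(GstElement *, VideoSurface *self)
    {
        // "update" may well be emitted from a GStreamer streaming thread,
        // so queue the repaint onto the GUI thread instead of calling
        // QWidget::update() directly.
        QMetaObject::invokeMethod(self, "update", Qt::QueuedConnection);
    }

    GstElement *m_sink;
};
```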

Getting OpenGL into the game

You may be wondering how this sink does the actual painting. Using QPainter, using OpenGL, or maybe something else? Well, there are actually two variants of this video sink. The first one, qtvideosink, just uses QPainter. It is able to handle only RGB data (only a subset of the formats that QImage supports) and does format conversion and scaling in software. The second one, qtglvideosink, uses OpenGL/OpenGL ES with shaders. It is able to handle both RGB and YUV formats and does format conversion and scaling in hardware. It is used in exactly the same way as qtvideosink, but it requires a QGLContext pointer to be set on its “glcontext” property before its state is set to READY. This of course means that the underlying surface must support OpenGL (i.e. it must be one of QGLWidget, QGLPixelBuffer or QGLFramebufferObject). To get this working on QGraphicsView/QML, you just need to set a QGLWidget as the viewport of the QGraphicsView and use this widget’s QGLContext in the sink.
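A hedged sketch of that setup, assuming “glcontext” is a pointer-typed property as described above:

```cpp
#include <QtGui/QGraphicsView>
#include <QtOpenGL/QGLWidget>
#include <gst/gst.h>

GstElement *createGlSink(QGraphicsView *view)
{
    // Use an OpenGL viewport so there is a QGLContext to hand to the sink.
    QGLWidget *glWidget = new QGLWidget;
    view->setViewport(glWidget); // the view takes ownership

    GstElement *sink = gst_element_factory_make("qtglvideosink", NULL);

    // The context must be set before the sink transitions to READY.
    g_object_set(sink, "glcontext", (void *) glWidget->context(), NULL);
    return sink;
}
```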

qtglvideosink uses either GLSL shaders or ARB fragment program shaders if GLSL is not supported. This means it should work on pretty much every GPU/driver combination that exists for Linux, on both desktop and embedded systems. In case no shaders are supported, it will fail to change its state to READY, and then you can just substitute it with qtvideosink, which is guaranteed to work on all platforms supported by Qt.

qtglvideosink also has an extra feature: it supports the GstColorBalance interface. Color adjustment is done in the shaders together with the format conversion. qtvideosink doesn’t support this, as it doesn’t make sense. Color adjustment would need to be implemented in software and this can be done better by plugging a videobalance element before the sink. No need to duplicate code.
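For illustration, here is a hedged sketch of driving that color adjustment through the standard GstColorBalance API (GStreamer 0.10); channel label spelling varies per element, so the case-insensitive comparison below is an assumption:

```cpp
#include <gst/gst.h>
#include <gst/interfaces/colorbalance.h>

void setContrast(GstElement *sink, gint value)
{
    GstColorBalance *balance = GST_COLOR_BALANCE(sink);
    const GList *channels = gst_color_balance_list_channels(balance);

    for (const GList *l = channels; l != NULL; l = l->next) {
        GstColorBalanceChannel *ch = GST_COLOR_BALANCE_CHANNEL(l->data);
        // Channels are typically labeled "CONTRAST", "BRIGHTNESS", etc.
        if (g_ascii_strcasecmp(ch->label, "contrast") == 0) {
            // Clamp to the channel's advertised range.
            gint v = CLAMP(value, ch->min_value, ch->max_value);
            gst_color_balance_set_value(balance, ch, v);
        }
    }
}
```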

So, which variant to use?

If you are interested in painting video on QGraphicsView/QML, then qtglvideosink is the best choice of all sinks. If for any reason the system doesn’t support OpenGL shaders, qtvideosink is the next choice. Now, if you intend to paint video on normal QWidgets, it is best to use one of the standard GStreamer sinks for your platform, unless you have a reason not to. QWidgets can be turned into native system windows by calling their winId() method, and therefore any sink that implements the GstXOverlay interface can be embedded in them. On X11 for example, xvimagesink is the best choice. However, if you need to do something more tricky and embedding another window doesn’t suit you very well, you could use qtglvideosink in a QGLWidget (preferably) or qtvideosink / qwidgetvideosink on a standard QWidget.
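That standard embedding path could look roughly like this (GStreamer 0.10 GstXOverlay API; a sketch, not production code):

```cpp
#include <QtGui/QWidget>
#include <gst/gst.h>
#include <gst/interfaces/xoverlay.h>

void embedVideo(GstElement *xvsink, QWidget *widget)
{
    // winId() forces the widget to get a native X11 window, whose id
    // is then handed to the sink for direct rendering.
    gst_x_overlay_set_xwindow_id(GST_X_OVERLAY(xvsink), widget->winId());
}
```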

Note that qwidgetvideosink is basically the same thing as qtvideosink, with the difference that it takes a QWidget pointer in its “widget” property and handles everything internally for painting on this widget. It has no signals. Other than that, it still does painting in software with QPainter, just like qtvideosink. This is just there to keep compatibility with code that may already be using it, as it already exists in QtGStreamer 0.10.1.
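In code, that difference is just this (a sketch, assuming the property takes a raw QWidget pointer):

```cpp
#include <QtGui/QWidget>
#include <gst/gst.h>

GstElement *createWidgetSink(QWidget *widget)
{
    GstElement *sink = gst_element_factory_make("qwidgetvideosink", NULL);
    // No signal wiring needed; the sink paints on the widget internally.
    g_object_set(sink, "widget", (void *) widget, NULL);
    return sink;
}
```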

This is actually 0.10 stuff… What about GStreamer 0.11/1.0?

Well, if you are interested in 0.11, you will be happy to hear that there is already a partial 0.11 port around. Two weeks ago I was at the GStreamer 1.0 hackfest in Málaga, Spain, and one of the things I did there was porting qtvideosink to 0.11. I must say the port was quite easy to do. However, last week I added some more stuff to the 0.10 version that I haven’t ported to 0.11 yet. I’ll get to that soon; it shouldn’t take long.

Try it out

The code lives in the qt-gstreamer repository. The actual video sinks are independent from the qt-gstreamer bindings, but qt-gstreamer itself has some helper classes for using them. Firstly, there is QGst::Ui::VideoWidget, a QWidget subclass which will accept qtvideosink, qtglvideosink and qwidgetvideosink just like any other video sink and will transparently do all the required work to paint the video in it. Secondly, there are QGst::Ui::GraphicsVideoWidget and QGst::Ui::GraphicsVideoSurface. Those two are meant to be used together to paint video on a QGraphicsView or QML. You can find out more about them in the documentation in graphicsvideosurface.h (this will soon be on the documentation website). Finally, there is a QtGStreamer QML plugin, which exports a “VideoItem” element if you “import QtGStreamer 0.10”. This is also documented in the GraphicsVideoSurface header. All of this will soon be released in the upcoming qt-gstreamer 0.10.2.
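As a rough usage sketch of the VideoWidget helper (the exact headers and method names here are from my reading of the bindings and may differ slightly in the release):

```cpp
#include <QtGui/QApplication>
#include <QGst/Init>
#include <QGst/ElementFactory>
#include <QGst/Ui/VideoWidget>

int main(int argc, char **argv)
{
    QApplication app(argc, argv);
    QGst::init(&argc, &argv);

    QGst::ElementPtr sink = QGst::ElementFactory::make("qtvideosink");

    QGst::Ui::VideoWidget widget;
    widget.setVideoSink(sink); // wires up the paint/update signals for us
    widget.show();

    // ... build the rest of the GStreamer pipeline and link it to `sink` ...

    return app.exec();
}
```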

QtGStreamer 0.10.1

This weekend I released QtGStreamer 0.10.1, the first stable version of QtGStreamer. This release marks the beginning of the stable 0.10 series of QtGStreamer that will continue for the lifetime of GStreamer 0.10. For those of you who don’t yet know what QtGStreamer is, it is a set of libraries that provide Qt-style C++ bindings for GStreamer, plus extra helper classes and elements for better integration of GStreamer in Qt applications.

I must say thanks a lot to Mauricio, the co-developer of QtGStreamer, who helped me a lot with the design and code, to the GStreamer community, who accepted this project under the GStreamer umbrella with great enthusiasm, to Nokia for sponsoring it, to Collabora for assigning me and Mauricio to work on it and to all those developers who are already using it in their projects and have helped us by providing feedback.

The future

Development of course does not stop here. It has just started. We will try to improve the bindings as much as we can, by exporting more and more of GStreamer’s functionality, by adding more convenience methods/classes and/or GStreamer elements that ease the use of GStreamer in Qt applications, and by collecting opinions and ideas from all of you out there who will use this API. This last bit is quite important imho, so if you have any suggestions to make about things that you don’t like or things that you would like to see implemented, please file a bug to let us know.

Use in KDE

I am quite happy to see that this library already has early adopters in KDE. Apart, of course, from my telepathy-kde-call-ui (formerly kcall), which is the “father” of QtGStreamer, it is also used in kamoso, a cheese-like camera app. Kamoso’s authors, Alex Fiestas and Aleix Pol, have been very patient waiting for me to release QtGStreamer before they release kamoso, and have also been very supportive during all this time (thanks!).

Personal thoughts

I must say this project was fun to develop. During development, I learned a lot about C++ that I didn’t know before, and I also learned how GObject works, which I must say is quite interesting, although ugly for my taste. Learning more about C++ was my main source of interest from the beginning of the project, and for some period of time I couldn’t even imagine that this project would ever get this far, but I kept coding it for myself. Obviously, I am more than happy now that it finally evolved into something that is also useful for others and has wide acceptance :)

This week I wrote some exciting (for me) code. Last weekend, while playing with GStreamer, I had this crazy idea to write GStreamer bindings for Qt. So, I started writing them for fun, outside the scope of kcall. It took me about one day to write something usable and I was really excited. Then I remembered that some days earlier, bradh on IRC had told me that it would be possible to use Solid to autodetect audio/video devices for GStreamer. Being excited with the bindings, I thought about making one library with the 1:1 GStreamer-Qt bindings and one extra library with extra stuff, like device autodetection using Solid. So, I started writing this new library as well.

I developed those two libraries for about 4 days and reached a point where they were usable for the purposes of kcall. So, I merged them into kcall and rewrote the part of kcall that handles audio/video streaming to use them. At that point, I also wrote a small telepathy-farsight Qt wrapper (libqtpfarsight), mostly to provide a sane API for it (as the original telepathy-farsight API is really bad) rather than to get rid of GObject stuff, but eventually I achieved both. So, now the core kcall code uses only Qt, the GObject ugliness is hidden in the libQtGstreamer and libqtpfarsight libraries, and I have device autodetection using Solid :D I think that was worth the effort, although it doesn’t add any significant functionality to kcall.

And to add to my excitement, there was already interest in my bindings from a guy who is writing a plasmoid that uses a webcam to take photos. He couldn’t use Phonon because Phonon has no support for video input (yet?), so he started writing it with GStreamer directly, and was therefore interested in my work, which he has already started to use. I’m really happy to see my work becoming useful for others :)

Today I spent my day debugging, trying to understand why kcall does not correctly receive video from the remote end. I still haven’t found the answer, and I’m really disappointed because everything in the code and the GStreamer logs looks perfect. :(

Sending video is not implemented yet, but with the code as it is now, it’s a matter of about 10-20 lines of code to add support for it. I will definitely do this in the following days, possibly tomorrow. I am also going to write a KCM for configuring device preferences, which is mostly done already: the extras library I mentioned above, the one that sits on top of QtGStreamer, already has a DeviceChooser widget, which can be used for selecting devices and also supports saving and loading the selected device using KConfig :D Next weekend this will hopefully all be done, and I hope I will also have solved the strange bug with receiving video.

The only thing that makes me sad now is that this week of coding essentially sent to the trash the code I wrote two weeks ago, which took me quite some time to write, but at least I know it was educational.

Update on kcall status

So, time to let you know what progress I’ve made on kcall. Unfortunately, nothing exciting has happened in the past 2 weeks. I’ve spent about a week and a half working on GStreamer device configuration. I spent lots of time reading documentation and code from Empathy and Phonon to understand how it all works, and also spent lots of time designing the code…

I chose a complex design, and I’ll explain what I mean. GStreamer provides elements that can be connected with each other to create a pipeline where data streams flow from one element to the other. Each element is designed to do a specific job. For example, one element may provide audio input, another may apply a filter to the audio that comes from the input, another may encode the audio to Vorbis, another may take audio input and provide a video visualization on its output, etc… For audio input and output, GStreamer provides several elements, mostly to support all possible backends (ALSA, OSS, JACK, PulseAudio, QuickTime on Mac, DirectShow on Windows, etc…).

The complexity starts exactly here. I needed a system where the user can configure which backend to use and additionally set properties for this backend. For example, one may want to configure an audio output device. For audio output, GStreamer provides “alsasink”, “osssink”, “jackaudiosink” and many more. I needed a widget that can list all those elements (with their proper names of course, i.e. “Alsa”, “OSS”, “Jack”) and then allow the user to configure each element’s properties. For example, “alsasink” has a “device” property, where you can set the ALSA device name where the output should go.

If I chose to hardcode every element and property and create static dialogs in Designer, the code would not be very flexible or portable. So, I chose to hardcode as little information as possible and create the dialog dynamically, based on a list of all the possible elements and their properties, and by doing some GObject introspection to learn about the types of the properties, their possible values (if the element supports probing for possible values), their default values, etc… The code ended up being very complex and I haven’t committed most of it yet; it’s waiting in a local git-svn branch.
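To give an idea of what that introspection looks like, here is a hedged sketch that lists an element’s properties with their types and descriptions (a config dialog would then map each property type to a suitable editor widget):

```cpp
#include <gst/gst.h>

// Assumes gst_init() has already been called.
static void listProperties(const char *factoryName)
{
    GstElement *element = gst_element_factory_make(factoryName, NULL);
    if (!element)
        return; // factory not installed

    guint count = 0;
    GParamSpec **specs =
        g_object_class_list_properties(G_OBJECT_GET_CLASS(element), &count);

    for (guint i = 0; i < count; ++i) {
        GParamSpec *spec = specs[i];
        g_print("%s (%s): %s\n",
                g_param_spec_get_name(spec),    // e.g. "device"
                g_type_name(spec->value_type),  // e.g. "gchararray"
                g_param_spec_get_blurb(spec));  // human-readable description
    }

    g_free(specs);
    gst_object_unref(element);
}
```

Usage would be, for example, listProperties("alsasink").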

That was the main idea. As a side effect, I also wrote some code to auto-detect which element to use, or load the preferred element from a KConfig entry along with its properties. Of course, reading settings is designed to work in cooperation with the configuration dialog, which will save them. The auto-detection is a copy-paste from the Phonon GStreamer backend. It’s not perfect, but it has a nice logic that should work for 99.9% of users. The tricky thing about this auto-detection is that it works better for GNOME users, as GNOME has the “gconfaudiosrc” and “gconfaudiosink” GStreamer elements that internally load the correct element and device based on GNOME settings, and these elements also support application categories like Phonon does (i.e. audio player, chat, notification, etc…). I wish we had such elements for KDE as well… Actually, I wish GStreamer were truly cross-desktop and cross-platform, so that it would be easier for me to use, without having to invent all this trickery and without feeling guilty about using GNOME stuff. GStreamer is a really cool framework in my opinion, so it’s a shame that it’s tied so closely to GNOME. :(
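The fallback part of that auto-detection boils down to something like this sketch (the candidate ranking here is illustrative, not the exact Phonon list):

```cpp
#include <gst/gst.h>

// Try the desktop-integrated element first, then common backends.
// Assumes gst_init() has already been called.
static GstElement *makeAudioSink()
{
    const char *candidates[] =
        { "gconfaudiosink", "pulsesink", "alsasink", "osssink" };

    for (guint i = 0; i < G_N_ELEMENTS(candidates); ++i) {
        // A real implementation would also try bringing the element to
        // READY, to verify that the backend actually works.
        GstElement *sink = gst_element_factory_make(candidates[i], NULL);
        if (sink)
            return sink; // first backend that is actually installed wins
    }
    return NULL;
}
```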

Anyway, this work left me a bit behind schedule, I think, so I am setting it aside to work on more important stuff. Today, I worked on the call window UI. What I have now is this:

[Screenshot: the current call window UI]

The participants dock is not shown by default, as it’s not very useful in two-person calls, but I added it because in the future kcall will probably support conferences between many people. Of the tabs, the dialpad is also implemented and supposed to work, but it’s not enabled here because I am doing a call over Jabber/Jingle, which doesn’t support it (and doesn’t need it, of course).

Starting tomorrow, I plan to experiment with video support. The idea is to have a small widget above those tabs showing my local camera, and a big one on the left showing the other person in two-person calls. For multi-person calls I will probably use separate windows for each participant, but I am not sure about it yet. Ideas and suggestions are always welcome. :)

GSoC Week #4

I skipped a week of blogging, mostly because I was busy last weekend, but now I think it’s time to report my status on kcall again…

Last week I spent about 3 days studying GStreamer, and I ended up creating a media handler class using telepathy-farsight and GStreamer, which is able to handle audio calls without problems. The only bug I have there is that the microphone volume control does not work correctly, but I hope I will solve this some time (it’s not urgent anyway). The code is heavily based on andrunko’s telepathy-qt4 media branch, a branch of the telepathy-qt4 library that includes a high-level API for handling all this farsight/GStreamer stuff; but as this branch is not ready yet, and as I will probably need more control over GStreamer than this media API gives me, I just copied and adapted the code to work in kcall. The only part I don’t like about this GStreamer stuff is that its dependencies are HUGE. For example, I have to depend on libxml2 and telepathy-glib just because some of the headers I include pull in headers from those libraries… Totally unacceptable, imho. Actually, a big part of my work here was creating correct CMake scripts that can find and use all those dependencies…

OK, so after making the media handler, I split the part that handles calls into a separate executable, implementing the Telepathy Client.Handler interface. I merged into this executable the KPart I had created, since after reading the Telepathy spec about the channel dispatcher, I realized that there is no need to have a KPart. A separate handler process is enough to be reusable by any other program. If another program (for example, Kopete) wants to start a media call, it can just request a media channel from the channel dispatcher, and the channel dispatcher will automatically launch a handler for media channels, such as this kcall handler. Apart from that, I also created a system tray icon (using the new KNotificationItem API) and an approver class, which shows a popup message (using KNotify) when there is an incoming call and allows the user to accept or reject it.

This week I had an exam on Wednesday, which kept me from working on kcall for a bit. In the time that was left, I started improving the call window. I added a dock widget with volume controls and a timer showing the call duration, and I also fixed some internal stuff to report correct status to the user and accept incoming calls correctly.

The next thing to do now is to improve the UI of the call window, so that I can add the video widgets there and play with video support. I will also need to find some software and protocol that will allow me to test video calls easily. I tried connecting to ekiga.net over SIP yesterday to use its handy 500@ekiga.net echo-test service, but it seems that telepathy-sofiasip has trouble connecting to ekiga.net.

Btw, if any of you out there would like to help me design a good UI, I would love to hear some ideas and/or see mockups of how the call window UI should look, as I’m really bad at designing GUIs on my own :P The basic idea is that I need some widgets to show video in the middle, plus a list of the participants of the call, plus volume controls for mic & speakers, plus a dial pad… I’m currently thinking of putting all the optional stuff (participants list, volume controls, dial pad) in dock widgets and putting two video widgets in the middle (one for the remote contact and one for myself)… but now that I think about it again, the problem is that *theoretically* a call can have many participants, so just two video widgets may not be enough. And on the other hand, what should be displayed for audio-only calls? I think you get the picture, so I would love some ideas here :)

GSoC Week #2

This week went a bit off plan. I didn’t work much on kcall, as I was busy with other things. On Tuesday I had two exams (fortunately, quite easy ones), which kept me busy for both Monday and Tuesday. Then, from Wednesday, I started packaging KDE 4.3 beta2 for Debian, which was quite challenging and kept me busy for 3 days (Wednesday-Friday). I packaged only the basics (kdelibs, kdepimlibs, kdebase-runtime, kdebase-workspace and some kdesupport dependencies), and of course they are not of release quality yet (so don’t expect 4.3 beta2 packages in Debian).

In the meantime, despite being busy with other stuff, I took some time to study the “call example” from the TelepathyQt4 examples a bit more; it is essentially a simple version of what I am developing. I also wrote some code for a “call window”, an object that will eventually be able to handle a call and display a nice window with status info, the video widget, audio/video controls, etc… Yesterday (Saturday), I polished the API of this object a bit and implemented some really basic functionality. While I was looking at the code, I thought it may be better to develop this window as a KPart, which would make it possible to reuse it later in other projects, like Kopete for example (when it is ported to Telepathy, if that ever happens). So, late yesterday afternoon, I ported this window to use KParts. However, a linker issue (Telepathy bug 21340) stopped me from finishing it. Today I managed to fix this issue and I am now working on finishing the KPart. Unfortunately I don’t have much time to work on it today, but I promise it will be ready by late tonight or tomorrow morning.

Now, the next step is to implement an object that will do the encoding/decoding of the audio/video. As an exception to the general design of Telepathy, audio/video handling is specified to be done by the application itself and not by the connection manager that connects to the protocol. To handle this, the Telepathy developers have designed a library called telepathy-farsight, which internally communicates with the connection manager and handles the audio/video streaming part. To do the actual encoding/decoding, GStreamer must be used. GStreamer is a library that resembles Phonon a lot; it uses a similar pipelined architecture. From what I understand, telepathy-farsight provides a GStreamer source and a sink, which can be connected to other GStreamer elements that do the encoding/decoding, grab input from the mic or camera, output to ALSA or to some video widget, etc…

Unfortunately, farsight and GStreamer are the only way to go here. This is how the Telepathy specification is designed, and while I bet it would be possible to write something similar to farsight that does the same job using Qt and Phonon, that is too much work, and if it ever happens, it will take a few years. So, I will have to spend this week learning the GLib/GObject and GStreamer basics, so that I will be able to write this part of kcall. The “call example” I mentioned earlier provides a sample implementation of this object, and although I could just copy it, I need to understand what it does so that I will be able to extend it.
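To make the pipeline idea concrete, here is a hedged, minimal GStreamer 0.10 example, unrelated to Telepathy, that links a test audio source to an automatically chosen output:

```cpp
#include <gst/gst.h>
#include <glib.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    // Elements are created by name and linked into a pipeline, much
    // like Phonon's MediaObject -> AudioOutput paths.
    GstElement *pipeline = gst_pipeline_new("audio-test");
    GstElement *src  = gst_element_factory_make("audiotestsrc", NULL);
    GstElement *sink = gst_element_factory_make("autoaudiosink", NULL);

    gst_bin_add_many(GST_BIN(pipeline), src, sink, NULL);
    gst_element_link(src, sink);

    gst_element_set_state(pipeline, GST_STATE_PLAYING);
    g_usleep(3 * G_USEC_PER_SEC); // play the test tone for a few seconds

    gst_element_set_state(pipeline, GST_STATE_NULL);
    gst_object_unref(pipeline);
    return 0;
}
```

In kcall’s case, the telepathy-farsight source and sink would be linked into a pipeline like this in place of the test elements.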

I hope this week I will work more and will manage to make a simple call :) My current plan is to have audio/video fully working (with controls and options) before July 6th (the middle of the GSoC period), so that I can spend the rest of the period on UI/usability improvements and implementing secondary features that may be needed. (Notice: the author of this post has the authority to change this plan without prior notice! :P)
