Tag Archive: gsoc

Video calls in KDE-Telepathy

Well, I think I owed you this one ;) Remember back in 2009 when I was working on KCall as part of the GSoC program? Well, it may have taken 2.5 years more, but I’m now pleased to announce that it’s finally in a ready-to-use state \o/ Don’t expect it to be perfect, of course. It still has a long way to go.

Here is the obligatory screenshot. Me on my desktop, calling myself on my laptop :)

The KDE-Telepathy call-ui in action

A little bit of history

When my GSoC finished in 2009, there were two main problems with KCall. The first was that the part of the telepathy specification for doing calls (i.e. the “StreamedMedia” channel type) was problematic. On top of that, the API of the telepathy-farsight library, which was the only way to use StreamedMedia, was also weird, and it took me too many tries to finally understand it (in late 2010…). In simple words, KCall was very unstable because it used the API in the wrong way (if there really was a right way to use it…). The second problem was that there was no telepathy integration in the KDE desktop, so KCall would have needed its own contact list, account manager and other stuff that it shouldn’t have to implement.

In late 2010, the KDE-Telepathy project started evolving and we finally managed to make a first release last summer with the necessary components to use telepathy on the KDE desktop. At about the same time, work began on a new API for doing calls in telepathy, the so-called “Call” channel type, plus telepathy-farstream, the new and enhanced version of telepathy-farsight. It took a little longer than expected, but finally a few weeks ago, thanks to the awesome work of my colleagues at Collabora who engineered the whole thing, the “Call” API and telepathy-farstream were finished and released. Fortunately, last year I had already worked on porting the call-ui to the draft Call API, using the draft telepathy-qt Call bindings that used to be in the telepathy-qt4-yell module. So, now I only had to update the telepathy-qt bindings to the latest and greatest API specification and then do the same with the call-ui, plus fix up the UI a bit, as it was way too ugly. And so I did.

The present and the future

The UI is far from perfect at the moment, but the engine seems to work reliably. I have many additions and improvements in mind. However, since I suck at UI design, I’d love to have mockups of ideas from people who can actually design UIs. And I’d also love to have other people implement those ideas, since I’m a lazy man… :P (ok, I don’t really mean that). So, if you feel like helping (either way), this is your chance to get involved ;)

The current UI will be included in the next KDE-Telepathy release, 0.4, which is scheduled for next month. Be prepared.

Try it

So, if you can’t wait for the next KDE-Telepathy release and want to try this now, what you need is the latest ktp-call-ui from git master with all of its dependencies. To make a call, simply right click one of your contacts in the contact list and click “audio call” or “video call”. Alternatively, you can do this directly from the text-ui or the contact plasmoid. Note that older versions of those components also have audio/video call buttons, but they will try to start StreamedMedia calls instead, which will fail. Also note that calls require XMPP (jabber, google talk) at the moment, but SIP support is also on its way upstream.

On Monday, GSoC is officially over, so I thought I should make a post describing what I accomplished, what I didn’t and what’s the current status of my project, KCall.

Currently, KCall supports audio/video calls over Jabber/GTalk quite well. The fact that it supports only jabber, though, is not my fault: no other connection manager apart from telepathy-gabble (the connection manager for jabber) supports calls that well. In fact, there are only two connection managers that support calls, as far as I know: telepathy-gabble (for jabber) and telepathy-sofiasip (for sip). Unfortunately sofiasip does not (yet) support certain features that are needed by KCall, so although it may work, there are things that you can’t do, like for example calling other people, which is a crucial feature. So, with sofiasip you can only receive calls and sometimes even this doesn’t work as expected. Streams may not get connected, you may have only one-directional audio and things like that… Don’t ask me why, I have no idea.

If you try KCall, you may notice that the quality of the UI and the general behavior of the application are not as good as they could be. There are two issues that prevented me from improving it further:

  1. KCall is a telepathy client that needs to comply with the StreamedMediaChannel specification. However, this specification is not well-defined and there are many problems with it. For many things to work, some assumptions that are not part of the specification have to be made and these assumptions make it extremely difficult to implement advanced features like multi-user conference. In fact I had to assume that the channel can only have 2 participants and implement hacks on top of this… Very ugly, but required. And these assumptions limit the behavior and functionality as well. The problem though lies in the specification, so other clients and connection managers also have the same problem. Empathy works in a similar way. If you take a close look, KCall resembles empathy a lot in functionality and behavior; this is the reason why. Hopefully, the specification will be fixed and I will rewrite parts of KCall to fix it and improve it.
  2. The contact list currently looks awful. The reason I did not improve it is that in the near future the way of managing the contact list in telepathy will change. It is planned to use nepomuk to store relations between contacts, so that we can have metacontacts like in kopete and possibly associate them with the kaddressbook contacts as well. So, if I try to improve the contact list now, it will just be wasted effort, since I will have to rewrite it anyway.

So, from my point of view, KCall is as good as it can be. Sorry that it can’t be better yet.

If you want to try KCall, you can check it out from svn (svn://anonsvn.kde.org/home/kde/trunk/playground/network/kcall) and follow the instructions in the README file that I wrote a few days ago. Note that it still has some important bugs and I don’t think it is ready for general purpose usage; however, I would like it to get some testing. Unfortunately, it has a lot of ugly dependencies. :( Most of them can be found in packages, except perhaps telepathy-qt4, which is not a stable library yet. You can find it here (check the URL field there to get the git url). This is the master branch, which builds with autotools, but there is also a cmake branch in this repo (just note that the cmake branch has to be built with -j1 and it may not always be up-to-date). I won’t go into much detail, I expect you to know how to use git already :) Also note that some distributions may be shipping a library called telepathy-qt; this may be the (too) old telepathy-qt library from kdesupport, which has since been completely rewritten.

PS: If you are a debian/ubuntu user familiar with debian packaging, you might be interested in this repo.

PS2: I may make an unofficial debian/ubuntu kcall package later this week… I’ll think about it…

PS3: I will write this, but don’t flame me… Just FYI: Storing passwords in the telepathy account manager is not safe, as the passwords are later exported on dbus. I would recommend that you use a dummy password for your jabber account, or even make a new testing account with a dummy password, or at least don’t use the same password for your root and/or regular system users and for your jabber account. And kill mission-control-5 when you don’t need it. Also note that the console log of kcall prints the password in plain text somewhere… (telepathy-qt4 is to blame…). So, don’t put the log in pastebins without removing the password! I am going to email the right people about this, as it is totally unacceptable imho. Just watch out until it is fixed.
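For the pastebin case, a small script can scrub the log before it is shared. This is only a sketch in Python, not part of kcall or telepathy-qt4: the `redact` function and the sample log line are made up for illustration. Since the exact log format is not assumed, it simply replaces every literal occurrence of the secrets you pass in.

```python
from typing import Iterable


def redact(log_text: str, secrets: Iterable[str], placeholder: str = "[REDACTED]") -> str:
    """Replace every literal occurrence of each secret in log_text."""
    # Longest secrets first, so a secret that contains another one is
    # replaced as a whole instead of being partially rewritten.
    for secret in sorted(set(secrets), key=len, reverse=True):
        if secret:  # skip empty strings, which would match everywhere
            log_text = log_text.replace(secret, placeholder)
    return log_text


# Hypothetical log line; real kcall output will look different.
sample = "connecting account with password=hunter2"
print(redact(sample, ["hunter2"]))
```

Passing the password explicitly, rather than guessing at a pattern such as "password=…", means the scrubbing still works wherever in the log the password happens to show up.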

This week I implemented complete webcam support in kcall. Both video input and output are working :D Screenshot:

Screenshot of kcall in an audio/video session

On the left side you can see the incoming video from the remote end, which in this case is my laptop, capturing myself through its webcam, and on the right side you can see the video that is being sent to the other end. Here, because I don’t have a webcam on my desktop computer as well, I am using gstreamer’s “videotestsrc” element as a video input, just for testing. Also, on the bottom right you can see the video controls for the video input (which is shown above them). For some reason, the video coming from my laptop has wrong colors there (my shirt is actually blue!), but that seems to be a bug in empathy (which is used as the remote client there). The preview in empathy also shows wrong colors, so… ;)

Currently audio/video calls are working only with empathy or kcall on the remote side, using the jabber protocol. I also tried to test it with google’s web client (through windows/ie/gmail), but it doesn’t work. This is probably some bug in one of the underlying subsystems (in telepathy-gabble perhaps), but I don’t really care about it at the moment. There are still bugs, though. Sometimes I experience weird deadlocks in gstreamer threads and sometimes the video stream is not sent correctly and the other side doesn’t receive anything. Some other times it works fine, though, which makes it really difficult to debug… I’m trying to debug those today, but with this extreme heat here in Crete, it is really difficult to work (today the temperature reached 41°C!!!).

Ok, I think I’ll give up for today and go to the beach… :D

This week I wrote some exciting (for me) code. Last weekend, while playing with gstreamer, I had this crazy idea to write gstreamer bindings for Qt. So, I started writing it for fun, outside the scope of kcall. It took me about one day to write something usable and I was really excited. Then, I remembered that some days ago, bradh in irc had told me that it would be possible to use solid to autodetect audio/video devices for gstreamer. Being excited with the bindings, I thought about making one library with the 1-1 gstreamer-Qt bindings and one extra library with extra stuff, like device autodetection using solid. So, I started writing this new library as well. I developed those two libraries for about 4 days and I reached a point where they were usable for the purposes of kcall. So, I merged them in kcall and rewrote the part of kcall that handles audio/video streaming to use them. At that point, I also wrote a small telepathy-farsight Qt wrapper (libqtpfarsight), mostly to provide a sane API for it (as the original telepathy-farsight API is really bad) and not to get rid of GObject stuff, but eventually I achieved both. So, now the core kcall code uses only Qt, the GObject ugliness is hidden in the libQtGstreamer and the libqtpfarsight libraries and I have device autodetection using solid :D I think that was worth the effort, although it doesn’t offer any significant functionality to kcall.

And to add to my excitement, there was already interest in my bindings from one guy who is writing a plasmoid that uses a webcam to take photos. He couldn’t use phonon because phonon has no support for video input (yet?), so he started writing it with gstreamer and was therefore interested in my work, which he has already started to use. I’m really happy to see my work becoming useful for others :)

Today I spent my day debugging, trying to understand why kcall does not receive video correctly from the remote end. I still haven’t found the answer and I’m really disappointed because everything in the code and the gstreamer logs looks perfect. :(

Sending video is not implemented yet, but with the code as it is now, it is a matter of about 10-20 lines of code to add support for it. I will definitely do this in the following days, possibly tomorrow. I am also going to write a KCM for configuring device preferences, which is mostly done, as the library I mentioned above with the extra stuff that sits on top of QtGstreamer already has a DeviceChooser widget, which can be used for selecting devices and also has support for saving and loading the selected device using KConfig :D By next weekend this will hopefully be done, and I hope I will also have solved the strange bug with receiving video.

The only thing that makes me sad now is that this week of coding essentially sent to the trash the code I wrote two weeks ago, which took me some time to write, but at least it was educational.

Update on kcall status

So, time to let you know what progress I’ve made in kcall. Unfortunately, nothing exciting has happened in the past 2 weeks. I’ve spent about 1.5 weeks working on gstreamer device configuration. I spent lots of time reading documentation and code from empathy and phonon to understand how it all works, and also spent lots of time designing the code…

I chose a complex design, and I’ll explain what I mean. Gstreamer provides elements that can be connected with each other to create a pipeline where data streams can flow from one element to the other. Each element is designed to do a specific job. For example, one element may provide audio input, another may apply a filter to the audio that comes from the input, another may encode the audio to vorbis, another may take audio input and provide a video visualization in the output, etc… For audio input and output, gstreamer provides several elements, mostly to support all possible backends (alsa, oss, jack, pulse, quicktime on mac, directshow on windows, etc…). The complexity starts exactly here. I needed a system where a user can configure which backend he wants to use and additionally set properties for this backend. For example, one may want to configure an audio output device. For audio output, gstreamer provides “alsasink”, “osssink”, “jackaudiosink” and many more. I needed a widget that can have a list with all those elements (listed with their proper names of course, i.e. “Alsa”, “OSS”, “Jack”) and then allow the user to configure each element’s properties. For example, “alsasink” has a “device” property, where you can set the alsa device name where the output should go. If I chose to hardcode every element and property and create static dialogs in designer, the code would not be very flexible and portable. So, I chose to hardcode as little information as possible and create the dialog dynamically, based on a list of all the possible elements and their properties and by doing some gobject introspection to learn about the types of the properties, their possible values (if the element supports probing for possible values), their default values, etc… The code ended up being very complex and I haven’t committed most of it yet; it’s waiting in a local git-svn branch.
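To make the "build the dialog from property metadata" idea concrete, here is a minimal sketch in Python. It is not the kcall code (which is C++/Qt and talks to gstreamer through gobject introspection): `PropertySpec`, `ElementSpec`, `widget_for` and `build_form` are hypothetical names, and the property list is written out by hand with illustrative values instead of being queried from gstreamer.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class PropertySpec:
    """Hypothetical description of one element property, as introspection might report it."""
    name: str
    type_name: str                        # e.g. "string", "int", "bool"
    default: Any = None
    choices: Optional[List[Any]] = None   # only set if the element could be probed for values


@dataclass
class ElementSpec:
    """Hypothetical description of one backend element."""
    element: str                          # gstreamer element name, e.g. "alsasink"
    label: str                            # proper name shown to the user
    properties: List[PropertySpec] = field(default_factory=list)


def widget_for(prop: PropertySpec) -> str:
    """Choose a widget kind from the property's metadata instead of a hardcoded dialog."""
    if prop.choices:
        return "combobox"
    if prop.type_name == "bool":
        return "checkbox"
    if prop.type_name == "int":
        return "spinbox"
    return "lineedit"


def build_form(spec: ElementSpec) -> List[dict]:
    """Describe one form row per property of the selected element."""
    return [
        {"property": p.name, "widget": widget_for(p),
         "value": p.default, "choices": p.choices or []}
        for p in spec.properties
    ]


# Illustrative values only; a real implementation would read these from gstreamer.
AUDIO_SINKS = [
    ElementSpec("alsasink", "Alsa", [PropertySpec("device", "string", "default")]),
    ElementSpec("osssink", "OSS", [PropertySpec("device", "string", "/dev/dsp")]),
    ElementSpec("jackaudiosink", "Jack", []),
]

for sink in AUDIO_SINKS:
    print(sink.label, build_form(sink))
```

The point of the design is visible in `widget_for`: adding a new backend means adding data, not another hand-made dialog.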

That was the main idea. As a side effect, I also wrote some code to auto-detect which element to use or load the preferred element from a KConfig entry and also load its properties from KConfig. Of course, reading settings is designed to work in cooperation with the configuration dialog, which will save settings. The auto-detection is a copy-paste from the phonon gstreamer backend. It’s not perfect, but it has nice logic that should work for 99.9% of the users. The tricky thing about this autodetection is that it works better for gnome users, as in gnome there are the “gconfaudiosrc” and “gconfaudiosink” gstreamer elements that internally load the correct element and device based on gnome settings, and these elements also support application categories like phonon does (i.e. audio player, chat, notification, etc…). I wish we had such elements for kde as well… Actually I wish gstreamer was truly cross-desktop and cross-platform, so that it would be easier for me to use, without having to invent all this trickery and without feeling guilty about using gnome stuff. Gstreamer is a really cool framework in my opinion, so it’s a shame that it is tied so closely to gnome. :(
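The "preferred element from the config, otherwise auto-detect" logic boils down to a fallback chain. Below is a minimal Python sketch of that pattern under stated assumptions: `pick_element` and `is_usable` are hypothetical names, the candidate order is only illustrative, and the actual logic copied from the phonon gstreamer backend is not reproduced here.

```python
from typing import Callable, Iterable, Optional


def pick_element(
    candidates: Iterable[str],
    is_usable: Callable[[str], bool],
    preferred: Optional[str] = None,
) -> Optional[str]:
    """Return the preferred element if it is usable, else the first usable candidate."""
    if preferred is not None and is_usable(preferred):
        return preferred
    for name in candidates:
        if is_usable(name):
            return name
    return None  # nothing works; the caller has to report an error


# Illustrative order only (assumption): try the desktop-aware element first.
AUDIO_SINK_CANDIDATES = ["gconfaudiosink", "alsasink", "osssink"]

# Stand-in for "try to create the element and see if it works".
usable_here = {"alsasink", "osssink"}
print(pick_element(AUDIO_SINK_CANDIDATES, usable_here.__contains__))
```

In a real client `is_usable` would try to instantiate the element and bring it up, and `preferred` would come from the saved configuration entry.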

Anyway, this work left me a bit behind I think. So, I am leaving it to work on more important stuff. Today, I worked on the call window UI. What I have now is this:

Current call window UI

The participants dock is not shown by default, as it’s not very useful on two-person calls, but I added it because in the future kcall will probably support conferences between many people. From the tabs, the dialpad is also implemented and supposed to work, but it’s not enabled there because I am doing a call over jabber/jingle, which doesn’t support it (and doesn’t need it of course).

From tomorrow I plan to experiment with video support. I plan to have a small widget above those tabs for showing my local camera and a big one on the left for showing the other person, in two-person calls. For multi-person calls I will probably use separate windows for each participant, but I am not yet sure about it. Ideas and suggestions are always welcome. :)

GSoC Week #4

I skipped a week without blogging, mostly because I was busy last weekend, but now I think it’s time to report my status on kcall again…

Last week I spent about 3 days studying gstreamer and I ended up creating a media handler class using telepathy-farsight and gstreamer, which is able to handle audio calls without problems. The only bug I have there is that the microphone volume control does not work correctly, but I hope I will solve this some time (it’s not urgent anyway). The code is heavily based on andrunko’s telepathy-qt4 media branch, a branch of the telepathy-qt4 library that includes a high level API for handling all this farsight/gstreamer stuff, but as this branch is not ready yet and as I will probably need more control over gstreamer than what this media API gives me, I just copied and adapted this code to work in kcall. The only part I don’t like about this gstreamer stuff is that its dependencies are HUGE. For example, I need to depend on libxml2 and telepathy-glib just because some of the headers I include in turn include some headers from those libraries… Totally unacceptable imho. Actually, a big part of my work here was to create correct cmake scripts that can find and use all those dependencies…

Ok, so after making the media handler, I split the part that handles calls into a separate executable, implementing the telepathy Client.Handler interface. I merged the kpart I had created into this executable, as after reading the telepathy spec about the channel dispatcher, I realized that there is no need to have a kpart. A separate handler process is enough to be reusable by any other program. If another program (for example, kopete) wants to start a media call, it can just request a media channel from the channel dispatcher, and the channel dispatcher will automatically open a handler for media channels, such as this kcall handler. Apart from that, I also created a system tray icon (using the new KNotificationItem API) and an approver class, which shows a popup message (using knotify) when there is an incoming call and allows the user to accept/reject the call.

This week I had an exam on Wednesday, which prevented me a bit from working on kcall. In the time that was left, I started working on improving the call window. I added a dock widget with volume controls and a timer showing call duration, and I also fixed some internal stuff to report correct status to the user and accept incoming calls correctly.

Next thing to do now is to improve the UI of the call window, so that I can also add the video widgets on there and play with video support. I will also need to find some software and protocol that will allow me to test video calls easily. I tried connecting to ekiga.net over SIP yesterday to use its handy 500@ekiga.net echo-test service, but it seems that telepathy-sofiasip has trouble connecting to ekiga.net.

Btw, if any of you out there would like to help me design a good UI, I would love to hear some ideas and/or see mockups of how the call window UI should be, as I’m really bad at designing GUIs on my own :P The basic idea is that I need some widgets to see video in the middle, plus some list with the participants of the call, plus volume controls for mic & speakers, plus a dial pad… I’m currently thinking of putting all optional stuff (participants list, volume controls, dial pad) in dock widgets and putting two video widgets in the middle (one for the remote contact and one for myself)… but now that I think about it again, the problem here is that *theoretically* a call can have many participants, so just two video widgets may not be enough. And on the other hand, what should be displayed for audio-only calls? I think you get the picture, so I would love some ideas here :)

GSoC week #2

This week went a bit out of plan. I didn’t work much on kcall as I was busy with other things. On Tuesday I had two exams (fortunately, quite easy ones), which kept me busy for both Monday and Tuesday. Then from Wednesday I started packaging KDE 4.3 beta2 for debian, which was quite challenging and kept me busy for 3 days (Wednesday-Friday). I packaged only the basics (kdelibs, kdepimlibs, kdebase-runtime, kdebase-workspace and some kdesupport dependencies) and of course they are not of release quality yet (so don’t expect 4.3 beta2 packages in debian).

In the meantime, despite being busy with other stuff, I took some time to study the “call example” from the TelepathyQt4 examples a bit more, which is essentially a simple version of what I am developing, and I wrote some code for a “call window”, which is an object that in the future will be able to handle a call and display a nice window with status info, the video widget, audio/video controls, etc… Yesterday (Saturday), I polished the API of this object a bit and I implemented some really basic functionality. While I was looking at the code, I thought it may be better to develop this window as a kpart, which will make it possible to reuse it later in other projects, like kopete for example (when it is ported to telepathy, if this ever happens). So, late yesterday afternoon, I ported this window to use kparts. However, a linker issue (telepathy bug 21340) stopped me from finishing it. Today I managed to fix this issue and I am now working on finishing the kpart. Unfortunately I don’t have much time to work on it today, but I promise it will be ready by late night today or tomorrow morning.

Now, the next step is to implement an object that will do the encoding/decoding of the audio/video. As an exception to the general design of telepathy, audio/video handling is specified to be done by the application itself and not by the connection manager that connects to the protocol. To handle this, telepathy developers have designed a library called telepathy-farsight, which internally communicates with the connection manager and handles the audio/video streaming part. To do the actual encoding/decoding, gstreamer must be used. Gstreamer is a library that resembles phonon a lot. It uses a similar pipelined architecture. From what I understand, telepathy-farsight provides a gstreamer source and a sink, which can be connected to other gstreamer objects that will do encoding/decoding, grab input from the mic or camera, output to alsa and some video widget, etc… Unfortunately, farsight and gstreamer are the only way to go here. This is how the telepathy specification is designed, and while I bet it would be possible to write something similar to farsight that will do the same job using Qt and phonon, this is too much work to do and if this ever happens, it will take a few years. So, I will have to spend this week learning the glib/gobject and gstreamer basics, so that I will be able to write this part of kcall. The “call example” I mentioned earlier provides a sample implementation of this object, but although I could just copy it, I need to understand what it does so that I will be able to extend it.
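The pipelined architecture mentioned above can be pictured with a toy model. The following Python sketch is not gstreamer or telepathy-farsight code and uses none of their APIs: `source`, `encoder` and `run_pipeline` are made-up names that only show how buffers flow from a capture stage, through processing stages, into a sink.

```python
from typing import Callable, Iterable, Iterator, List

# A stage takes a stream of buffers and yields a new stream of buffers.
Stage = Callable[[Iterator[bytes]], Iterator[bytes]]


def source(frames: Iterable[bytes]) -> Iterator[bytes]:
    """Stand-in for a capture element (mic or camera)."""
    yield from frames


def encoder(stream: Iterator[bytes]) -> Iterator[bytes]:
    """Stand-in for an encoding element; here it only tags each buffer."""
    for buf in stream:
        yield b"enc:" + buf


def run_pipeline(frames: Iterable[bytes], stages: List[Stage],
                 sink: Callable[[bytes], None]) -> None:
    """Link the source to each stage in order and push every buffer into the sink."""
    stream = source(frames)
    for stage in stages:
        stream = stage(stream)
    for buf in stream:
        sink(buf)


received: List[bytes] = []
run_pipeline([b"frame1", b"frame2"], [encoder], received.append)
print(received)
```

In the setup described in the post, the sink at the end of the sending pipeline would be the one provided by telepathy-farsight, and the receiving side would start from the source it provides.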

I hope this week I will work more and I will manage to make a simple call :) My current plan is to have audio/video fully working (with controls and options) before July 6th (the middle of the gsoc period), so that I can spend the rest of the period doing UI/usability improvements and implementing secondary features that may be needed. (Notice: The author of this post has the authority to change this plan without previous notice! :P)

Hello planet!

Hello planet KDE!

This is my first post on planet kde, so I’ll first introduce myself. My name is George Kiagiadakis (gkiagia on irc) and I am a 20-year-old student from Greece. I was born and live in Iraklio, on the island of Crete, and I study computer science at the Computer Science Department of the University of Crete. Some of you may already know me as I have been involved in KDE for a few months now. I first got involved through the KDE Bugsquad, doing bug triaging, and then I started fixing some bugs as well. I got an svn account in October and I first fixed some issues with kwrited (a daemon that sits in the background and listens for messages sent with write(1) or wall(1) on the same computer). During the period of the KDE 4.2 pre-releases, I also joined the Debian Qt/KDE Maintainers to help them with packaging, which I still do. Later, I started helping Darío Andrés with the development of a new version of drkonqi, the KDE crash handler, which will probably appear in KDE 4.3.

Being a student, I have now entered GSoC. At first I wasn’t sure if I should apply at all, and I probably wouldn’t have done it if grundleborg had not encouraged and helped me (many thanks for that, George). To my surprise, my proposal was accepted :D So, this summer I will be working on reviving KCall and creating a usable VoIP client for KDE, using telepathy. You can have a look at the complete proposal here. I like this idea, because currently there is no voip software available for KDE and there is also no KDE software available to users that makes use of the telepathy framework, so I’ll be doing something really useful. Hopefully, at the end of the GSoC period, KCall will be ready to use to make calls to your friends! Well, I don’t expect it to be perfect, but it should work… ;)

I will keep you informed of my progress through this blog (or at least, I’ll try to do so)… Stay tuned! ;)

