All roads lead to Rome, but PulseAudio is not far behind! In fact, how the PulseAudio client library determines how to try to connect to the PulseAudio server has no less than 13 different steps. Here they are, in priority order:
1) As an application developer, you can specify a server string in your call to pa_context_connect. If you do that, that’s the server string used, nothing else.
2) If the PULSE_SERVER environment variable is set, that’s the server string used, and nothing else.
3) Next, it goes to X to check if there is an x11 property named PULSE_SERVER. If there is, that’s the server string, nothing else. (There is also a PulseAudio module called module-x11-publish that sets this property. It is loaded by the start-pulseaudio-x11 script.)
4) It also checks client.conf, if such a file is found, for the default-server key. If that’s present, that’s the server string.
So, if none of the four methods above gives any result, several items will be merged and tried in order.
First up is trying to connect to a user-level PulseAudio, which means finding the right path where the UNIX socket exists. That in turn has several steps, in priority order:
5) If the PULSE_RUNTIME_PATH environment variable is set, that’s the path.
6) Otherwise, if the XDG_RUNTIME_DIR environment variable is set, the path is the “pulse” subdirectory below the directory specified in XDG_RUNTIME_DIR.
7) If not, and the “.pulse” directory exists in the current user’s home directory, that’s the path. (This is for historical reasons – a few years ago PulseAudio switched from “.pulse” to using XDG compliant directories, but ignoring “.pulse” would throw away some settings on upgrade.)
8) Failing that, if XDG_CONFIG_HOME environment variable is set, the path is the “pulse” subdirectory to the directory specified in XDG_CONFIG_HOME.
9) Still no path? Then fall back to using the “.config/pulse” subdirectory below the current user’s home directory.
Okay, so maybe we can connect to the UNIX socket inside that user-level PulseAudio path. But if it does not work, there are still a few more things to try:
10) Using a path of a system-level PulseAudio server. This directory is /var/run/pulse on Ubuntu (and probably most other distributions), or /usr/local/var/run/pulse in case you compiled PulseAudio from source yourself.
11) By checking client.conf for the key “auto-connect-localhost”. If so, also try connecting to tcp4:127.0.0.1…
12) …and tcp6:[::1], too. Of course we cannot leave IPv6-only systems behind.
13) As the last straw of hope, the library checks client.conf for the key “auto-connect-display”. If it’s set, it checks the DISPLAY environment variable, and if it finds a hostname (i e, something before the “:”), then that host will be tried too.
To summarise, first the client library checks for a server string in step 1-4, if there is none, it makes a server string – out of one item from steps 5-9, and then up to four more items from steps 10-13.
And that’s all. If you ever want to customize how you connect to a PulseAudio server, you have a smorgasbord of options to choose from!
2.1 surround sound is (by a very unscientific measure) the third most popular surround speaker setup, after 5.1 and 7.1. Yet, ALSA and PulseAudio has since a long time back supported more unusual setups such as 4.0, 4.1 but not 2.1. It took until 2015 to get all pieces in the stack ready for 2.1 as well.
So what made adding 2.1 surround more difficult than other setups? Well, first and foremost, because ALSA used to have a fixed mapping of channels. The first six channels were decided to be:
1. Front Left
2. Front Right
3. Rear Left
4. Rear Right
5. Front Center
6. LFE / Subwoofer
Thus, a four channel stream would default to the first four, which would then be a 4.0 stream, and a three channel stream would default to the first three. The only way to send a 2.1 channel stream would then be to send a six channel stream with three channels being silence.
This was not good enough, because some cards, including laptops with internal subwoofers, would only support streaming four channels maximum.
(To add further confusion, it seemed some cards wanted the subwoofer signal on the third channel of four, and others wanted the same signal on the fourth channel of four instead.)
ALSA channel map API
The first part of the solution was a new alsa-lib API for channel mapping, allowing drivers to advertise what channel maps they support, and alsa-lib to expose this information to programs (see snd_pcm_query_chmaps, snd_pcm_get_chmap and snd_pcm_set_chmap).
The second step was for the alsa-lib route plugin to make use of this information. With that, alsa-lib could itself determine whether the hardware was 5.1 or 2.1, and change the number of channels automatically.
PulseAudio bass / treble filter
With the alsa-lib additions, just adding another channel map was easy.
However, there was another problem to deal with. When listening to stereo material, we would like the low frequencies, and only those, to be played back from the subwoofer. These frequencies should also be removed from the other channels. In some cases, the hardware would have a built-in filter to do this for us, so then it was just a matter of setting enable-lfe-remixing in daemon.conf. In other cases, this needed to be done in software.
Therefore, we’ve integrated a crossover filter into PulseAudio. You can configure it by setting lfe-crossover-freq in daemon.conf.
If you have a laptop with an internal subwoofer, chances are that it – with all these changes to the stack – still does not work. Because the HDA standard (which is what your laptop very likely uses for analog audio), does not have much of a channel mapping standard either! So vendors might decide to do things differently, which means that every single hardware model might need a patch in the kernel.
If you don’t have an internal subwoofer, but a separate external one, you might be able to use hdajackretask to reconfigure your headphone jack to an “Internal Speaker (LFE)” instead. But the downside of that, is that you then can’t use the jack as a headphone jack…
Do I have it?
In Ubuntu, it’s been working since the 15.04 release (vivid). If you’re not running Ubuntu, you need alsa-lib 1.0.28, PulseAudio 7, and a kernel from, say, mid 2014 or later.
Takashi Iwai wrote the channel mapping API, and also provided help and fixes for the alsa-lib route plugin work.
The crossover filter code was imported from CRAS (but after refactoring and cleanup, there was not much left of that code).
Hui Wang helped me write and test the PulseAudio implementation.
PulseAudio upstream developers, especially Alexander Patrakov, did a thorough review of the PulseAudio patch set.
This is a technical post about PulseAudio internals and the upcoming protocol improvements in the upcoming PulseAudio 6.0 release.
PulseAudio memory copies and buffering
PulseAudio is said to have a “zero-copy” architecture. So let’s look at what copies and buffers are involved in a typical playback scenario.
When PulseAudio server and client runs as the same user, PulseAudio enables shared memory (SHM) for audio data. (In other cases, SHM is disabled for security reasons.) Applications can use pa_stream_begin_write to get a pointer directly into the SHM buffer. When using pa_stream_write or through the ALSA plugin, there will be one memory copy into the SHM.
Server resampling and remapping
On the server side, the server might need to convert the stream into a format that fits the hardware (and potential other streams that might be running simultaneously). This step is skipped if deemed unnecessary.
First, the samples are converted to either signed 16 bit or float 32 bit (mainly depending on resampler requirements).
In case resampling is necessary, we make use of external resampler libraries for this, the default being speex.
Second, if remapping is necessary, e g if the input is mono and the output is stereo, that is performed as well. Finally, the samples are converted to a format that the hardware supports.
So, in worst case, there might be up to four different buffers involved here (first: after converting to “work format”, second: after resampling, third: after remapping, fourth: after converting to hardware supported format), and in best case, this step is entirely skipped.
Mixing and hardware output
PulseAudio’s built in mixer multiplies each channel of each stream with a volume factor and writes the result to the hardware. In case the hardware supports mmap (memory mapping), we write the mix result directly into the DMA buffers.
The best we can do is one copy in total, from the SHM buffer directly into the DMA hardware buffer. I hope this clears up any confusion about what PulseAudio’s advertised “zero copy” capabilities means in practice.
However, memory copies is not the only thing you want to avoid to get good performance, which brings us to the next point:
Protocol improvements in 6.0
PulseAudio does pretty well CPU wise for high latency loads (e g music playback), but a bit worse for low latency loads (e g VOIP, gaming). Or to put it another way, PulseAudio has a low per sample cost, but there is still some optimisation that can be done per packet.
For every playback packet, there are three messages sent: from server to client saying “I need more data”, from client to server saying “here’s some data, I put it in SHM, at this address”, and then a third from server to client saying “thanks, I have no more use for this SHM data, please reclaim the memory”. The third message is not sent until the audio has actually been played back.
For every message, it means syscalls to write, read, and poll a unix socket. This overhead turned out to be significant enough to try to improve.
So instead of putting just the audio data into SHM, as of 6.0 we also put the messages into two SHM ringbuffers, one in each direction. For signalling we use eventfds. (There is also an optimisation layer on top of the eventfd that tries to avoid writing to the eventfd in case no one is currently waiting.) This is not so much for saving memory copies but to save syscalls.
From my own unscientific benchmarks (i e, running “top”), this saves us ~10% – 25% of CPU power in low latency use cases, half of that being on the client side.
Headsets come in many sorts and shapes. And laptops come with different sorts of headset jacks – there is the classic variant of one 3.5 mm headphone jack and one 3.5 mm mic jack, and the newer (common on smartphones) 3.5 mm headset jack which can do both. USB and Bluetooth headsets are also quite common, but that’s outside the scope for this article, which is about different types of 3.5 mm (1/8 inch) jacks and how we support them in Ubuntu 14.04.
You’d think this would be simple to support, and for the classic (and still common) version of having one headphone jack and one mic jack that’s mostly true, but newer hardware come in several variants.
If we talk about the typical TRRS headset – for the headset itself there are two competing standards, CTIA and OMTP. CTIA is the more common variant, at least in the US and Europe, but it means that we have laptop jacks supporting only one of the variants, or both by autodetecting which sort has been plugged in.
Speaking of autodetection, hardware differs there as well. Some computers can autodetect whether a headphone or a headset has been plugged in, whereas others can not. Some computers also have a “mic in” mode, so they would have only one jack, but you can manually retask it to be a microphone input.
Finally, a few netbooks have one 3.5 mm TRS jack where you can plug in either a headphone or a mic but not a headset.
So, how would you know which sort of headset jack(s) you have on your device? Well, I found the most reliable source is to actually look at the small icon present next to the jack. Does it look like a headphone (without mic), headset (with mic) or a microphone? If there are two icons separated by a slash “/”, it means “either or”.
For the jacks where the hardware cannot autodetect what has been plugged in, the user needs to do this manually. In Ubuntu 14.04, we now have a dialog:
In previous versions of Ubuntu, you would have to go to the sound settings dialog and make sure the correct input and output were selected. So still solvable, just a few more clicks. (The dialog might also be present in some Ubuntu preinstalls running Ubuntu 12.04.)
So in userspace, we should be all set. Now let’s talk about kernels and individual devices.
Quite common with Dell machines manufactured in the last year or so, is the version where the hardware can’t distinguish between headphones and headsets. These machines need to be quirked in the kernel, which means that for every new model, somebody has to insert a row in a table inside the kernel. Without that quirk, the jack will work, but with headphones only.
So if your Dell machine is one of these and not currently supporting headset microphones in Ubuntu 14.04, here’s what you can do:
- Check which codec you have: We currently can enable this for ALC255, ALC283, ALC292 and ALC668. “grep -r Realtek /proc/asound/card*” would be the quickest way to figure this out.
- Try it for yourself: edit /etc/modprobe.d/alsa-base.conf and add the line “options snd-hda-intel model=dell-headset-multi”. (A few instead need “options snd-hda-intel model=dell-headset-dock”, but it’s not that common.) Reboot your computer and test.
- Regardless of whether you manage to resolve this or not, feel free to file a bug using the “ubuntu-bug audio” command. Please remove the workaround from the previous step (and reboot) before filing the bug. This might help others with the same hardware, as well as helping us upstreaming your fix to future kernels in case the workaround was successful. Please keep separate machines in separate bugs as it helps us track when a specific hardware is fixed.
Notes for people not running Ubuntu
- Kernel support for most newer devices appeared in 3.10. Additional quirks have been added to even newer kernels, but most of them are with CC to stable, so will hopefully appear in 3.10 as well.
- PulseAudio support is present in 4.0 and newer.
- The “what did you plug in”-dialog is a part of unity-settings-daemon. The code is free software and available here.
Up until now, we’ve been using Android’s AudioFlinger for playing back and recording audio. Starting with tomorrow’s image, that is no longer true. Instead we’re talking directly from PulseAudio to ALSA, or the Android audio HAL when necessary.
In short, here’s how PulseAudio now works:
- For normal playback and recording, PulseAudio talks directly to alsa-lib, just as on the desktop.
- For detecting whether a headphone/headset is plugged in or not, PulseAudio now has code for reading that from the Android kernel, through the “switch” interface.
- For normal mixer setup, we use ALSA UCM mixer files.
- For setting up voice calls, we talk to the Android Audio HAL through a PulseAudio module.
This provides somewhat of a compromise between features and porting effort: By using the ALSA library whenever we can, we can access PulseAudio’s timer scheduling and dynamic latency features. Having the straightest path possible for playing back music should help efficiency (and in extension, battery life). At least in theory – we haven’t actually done measurements.
Using the Audio HAL for everything mixer related would have been optimal, but it turns out that the audio HAL is too smart: it refuses to set up the mixer, unless PCM data is also sent to it, which is what we wanted to avoid. So then we had to set up the mixer manually too. However, we still could not avoid using the Audio HAL altogether: when starting and stopping voice calls, the Audio HAL talks to the modem and other components in the kernel to route the voice call between the modem and the sound card. Hence we ended up with this compromise approach.
At the time of this writing, this is working best on Nexus 4. The Galaxy Nexus works for the most part, except for bug 1217072. I intend to add Nexus 7 support shortly. If anyone wants to help testing Nexus 10, let me know.
For porters: if you need to do the same
Unfortunately, this means some additional work for porters, because you need to write UCM mixer files. What’s worse, UCM is lacking good documentation. For that reason, I hesitated somewhat before deciding to actually use UCM at all, but it’s the closest we have to a standard for setting up mixers on embedded devices right now.
But to give you a two-minute crash course in UCM and how it’s used in Ubuntu Touch – start by having a look in /usr/share/alsa/ucm/apq8064-tabla-snd-card/ directory. You’ll need to create a similar directory for your device. You’ll find the right directory name if you look in /proc/asound/cards.
Second, look at apq8064-tabla-snd-card.conf. Rename and copy into your own UCM directory. If you’re making a tablet image (that can’t make voice calls), you can remove the VoiceCall part (and the corresponding file).
Third, look at the HiFi file. This is where all fun happens. Notice the device names, which are hardcoded into telepathy-ofono and need to match: “Speaker”, “Earpiece” and “Headphone” for playback, plus “Handset” and “Headset” for recording.
Fourth, if you need voice calls, also look at the VoiceCall file. Btw, the verb names “HiFi” and “VoiceCall” also need to match.) This is largely empty, because the mixer setup is handled by the Audio HAL, but there is a twist here that took a while to get right: For PulseAudio’s UCM to work, it needs to open a PCM device. However, at the time where UCM tests this, the voice call is not yet set up. So, you might need to set up the mixer just a little, so that the PCM can open. (On desktops, PCM can always open, regardless of mixer state. This is not always true on embedded devices, that are using ASoC.) It’s a bonus if you can find a PCM that actually plays back audio, because then you can get notification sounds while on the phone.
And this concludes the two minute crash course – happy porting!
(Side note: Sorry if the permalink – or comment pages etc – to this blog leads you to a blank page. I’ve reported the bug to the relevant team in Canonical, but at the time of this posting, they are looking into it but have not yet fixed it.)
If you just want to play the game, here’s where you find it. Or, in a terminal window write:
sudo add-apt-repository ppa:diwic/theblobgame
sudo apt-get update
sudo apt-get install theblobgame
Then just search for “blob” in the Dash.
The rest of this blog post is mostly directed towards game developers.
Background and motivation
Ten years ago I finished a game called “The Blob Game”. It was a 2D platform style game, more cute than violent. My cousin had made the graphics, a level editor, and part of the game engine. I made seven levels, music, and completed the code. Back then, I was still working for a company doing closed source software for Windows, so naturally this game was a Windows game.
A while ago I decided to try to port this game to Ubuntu. My main motivations were:
- To see how easy (or hard) it was, given my current level of experience. Also because we currently have some undergoing efforts to make Ubuntu a better gaming platform, and what better way to do that, than to become a game developer yourself?
- People complain that there are not enough games available in Ubuntu – I wanted to make my small contribution to help even that out.
- Nostalgia purposes – after all, most of the work with the game was to make the levels and the artwork, rather than actual code. All of this can be reused, and it would be nice if this game could entertain a new audience.
Overall, the experience has been good. Sure, there has been some work to do and new things to learn and conquer, but there has been very little of frustration. A fun side project!
One of the strength and weaknesses of the Linux ecosystem is all the choices you can, and have to, make. What components do you choose to build your software on? That can be bewildering, especially if you’re new to Linux and unfamiliar what components are suboptimal for one reason or another. I would therefore like to talk you through the choices I made and why I made them. As usual, these are my own opinions, not my employer’s.
To give some background, my cousin and I started programming twenty years ago. Back then, my father introduced me to Turbo Pascal, which was more powerful than the QBasic that came with DOS 5.0. Ten years later, I was using Delphi (the Windows continuation of Pascal) at my work. So the game was written in Delphi, mixed with some hand written assembly code (!).
Was it possible to reuse the code written? I looked at the available compilers:
- GNU Pascal. This project seems mostly abandoned, so not a stable choice for the future.
- FreePascal. This was the main alternative, but it has its own code generator (not integrated with GCC or LLVM), so chances are it won’t keep up in the future. Also, if you need to link to a library, chances are you have to translate headers yourself.
So; rewrite the code. In what language? My choice fell on C, for the following reasons:
- It is a very popular language, maybe even the most popular one. It is likely to be around for a long time, and have support for new processors and architectures as they come to market.
- It is compatible with everything. In fact, it’s what every other language tries to be compatible with (Java has JNI, Python has C extensions, etc).
- It is what I use at work, so I know the language very well. (Both Linux and PulseAudio are written in C.)
- I like the low memory footprint and predictability of C – you want a game to run fluently, the audio to run at a reasonably low latency, and so on. With the exception of calling functions you don’t know about, you can almost see how quick your code executes. I’m not sure how well Java, Python, and the other garbage collecting languages do in this area; so my fear might be unfounded, but at least I know C does well.
Gaming library: libSDL
Fortunately, the cross-platform toolkit libSDL is very well supported under Linux. I have almost only positive experiences of this library: it seems very stable. It handles graphics (window setup, fullscreen etc), input events (keyboard, mouse, gamepads just work), audio, and more. The documentation is extensive and there are plenty of examples out there. And because libSDL is already used by so many games already, most distributions make sure libSDL works on their software and hardware.
What about OpenGL? Well, this is a 2D game, so I don’t need 3D. It is possible I could use some hardware acceleration for the scaling (the frame is rendered in 320×200, then upscaled to the screen’s resolution), but my very simple scaler seems to perform well enough. As such, there was no real need to bring in a dependency on OpenGL.
Music library: FluidSynth
First I’m a more than a bit biased on this dependency, as I’m one of the FluidSynth developers. However, the reason I first got involved with FluidSynth was that I wanted to use it in a game, and needed to fix something in the library…
FluidSynth is a softsynth – it takes MIDI data and a soundfont, and gives you rendered audio as a result. This has the drawback that you need to download a soundfont too, and the only one available in the Ubuntu archive is > 100 MB. The good thing is that FluidSynth is very embeddable into many different kinds of applications, so taking the audio output, mixing it with sound effects, and then send it to the sound card (with libSDL) was easy. It is also easy to manipulate the MIDI in real-time – in this game I’ve used it to pitch down the audio when you die.
GUI toolkit library: glib and GTK3
Let’s first admit it; in the Linux world we don’t have anything as stable as the Win32 API for creating windows. The two main contesters, GTK and QT, are both rewritten every five years or so. So if I, ten years from now, need to run this game again, chances are that I have to rewrite this part. The GUI is just used in the beginning though (to setup the game), so it shouldn’t be too much work.
I chose GTK over QT here because
- I had previous experience with GTK
- QT adds complexities to your build system, as you need not only a C++ compiler, but also a special preprocessor to turn some special QT constructs into valid C++ code.
Build system: simple Makefile
In my case, my application takes a few seconds to compile. For really simple applications like this one, I find build systems such as autotools or CMake to be more trouble than what they solve. In both cases, you’ll have to learn an additional language just to specify your build dependencies. (Autotools also has a few nuisances, such as requiring some extra files to be present, and in all upper-case, such as AUTHORS or NEWS.)
For larger projects, they make more sense though as they might help you create libraries for different platforms, give nice error messages when build dependencies can not be found, etc.
Licensing: LGPL-2 + CC-BY-SA
This is always a tricky and controversial subject, and I’d like to reiterate that this a layman’s thoughts on the topic and nothing else.
- GPL is the strongest license when it comes to free software. But that also makes it a very incompatible license, it is essentially impossible to link GPL code with everything that’s not extremely weak (e g BSD) or explicitly made to fit with GPL (e g LGPL). One example of an incompatible license is MPL 1.1 – this was a quite popular license in the Delphi community, and the incompatibility with GPL was a real pain.
- It is not obvious where the boundaries between GPL and non-GPL code can be, causing confusion in court from time to time. FSF might offer some advice on how to interpret the license; but they’re not a neutral party. In this case, I find the LGPL, the second strongest license, more clear.
- So LGPL-2, LGPL-3 or LGPL2+? Well, to me, any “at your option, any later version” clause is out of the question; for me, that’s essentially the same as giving your code away to the FSF as they can relicense your code any way they wish. And LGPL-2 is shorter than LGPL-3 because LGPL-3 includes the full GPL-3 license code, too.
- LGPL-2 is suitable for code, but not for data. So music, graphics and levels are licensed under CC-BY-SA. I was considering adding “non-commercial”; but the border between commercial and non-commercial can be fluid, and might also be a problem with the DFSG, so I skipped that part.
Update: Seems I didn’t read closely enough – LGPL-2 has a GPL-2+ clause, now making it possible for the FSF to relicense my code by making a new GPL version.
I also downloaded new sound effects (which is the only part not completely made by myself or my cousin), to make sure there was no licensing problems with those.
Getting it into the Ubuntu Software Center
Unfortunately, this does not currently work for free applications. While adapting my packaging to fit the requirements was relatively straight-forward, and also walking through the dialogs for app submission, when all this work was done, I was met by the following message:
Thank you for submitting a gratis Free Software application through MyApps. At this time we are unable to process this request, as we are working on the implementation of a new app upload process.
A packaging tip
While we’re on the topic of packaging, let me just share a quick tip. If you’re new to Debian packaging (the system used in Ubuntu), and just want to package your own app, one thing to think about is that the packaging system was designed for having coders and packagers belong to separate organisations. So that means, that if you during packaging find a bug in your code, you’re meant to make a temporary patch, send that the coder, who will then make another release of the non-packaged software, which you will then package. In practice, this is a bit heavy-weight if you’re just one person and don’t keep packaging and code apart, so I used the following shortcut:
tar -cjf theblobgame_0.20130202.orig.tar.bz2 --exclude=debian theblobgame-0.20130202
If you fixed something in the code and want to work on the packaging, the command above will create a new “code release” from the packaging system’s point of view.
…and finally, it’s free software!
This means you’re allowed not only download and run this game, but also look at the code, copy-paste parts into your own game (under the terms of the LGPL-2!) or just use for inspiration. You can fix a bug or a feature and send me patches (or publish the modified code yourself), etc. Enjoy!
Update: Someone asked me how to get the source, so here’s a quick howto:
If you have executed the lines in the top of this blog post, just change into a suitable directory and execute “apt-get source theblobgame”. Before you build, you can use “sudo apt-get build-dep theblobgame” to automatically install all build dependencies. Building can be done with “dpkg-buildpackage -b” (from the source code directory). Then install the resulting .deb package (“sudo dpkg -i theblobgame_version.deb”) to test.
If you’re not running Ubuntu/Debian, you can get the source from going to the PPA page, click “View package details”, click one of the arrows on the left side in the table, and download the file ending with “orig.tar.bz2”.
Disclaimer: This does not mean that the source is written by the book, with lots of helpful comments, etc. The game engine is mostly a quick translation of the code as it looked like ten years ago, with new glue code added for interfacing with the libraries I now depend on.
Takashi Iwai, the Linux sound maintainer, is about to merge a patch set of about 150 patches into linux-next. These changes, in short, unify the different HDA codec drivers.
Introduction and background
First, the basics: HDA Intel is the current standard protocol for accessing your built-in soundcard, as well as HDMI/DisplayPort audio, and is used in almost all desktop and laptop computers since about 2005. In the HDA Intel world, there are controllers and codecs. The codecs also have a configuration, which tells which pins of the codecs that are connected to what input or output (this is set by BIOS/UEFI).
Hardware is very diverse: The HDA Intel driver supports at least 50 different controllers, and about 300 different codecs. On top of that, every codec usually support many different configurations.
The codecs come from 10-12 different vendors, and within the same vendor, it is more likely that the codec layout is somewhat like other codecs from the same vendor. As a result, the HDA codec driver is split up in 10-12 codec driver files, one per vendor. These files are to some degree copies of each other, but also contains every vendor’s specials.
Takashi’s patches are solving a long term maintenance problem: as we want to add new features to the kernel drivers, we would previously have to do this once per codec – whereas with the unification of codec drivers, we would just have to add this code once. Or possibly twice, as the HDMI codecs will still have its own codec driver. In addition, new codec hardware is more likely to “just work”, or at least partially work, without explicit support in the kernel. The potential downside, of course, is that as you improve the driver to solve some edge case, you’re more likely to screw some other edge case up.
There’s not much of new features added in this new, generic driver at this point. However, if you have an “unusual” codec chip vendor, you might see minor improvements as these are brought up to feature parity with the more common codecs.
When does it change?
As usual, you can’t know until it’s in there. Takashi’s latest plan is to make the move for 3.9 for at least some of the codecs, and 3.10 for the rest of them.
Update: Some blog, referencing this one, said this feature would come to Ubuntu 13.04. Ubuntu 13.04 uses kernel 3.8, so this feature won’t be in Ubuntu until 13.10.
Update 2: As of 2013-01-23, the new code has been merged, for all codecs, for linux-next (kernel 3.9)!
Regressions and testing
Judging from the database you might have contributed to by yourself by submitting your alsa-info, there are about 6000 different machines out there, and there is not enough manpower to test them all. So we need a different approach. Conveniently, Takashi has written an emulator called hda-emu to test the codec driver code, and I’ve improved that emulator with some scripting, so that hda-emu effectively becomes an automated test suite. The test suite is still very incomplete, but it at least runs a few different tests such as faked playback, S3, and manipulating of volume controls, and checks if hda-emu crashes or reports an error. And sure enough, when running this test suite over all the alsa-infos in the database, a few regressions were discovered that I was able to fix.
So far, so good. If it weren’t for the fact that hardware often does not work exactly as advertised. The parser algorithm for reading the codec layout and creating a working kernel driver out of it, must now take all codecs from all vendors into account. The old vendor-specific parser might have done things in one way and the new parser might do things a different way, causing the audio to be routed differently.
As an example, assume the codec is broken in such a way that it advertises two audio paths, but in practice only one of the paths actually works. The new parser might then route the audio differently from the old one – and as a result it will look like audio should really work, in theory. In practice, there is nothing but silence. Another example could be that maybe the new driver will power down different parts of the codec in different order than the old driver did, causing your speakers to click.
How can I help?
Right now, what’s needed is more testing on real hardware. Takashi has called for testing of his hda-migrate branch of the sound-unstable tree.
If you’re running Ubuntu, I have made packages for 12.04, 12.10 and 13.04 – just download, install it (you might need to install dkms and kernel headers too). Then reboot and test. Simply uninstall the package and reboot when you’ve finished testing.
Update: I will probably close the comments soon due to the amount of spam coming in that way. Please report back – especially if you did see a regression – to the alsa-devel mailing list. Thanks!
Update 2: As the code is now merged into linux-next, please use these instructions to try it out on Ubuntu. Thanks!
As part of the Ubuntu Community Appreciation Day initiative, I’d like to write an appreciation for females – of all ages – that we have within Ubuntu and upstream communities, and why I’d like to see more of the same. To do this, I’ve taken the liberty of generalising a bit based on my own personal experiences.
First off, many women have excellent communication and conflict resolution skills. I envision this could come to very good use, including upstream. You see, we software developers can be really picky – which is a good thing, as long as this helps us prevent bugs. But we also tend to set up rules for ourselves and our processes, and we need a counterweight to that in order not to become rule-following robots, which is no fun. A controversial patch can easily lead to heated, discouraging debates and somebody running off, making a fork of the project, together with half of the squad. Seen from an Ubuntu perspective, better communication and conflict resolution skills might help us to maintain fewer remixes and derivatives – but the remaining ones would be more polished and work better.
Second, a mixed company working place is good for everyone. Before working at Canonical, I had been working at both offices with only men, and with both men and women. My experience was that at the male-only office, discussions tended to be more matcho – coffee break chats were often about sports or women, if I remember correctly. And even if background images of women in bikini and jokes towards the vulgar didn’t offend me, I didn’t particularly enjoy it either.
At the mixed company working place, discussions in general had a friendlier tone, and included a wider area of topics. It was just…better.
(Side note: while discussing this with a female colleague a long time ago, she told me she had been working at a women-only place, which was plagued by gossiping to the extent that she was afraid to become ill – because on the day she would not be at work, they would gossip about her. Judging from that, mixed company is likely good for everyone, not just men.)
Third, women know what women want. Or, at least, are slightly more likely to know. Software is more likely to get new features, bug fixes, packaging, support, advertising blog posts and so on, if there are people with sufficient skill and interest in that particular software. When more women get involved in software development, the end result will be more useful for women. If Ubuntu’s ever going to reach 200 million users: if it works great for twice as many people, that would certainly help!
So, I would like to say thank you to all women involved in open source communities, both Ubuntu and upstream. That includes a thank you for not quitting when times get rough.
And finally, if I may extend my appreciation to an invitation: you don’t have to be as fantastic as the open source women I’ve met, to be contributing to Ubuntu, Debian, or upstream. If you already have skills, that helps, but for the most part, you’ll learn as you go. Commit to respecting each other first, and then you can start helping out with everything from writing code to organizing events. Welcome!
Disclaimer: As usual, these are my own views rather than those of my employer, my family, or anyone else. Also, just to make the point clear, this is not scientific research and does not claim that women are in general different from men – we all are so much different from, and so much more than, what an average person of the same gender would be. It is just my “thank you” post, based on my own personal experiences.
[Thanks to Leann Ogasawara for providing some useful feedback when writing this blog post.]
The first PulseAudio conference is approaching quickly. This is a shoutout for people who might be interested, but missed the mailinglist announcement.
The conference will be Friday 2nd November 2012, and colocated with Ubuntu Developer Summit and Linaro Connect, in Bella Center, Copenhagen.
There is no attendance fee, but you’ll need a UDS or LC registration (which are also free) to be able to get into the conference area. (Note: LC might be the safer bet here as UDS ends on Thursday)
There had been a few topics brought up on the mailinglist, but we welcome anyone who would like to contribute to constructive discussions and help to shape the future of PulseAudio!
If you would like to attend, how about you write an email to the pulseaudio-discuss mailinglist with the topic(s) you want to bring up, and maybe a small presentation with your background and interests? Of course, if you mostly want to listen in rather than bring your own topics, that’s okay too!
The audio stack in Linux/Ubuntu evolves over time. What used to be good advice is not necessarily good advice anymore. (That also means, that if you happen to read this blog post in 2019 or something, don’t trust it!)
Here are some things that people try, and sometimes they even fix the problem, but are often bad in one way or the other. Or at least, they have side effects one needs to be aware of. So – while there are valid exceptions, as a rule of thumb, don’t do the following:
5. Don’t add your user to the “audio” group
A user has access to the audio card if that person is either logged in – both VT and GUI login counts, but not SSH logins, or if that user is in the “audio” group. However, on the level of access we’re talking about here, only one user has access at a time. So the typical problem scenario goes like:
- User Homer has an audio issue, and tries to fix it by adding himself to the audio group. This doesn’t help to resolve the problem.
- Homer discovers his audio is muted, and unmutes it. Happy to have his audio issue resolved, he forgets he’s still in the audio group, or doesn’t realise it leads to problems.
- User Marge comes and wants to borrow the computer. Homer does a fast-user-switching so Marge can log in.
- Because Homer is in the audio group, he has still access to the audio device. If some software, e g PulseAudio, has the audio device opened, it blocks access to other software trying to use it.
- Now Marge has an audio issue!
I’ve written a longer article about the audio group here. In short, there are some usages for it, including that it is also the standard group name for assigning realtime priorities when used together with JACK. But don’t leave a user in the audio group unless you have a good reason.
4. Don’t try different “model” strings
A common way to try to get HDA Intel soundcards to work is to edit /etc/modprobe.d/alsa-base.conf and add the following line:
options snd-hda-intel model=[something]
…where [something] are values you find in some file. Contrary to official documentation, this is in most cases obsolete. In particular, avoid model=generic – that is almost guaranteed to give you trouble. In many cases, when trying different models, you will find that you might fix one thing but break another.
In fact, there is only one model to try, and that is model=auto. If your machine happen to be one of those quirked to use an older model parser, changing to model=auto can improve the situation.
It still happens that BIOS/UEFI assigns the wrong values to pin nodes, which causes an output or input not to work correctly. If so, I recommend trying to tweak this with hda-jack-retask.
In some cases, trying different modules can actually be okay – sometimes, these models point to lightweight fixups instead of the earlier, more heavyweight code that was used in previous kernels. (In this context, I have to mention that Takashi Iwai has done a fantastic job of converting the older models to the newer auto-parser.)
3. Don’t upgrade ALSA drivers by following random blog posts
I’ve seen far too many people reporting bugs on Launchpad where they’ve been following some random blog post that tells you how to upgrade ALSA, and are having audio issues as a result. These guides are of varying quality and often come without good uninstall instructions, so you have no way to revert in case the upgrade did not solve your problem, or broke something else.
First, something not everybody is aware of: 95% of ALSA code is in the kernel, and follows the kernel’s release cycle. That means that even if “/proc/asound/version” says something that was released a year or two ago, don’t panic. It’s the kernel release that tells you how new your sound drivers are, so if you have a new kernel, and you see an ALSA release coming out, you are unlikely to gain from an upgrade.
In some case you do have an old kernel, and newer sound drivers can be worth a try. The Ubuntu Audio Developer’s team provides daily snapshot drivers for HDA Intel cards. Guide is available here and it also comes with proper uninstall instructions.
In the past we have also provided drivers for other cards, but due to the maintenance required to keep this up-to-date, in combination with that the vast majority of people’s bugs concern HDA Intel anyway, this support has been discontinued.
2. Don’t purge PulseAudio
First, PulseAudio itself isn’t perfect, some of the bindings to PulseAudio aren’t perfect, and some of the drivers are not perfect in the way PulseAudio wants to use it either. So there might be valid reasons to temporarily move it out of your way, even if it would be better to actually fix the problem and submit a bug fix patch (if you’re capable of doing so).
But don’t try uninstalling the PulseAudio package, as it has far too many dependencies.
If you just need direct access to your sound card, you can run the “pasuspender” command. You can either run “pasuspender” (in a terminal) to make PulseAudio stay away for the duration of the application. Or if you think that’s simpler, just run “pasuspender bash” (in a terminal), start your application through the menu/dash/whatever you prefer, and when you’re done, write “exit” in the terminal.
If you need to stop the PulseAudio process completely, execute these commands:
echo autospawn=no > ~/.pulse/client.conf
If you need PulseAudio back again, remove ~/.pulse/client.conf, then try to start an application that uses PulseAudio, and it should start automatically.
Unexpected side effects:
- The Gnome sound settings, the sound indicator and the volume up/down keys relies on PulseAudio, so they won’t work when PulseAudio is off.
- PulseAudio mixes audio, so that means that only one application at a time can output audio if PulseAudio is disabled (and you aren’t using some other sound server).
- Several applications have PulseAudio backends. Some of them will need reconfiguration to use ALSA directly, some will just automatically redirect themselves, and some won’t work at all.
- Bluetooth audio might not work without PulseAudio.
1. Don’t replace ALSA with OSS
OSS was the standard used before ALSA came along. These days, ALSA is much better, both when it comes to hardware support, and when it comes to how much software that supports outputting sound to either sound system. OSS is also entirely unsupported, at least by Ubuntu. In addition, I’m not sure exactly how to get back to a working system after you’ve tried OSS…!
If you know your problem is in ALSA, either drivers or userspace, try to track down and/or fix the bug, and talk to us about it. If you’re running Ubuntu, file a bug against the alsa-driver package. You can also contact the alsa-devel mailinglist. While we won’t guarantee responses due to the high volume of bugs/traffic, we are often able to help out.
Note 1. HDA Intel cards are the built-in audio inputs and outputs on your motherboard (at least if you bought your computer after ~2006 or so). HDMI and DisplayPort audio are also HDA Intel cards, but they are covered in more detail here.
Note 2. I have had some problems with spammers posting spam comments to my blog post. I don’t want to spend too much time just reading spam and marking it as such, so I might close for comments in a relatively short period. Sorry for the inconvenience.