Author’s note: This is the unedited version of the article “The Linux Graphics Stack from X to Wayland,” which appeared on the Ars Technica website on March 21, 2011. See the references at the bottom of this article for the link.
In 1984 the presidential election in the United States of America resulted in the incumbent, Ronald Reagan, winning by the second-biggest landslide in history. President Reagan captured 49 of 50 states, and nearly took the 50th, but challenger Walter Mondale won his home state of Minnesota by a mere 3,761 votes. President Reagan’s victory was no surprise to Republicans or Democrats, because it was his fiscal policy that had delivered the United States from the deep recession of 1981–82. President Reagan won in 1984 because he had already achieved what the American people wanted. Reagan won on the merit of what he had accomplished.
Meanwhile, around the same time, in a small lab at MIT, computer scientist Bob Scheifler was laying down the principles for a new windowing system. He decided to call it X (he was improving the W graphical system, which naturally resided on the V operating system). Little did Bob know at the time that the X Window System he and his fellow researchers would eventually create would cause a landslide of its own. It became the standard graphical interface of virtually all UNIX-based operating systems, because it was software that provided features and concepts far superior to its competition. In a few short years the adoption of X was clearly evident. Like the American people in the 1984 election, the UNIX community voted with their actions.
What made X so special is, of course, legendary. It was the first graphical interface to embrace a networked, distributed solution. An X server running on one of the time-sharing machines was capable of generating the display for windows belonging to any number of local clients. X defined a network display protocol, so those windows could be displayed on remote machines. In fact X was always intended to be used in this networked fashion, and the protocol was completely hardware independent. X servers running on one type of UNIX could send X client displays over the wire to a completely different UNIX hardware platform.
X also abstracted the look and feel away from the server itself. The X protocol defined pointing devices and window primitives, but left the appearance of the interface to a special X client program called a window manager. Today Linux users are familiar with window managers such as GNOME’s Metacity and KDE’s KWin, along with a few newcomers such as the lightweight Xfwm that ships with Xfce.
As X development proceeded, led by Bob Scheifler and under the stewardship of MIT, more vendors became interested. Industry leaders of the day, like DEC, obtained a free license to the source code to make further improvements. Then a curious thing happened. A group of vendors asked MIT if some sort of arrangement could be made to preserve the integrity of the source. They wanted to keep X universally useful to all interested parties. MIT agreed, the MIT X Consortium was soon formed, and the full source tree was released, including the DEC improvements. What happened was really an extraordinary event: the vendor community realized that X had become a valuable commodity, and that it was in the best interests of all to protect it from themselves. Perhaps the opening of the source code is the single most important event to come out of the X story. Stewardship of the X source has since passed from the MIT X Consortium, through its successors, to today’s X.Org Foundation.
One of the senior developers recruited by the Consortium was Keith Packard, who was commissioned to re-implement the core X server in 1988. As we’ll see, Keith will figure prominently in the development of the Linux graphics stack.
X really ruled UNIX, and later the Linux graphics stack, from its official coming-out party in 1987 well into 2004. As fate would have it, the feature-laden and ubiquitous software eventually became a victim of its own success. As Linux took flight throughout the ’90s, X came to be used more as a standalone X server and client all on one desktop computer, bundled with pretty much every Linux distribution. The network transparency of X is of course no use on a single desktop installation, and the once-vaunted feature was adding overhead to the video drawing.
As PC sales ballooned during this period, the sophistication of dedicated graphics hardware began to creep ahead of the capabilities of X. The development of new and improved graphics card hardware was, and continues to be, very aggressive. Before we tackle the larger question of how X copes with getting access to video adaptors in the Linux space, have a look at the sidebar to delve deeper into the hardware.
Sidebar: Video Card Hardware
At the core of dedicated graphics cards sit the GPUs, formally known as Graphics Processing Units. These behemoths reside right on the cards, constantly accelerating graphics by slurping up floating-point operations that used to be destined for the CPU, along with on-the-fly video decoding and encoding, 2D acceleration, framebuffering, and a host of 3D-specific operations. Your basic state-of-the-art video card has at least one GPU, perhaps more, running in the neighborhood of 700-800 MHz. You’ll find various manufacturers of dedicated video cards, such as Asus, Sapphire, and eVGA, but virtually all (and I do mean all) of the GPU technology comes out of two sources: AMD and nVidia.
Intel is in the space as well, but its coverage is, naturally, the Integrated Graphics Processors (IGPs) found in netbooks, notebooks, laptops, and Intel-based desktop computers. IGPs historically cannot perform as well as the GPUs on dedicated cards, because they compete with the CPU for system RAM. Newer incarnations of IGPs have some memory caching, but they still lag far behind in terms of performance. The reason is quite simply that GPUs enjoy exclusive access to their memory space at speeds of 10-100 Gbits per second, while IGPs must be content with the slower system memory bus running at 2-12 Gbits per second. AMD and nVidia manufacture IGPs of various flavours, and these find their home in places like MacBooks, iMacs, and Dell systems.
Once upon a time, when Bob Scheifler was coding X, video hardware consisted of a framebuffer that held data for every pixel to be drawn on the screen. Today’s cards need to do memory-intensive texture mapping and polygon rendering. Engineers have evolved the GPU to handle digital video data in terms of matrix operations as well as discrete values, and so the hardware specializes in this area. In fact the efficiency of GPUs is starting to attract scientists in other disciplines, such as oil exploration, image processing, and statistics, where arrays of GPUs are harnessed in parallel (an approach known as GPGPU, of which nVidia’s CUDA is one implementation).
All this processing requires loads of dedicated memory. The memory on video cards sits behind 128-bit (sometimes up to 256-bit) wide interfaces and runs to over 1GB of on-board GDDR5. Video card manufacturers would have you believe they have developed specialized types of memory for their cards, but the speed boost is mainly due to the fact that the GPU has exclusive access to that on-board memory.
Over the years, hardware manufacturers of video cards have come and gone, and the playing field has thinned considerably. Today there stand but three very large players and a group of smaller ones. The big three are AMD (which acquired ATI Technologies), nVidia, and Intel. There are open source Linux drivers for all three: radeon, nouveau, and intel, respectively.
Around 2004 the level of frustration among some Linux developers was evident. They had at their disposal OpenGL, a rendering API originally developed in 1992 to produce 2D and 3D graphics (derived from work by the now-defunct Silicon Graphics). But after years of attempting to get X to talk 3D to the graphics device, not a single OpenGL call could be made through the X layer.
Then, in 2007, a bright light: Thomas Hellstrom, Eric Anholt, and Dave Airlie had developed a memory management module they called TTM (Translation Table Maps). TTM was based on moving the memory buffers destined for graphics devices back and forth between graphics device memory and system memory. It was notable for the wild applause it drew from the Linux community; it provided hope that somebody, anybody, was working on the problem of providing an API to properly manage graphical applications’ 3D needs. The strategy was to make the memory buffer a first-class object, and to allow applications to allocate and manipulate memory buffers of graphical content. TTM would manage the buffers for all applications on the host and provide synchronization between the GPU and the CPU. This would be accomplished with the use of a fence: a signal that the GPU was finished operating on a buffer, thus relinquishing it back to the owning application.
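To make the fence idea concrete, here is a purely conceptual sketch of the pattern TTM was aiming at: allocate a buffer object, hand it to the GPU, and wait on a fence before touching the buffer again. Every name in it is hypothetical and stubbed out for illustration; it is not the real TTM interface, which lives inside the kernel.

```c
/* Conceptual sketch only: the buffer-object-plus-fence pattern behind TTM.
 * All names here are hypothetical stubs, not the real (in-kernel) TTM API. */
#include <stdio.h>
#include <stdlib.h>

struct gfx_buffer { size_t size; };      /* a first-class buffer object      */
struct gfx_fence  { int signalled; };    /* set once the GPU is done with it */

static struct gfx_buffer *gfx_buffer_alloc(size_t size)
{
    struct gfx_buffer *buf = malloc(sizeof(*buf));
    buf->size = size;
    return buf;
}

/* Pretend to hand the buffer to the GPU and receive a fence in return. */
static struct gfx_fence *gfx_submit(struct gfx_buffer *buf)
{
    (void)buf;
    struct gfx_fence *fence = malloc(sizeof(*fence));
    fence->signalled = 1;                /* stub: our "GPU" finishes at once */
    return fence;
}

/* Block until the fence signals; only then is the buffer back under the
 * owning application's control. */
static void gfx_fence_wait(struct gfx_fence *fence)
{
    while (!fence->signalled)
        ;                                /* a real fence would sleep, not spin */
}

int main(void)
{
    struct gfx_buffer *buf = gfx_buffer_alloc(64 * 1024);
    struct gfx_fence *fence = gfx_submit(buf);

    gfx_fence_wait(fence);
    printf("%zu-byte buffer is back under application control\n", buf->size);

    free(fence);
    free(buf);
    return 0;
}
```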
To be fair, TTM was an ambitious attempt to standardize how applications access the GPU; it was to be an overall memory manager for all video drivers in the Linux space. In short, TTM was trying to provide all operations for everything that used it. The unfortunate side effect was a very large amount of code and a large API, whereas each individual open source driver needs only a small subset of the calls. A large API means confusion for developers who have to make choices. The loudest complaint was that TTM had performance issues, perhaps related to the fencing mechanism and inefficient copying of buffer objects. TTM could be many things to many applications, but it couldn’t afford to be slow.
Re-enter Keith Packard. In 2008 he announced that work was proceeding on an alternative to TTM. By now Keith was working for Intel, and together with Eric Anholt he used the lessons learned from developing TTM and rewrote it. The new API was to be called GEM (Graphics Execution Manager). Most developers reading this piece can probably guess what happened next, because experienced developers know that there is only one thing better than getting a chance to solve a big problem by writing a significant chunk of code: doing it twice.
GEM had many improvements over TTM. A significant one was that the API was much tighter; another was that the troublesome fence concept was removed. Keith and Eric put the onus on applications to lock memory buffers outside of the API. That freed GEM to concentrate on managing the memory under control of the GPU and on controlling the video device’s execution context. The goal was to shift the focus to managing ioctl() calls within the kernel instead of managing memory by moving buffers around. The net effect was that GEM became more of a streaming API than a memory manager.
GEM allowed applications to share memory buffers, so that the entire contents of the GPU memory space did not have to be reloaded. This is from the original release notes:
Gem provides simple mechanisms to manage graphics data and control execution flow within the linux (sic) operating system. Using many existing kernel subsystems, it does this with a modest amount of code.2
The introduction of GEM in May of 2008 was a promising step forward for the Linux graphics stack. GEM did not try to be all things to all applications; for example, it left the generation of GPU commands to the device driver. Because Keith and Eric were working at Intel, it was only natural for them to write GEM specific to the open source intel driver. The hope was that GEM could be improved to the point where it could support other drivers as well, thus effectively covering the three biggest manufacturers of GPUs.
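For a flavour of what the ioctl()-driven approach looks like from user space, here is a minimal sketch that asks the intel (i915) driver for a GEM buffer object and then “flinks” it, publishing a global name another process could open in order to share the buffer. It assumes the DRM kernel headers are installed, an Intel GPU behind /dev/dri/card0, and sufficient DRM permissions; it is an illustration, not code from GEM itself.

```c
/* Minimal, illustrative sketch: create a GEM buffer object through the
 * i915 driver's ioctl and publish a shareable global name for it. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
#include <drm/i915_drm.h>

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR);   /* the DRM device node */
    if (fd < 0) { perror("open /dev/dri/card0"); return 1; }

    /* Ask the i915 driver for a 4 KiB GEM buffer object. */
    struct drm_i915_gem_create create;
    memset(&create, 0, sizeof(create));
    create.size = 4096;
    if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create) < 0) {
        perror("GEM create");
        return 1;
    }

    /* "Flink" the buffer: publish a global name that another process
     * could open, which is how GEM lets clients share buffer contents. */
    struct drm_gem_flink flink;
    memset(&flink, 0, sizeof(flink));
    flink.handle = create.handle;
    if (ioctl(fd, DRM_IOCTL_GEM_FLINK, &flink) < 0) {
        perror("GEM flink");
        return 1;
    }

    printf("GEM handle %u, global name %u\n", create.handle, flink.name);
    return 0;
}
```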
However, adoption of GEM by non-intel device drivers was slow. There is some evidence to suggest that the AMD driver adopted a “GEMified TTM manager”³, signifying a reluctance to move the code directly into the GEM space. GEM was in danger of becoming a one-horse race.
Both TTM and GEM try to solve the 3D acceleration problem in the Linux graphics stack by integrating directly with X to get to the device and perform GPU operations. Both attempt to put in order the cabal of 3D libraries like OpenGL (which depends on X), Qt (depends on X), and GTK+ (also X). The problem is that X stands between all of these libraries and the kernel, and the kernel is the way to the device driver, and ultimately the GPU.
X is the oldest lady at the dance, and she insists on dancing with everyone. It has millions of lines of source, but most of it was written long ago, when there were no GPUs, no specialized transistors to do programmable shading or the rotation and translation of vertices. The hardware had no notion of oversampling and interpolation to reduce aliasing, nor was it capable of producing extremely precise colour spaces. The time has come for the old lady to take a chair.
In 2008 a software engineer by the name of Kristian Høgsberg was driving in the Boston suburbs, perhaps on his way to work, or on his way home. Software engineers live in a world of deep introspection. They spend most of the day solving complex problems, breaking them apart and reconstructing them. Often, when they’re distracted, their subconscious can make the most bizarre leaps, connecting half-baked ideas and partially solved algorithms. This can happen in the oddest of places: in the shower, while cooking in the kitchen, or while driving a car. As Mr. Høgsberg drove through the tiny village of Wayland, Massachusetts, an idea that had been germinating in his mind crystallized. His idea is perhaps best described in his own words:
The core idea is that all windows are redirected, we can do all rendering client side and pass a buffer handle to the server and the compositing manager runs in the display server. One of the goals is to get an X server running on Wayland, first in a full screen window (like Xnest), then rootless, since X just isn’t going away anytime soon.4
His idea was to write a brand new display server and have it send 3D output directly to the kernel, bypassing X. One of the clients of the new display server would be X itself. As you can see from the quote above, the new display server would be christened “Wayland” in homage to the exact geography where the idea occurred to him.
An idea is an idea, nothing more. Lots of people have brilliant ideas every day, but often they disappear in the cacophony of the life we muddle through. Mr. Høgsberg had been working on the rendering libraries, probably wondering what a wonderful world it could be if only applications could talk directly to the GPU without involving the old lady X. Well, he set himself apart from the 95 percent of us who leave such ideas alone, and resolved to write the code. According to Keith Packard of Intel, he had a rudimentary working server in two weeks’ time5.
A key feature of Wayland is the use of a rendering API that does not have any dependencies on X. For that you need look no further than the mobile device revolution. Small devices like mobile phones have been refining sophisticated windowing interfaces for years, and phones and tablets need a small footprint and a 3D-capable solution. These mobile devices, as it turns out, use a rendering library called OpenGL ES (the ES is for Embedded Systems). The library itself is now mature (currently OpenGL ES 2.0). The devices that use OpenGL ES are a veritable who’s who of the mobile market (a short sketch of bringing up OpenGL ES without X follows the device list):
- iPhone (3 GS or later)
- iPad
- Android 2.0 and 2.2
- Maemo-based Nokia N900s
- Samsung Galaxy S and Wave
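The piece of plumbing that lets OpenGL ES run without X is EGL, which takes the place of X’s GLX as the glue between the rendering library and whatever window system (or none at all) sits underneath. Below is a minimal sketch, assuming a Mesa or vendor EGL implementation is installed, that simply brings up EGL and selects the OpenGL ES API without a single X call.

```c
/* Minimal EGL bring-up with no X calls anywhere: initialize a display,
 * report the supported client APIs, and select OpenGL ES.
 * (Illustrative sketch; assumes an EGL implementation such as Mesa.) */
#include <stdio.h>
#include <EGL/egl.h>

int main(void)
{
    EGLDisplay dpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);
    EGLint major, minor;

    if (dpy == EGL_NO_DISPLAY || !eglInitialize(dpy, &major, &minor)) {
        fprintf(stderr, "could not initialize EGL\n");
        return 1;
    }

    printf("EGL %d.%d, client APIs: %s\n",
           major, minor, eglQueryString(dpy, EGL_CLIENT_APIS));

    /* From here a real client would choose a config, create an OpenGL ES
     * context, and bind it to a native surface (a Wayland surface, a KMS
     * framebuffer, an Android window, and so on). */
    eglBindAPI(EGL_OPENGL_ES_API);

    eglTerminate(dpy);
    return 0;
}
```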
To maintain compatibility, the X server itself is made into a Wayland client, and all of the X rendering is fed directly into Wayland. The Wayland package, like X before it, defines a protocol only. The architecture of Wayland, with its ability to function alongside X, provides an easy migration path for existing and even future X clients. The X server can run as before, servicing all of the legacy clients.
The Wayland display server, if we can call it that, leverages the GEM execution manager, evdev (the input drivers), and KMS (kernel mode setting), all of which are already in the Linux kernel. Wayland has its own compositing manager, in direct contrast to the X architecture, where X relies on an outside compositor to handle memory buffer changes.
Wayland also leverages DRI2. The Wayland compositor and Wayland clients both have a handle to a shared memory buffer of screen real estate. The client renders through DRI2, making use of 3D APIs such as OpenGL ES. After the buffer is updated by the client, the Wayland compositor updates its version of the desktop and redraws the screen.
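From a client’s perspective, the protocol side of that exchange is small. The sketch below is written against today’s libwayland-client API (which has evolved since Høgsberg’s 2008 code) and only connects to the compositor and creates a surface; a real client would go on to render a buffer through OpenGL ES/DRI2 or shared memory, attach it to the surface, and commit it so the compositor can redraw.

```c
/* Illustrative Wayland client skeleton: connect to the display, bind the
 * compositor global, and create a surface to render into.
 * (Assumes libwayland-client; buffer rendering and committing omitted.) */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <wayland-client.h>

static struct wl_compositor *compositor;

static void registry_global(void *data, struct wl_registry *registry,
                            uint32_t name, const char *interface,
                            uint32_t version)
{
    /* Bind the compositor global advertised by the server. */
    if (strcmp(interface, "wl_compositor") == 0)
        compositor = wl_registry_bind(registry, name,
                                      &wl_compositor_interface, 1);
}

static void registry_global_remove(void *data, struct wl_registry *registry,
                                   uint32_t name) { }

static const struct wl_registry_listener registry_listener = {
    registry_global, registry_global_remove
};

int main(void)
{
    struct wl_display *display = wl_display_connect(NULL);
    if (!display) { fprintf(stderr, "cannot connect to Wayland\n"); return 1; }

    struct wl_registry *registry = wl_display_get_registry(display);
    wl_registry_add_listener(registry, &registry_listener, NULL);
    wl_display_roundtrip(display);       /* wait for the globals to arrive */

    if (!compositor) { fprintf(stderr, "no wl_compositor found\n"); return 1; }

    struct wl_surface *surface = wl_compositor_create_surface(compositor);
    /* ... render client side, attach a buffer, wl_surface_commit() ... */

    wl_surface_destroy(surface);
    wl_display_disconnect(display);
    return 0;
}
```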
Certainly the Wayland approach solves many of the problems that have traditionally been difficult for X. It is easy to get excited about the project: there is talent behind the coding, and it has the backing of Intel and Red Hat.
But there are a few hurdles to clear before Linux can claim to have a modern 3D graphics stack, and two of the biggest might be developing compatible open source drivers for the graphics cards from nVidia and AMD. The third big vendor, Intel, is well positioned, because the GEM kernel module was written with the intel driver in mind, and Wayland is already compatible.
But how, or rather by whom, will the open source drivers for AMD and nVidia hardware be updated? Developing open source drivers in Linux, especially for graphics adaptors, has always been the developers’ scourge. Usually working with incomplete hardware specifications, or none at all, the exercise invariably boils down to reverse engineering the device.
The nouveau driver is a good example of this. First, there are statements floating around6, made by nVidia developers, suggesting there are “no plans to support Wayland”. But work presses on in the Linux community, sometimes with the support of vendors, sometimes without. In the nouveau project the driver is being actively developed by reverse engineering: a program called Renouveau (Reverse Engineering for nouveau) performs the following sequence of events:
- record the contents of the device MMIO registers
- draw some graphics
- record the new values in the registers
It then sends the differences between the two memory dumps as a text file back to a Renouveau FTP server, where the files are made available for further analysis.
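The heart of that workflow is a simple before-and-after comparison. The toy sketch below illustrates only the diff step over two previously captured register dumps; the file names, dump format, and register count are invented for illustration and are not Renouveau’s actual format.

```c
/* Toy illustration of the "diff two register dumps" step: report every
 * MMIO register whose value changed after drawing. The dump files and
 * their layout (NUM_REGS raw 32-bit words) are hypothetical. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_REGS 4096   /* hypothetical number of 32-bit registers */

static uint32_t *load_dump(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) { perror(path); exit(1); }

    uint32_t *regs = malloc(NUM_REGS * sizeof(uint32_t));
    if (fread(regs, sizeof(uint32_t), NUM_REGS, f) != NUM_REGS) {
        fprintf(stderr, "short read on %s\n", path);
        exit(1);
    }
    fclose(f);
    return regs;
}

int main(void)
{
    uint32_t *before = load_dump("regs-before.bin");
    uint32_t *after  = load_dump("regs-after.bin");

    /* Print each register offset whose contents changed after drawing. */
    for (int i = 0; i < NUM_REGS; i++)
        if (before[i] != after[i])
            printf("reg 0x%04x: 0x%08x -> 0x%08x\n",
                   i * 4, before[i], after[i]);

    free(before);
    free(after);
    return 0;
}
```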
AMD fares much better than nVidia in this department. Over the last few years a team was assembled to write open source drivers for its hardware, and AMD also releases specifications periodically so open source development can continue in the wild. The open source driver is radeon (AMD’s separate proprietary driver, fglrx, stands for FireGL and Radeon for X), and the Linux community can get periodic (monthly) updates from AMD.
Wayland is a promising new development for the state of the Linux graphics stack. Recently Ubuntu has stepped up, announcing that it plans to use Wayland in conjunction with its own new desktop shell, Unity. Intel has shown tremendous support by aiding the development of GEM and by hiring the Wayland developer to ensure its hardware drivers are up to snuff with the new graphics architecture.
The rest of the graphics world is polarized. AMD and nVidia are engaged in fierce competition to capture market share, and the needs of the open source community do not appear to be near the top of their to-do lists.
On January 25, President Barack Obama delivered the 2011 State of the Union Address. Senator Mark Udall of Colorado had proposed that both houses sit together, regardless of party, which they did in a rare display of solidarity. Perhaps the display was only symbolic; perhaps the next day it was business as usual in Washington. But President Obama’s address carried an undertone, one that embraced the spirit of cooperation.
The Linux community is based on that very same spirit. Through collaboration and openness it has grown tremendously over the years, and the Linux graphics stack appears poised to make a great leap forward. Graphics hardware vendors are free to choose how much they want to contribute, and it always amazes me how reluctant they are to do just that. Wouldn’t any hardware vendor, especially one whose product is perhaps the most capable hardware in the world, want its users to have the best experience they possibly could? By holding back information, aren’t they really hurting their own product line? One thing is for sure: you wouldn’t call it cooperation.
References
1. Edited version of this article as it appeared on the Ars Technica website, March 21, 2011
2. Email written by Keith Packard announcing GEM, May 13, 2008. http://lwn.net/Articles/283798
3. Wikipedia explanation of the Graphics Execution Manager. http://en.wikipedia.org/wiki/Graphics_Execution_Manager
4. Kristian Høgsberg blog entry, November 2008
5. Keith Packard’s talk at linux.conf.au, January 24, 2011. http://linuxconfau.blip.tv/file/4693305/
6. nVidia developer comment