This issue has been a personal bugbear for some time. What started out as a seemingly simple patch-monkey task turned into a tedious, labour-intensive slog, full of hidden pitfalls and unexpected, non-standard behaviour exhibited by reasonably well-known OEMs (Original Equipment Manufacturers). When I initially found, reported and assigned myself to this bug, I completed a quick Google search and discovered a set of patches [1] which appeared to address the same issue I was facing. I quickly applied the patches and rebuilt the affected kernel for testing. The behaviour was the same with the patches applied.

LP Bug #592295
  omapdss DISPC error: SYNC_LOST_DIGIT

Normally, when a SYNC_LOST_DIGIT IRQ fires, only one occurs. The driver code usually cleans up the odd sporadic one or two, as they are almost expected from time to time. This bug, however, cripples the system, rendering it quite useless: so many IRQs are fired that the console is completely overwhelmed. The bug only exhibits this behaviour when an HDMI device is plugged in to the port but power is not applied to it. A SYNC_LOST_DIGIT IRQ can be received for a plethora of reasons, some of which I’ll mention:

1. If a device has more than one graphical display port, and the HDMI and primary (usually LCD) ports are enabled at the same time, then the HDMI sub-system would display SYNC_LOST_DIGIT. This is fairly irrelevant in this case as the DVI-D port on the Panda Board (the development device where this issue is prevalent) is neither enabled, nor connected.

2. A SYNC_LOST_DIGIT can sometimes appear if a downscaled video has recently been viewed on the primary device and the resulting overlay is switched to the HDMI monitor. This occurs because HDMI does not currently support downscaling. The issue is known to be rectified by writing the correct out_width and out_height via sysfs. Again, this is neither the reason for my error, nor does the fix correct the symptoms.

3. The IVAHD (Imaging Video Audio – High Definition) is Texas Instruments’ (TI) own multimedia co-processor DSP (Digital Signal Processor) used to encode/decode High Definition video formats. A SYNC_LOST_DIGIT is normally raised if the first IVAHD frame takes longer than 1ms to decode. If the first frame takes more than 1000ms to appear, a dsi_framdone is more likely. This is not believed to be the issue here. Worryingly, there is some hacky code in the current kernel which ‘solves’ this issue by commenting out ~30 lines of code. I’ll take a look at resolving this sometime in the near future.

As I’m not a subject matter expert on HDMI, I decided to contact TI’s graphics department. A friendly, helpful girl from India called Mythri replied. I sent her the contents of my start-up logs and she immediately spotted some anomalies. For instance:

omapfb omapfb: failed to allocate framebuffer
omapfb omapfb: failed to allocate fbmem
omapfb omapfb: failed to setup omapfb
omapfb: probe of omapfb failed with error -12

Apparently, the omap driver code requires a chunk of memory on the order of 32MB to be reserved at boot-up. Each framebuffer, of which there can be a maximum of 3, also needs to be given a slice of that chunk [2]. The line below, although unrelated to this bug, eradicated these errors when placed on the kernel command-line.

vram=32M omapfb.vram=0:8M

In its current state HDMI is enabled directly, with no interaction with external hardware. This is considered a bug and may be the cause of the spurious SYNC_LOST_DIGIT IRQs. TI provided a patch to change ‘hdmi_panel_enable’ to ‘hdmi_enable_hpd’ (HPD meaning hot-plug detect). HDMI will now only be enabled when a monitor/TV is detected through the physical connect interrupt, at which point the driver will attempt to read the EDID (Extended Display Identification Data) [3] from the monitor/TV and enable it. If no device is detected, HDMI will be started in min_mode, complete with minimal clocks.

-	.enable		= hdmi_panel_enable,
+	.enable		= hdmi_enable_hpd,
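
The gating idea behind that one-line change can be sketched in plain C. This is not the TI patch: the names hdmi_ctx, hdmi_sketch_enable_hpd and hdmi_sketch_hotplug are hypothetical, reading the EDID is reduced to setting a flag, and dropping back to min_mode when the cable is unplugged is an assumption rather than something the patch is known to do.

```c
#include <stdbool.h>

/* Hypothetical states for illustration; not taken from the driver. */
enum hdmi_state { HDMI_OFF, HDMI_MIN_MODE, HDMI_ENABLED };

struct hdmi_ctx {
    enum hdmi_state state;
    bool edid_read;   /* stands in for "EDID has been fetched" */
};

/* Called in place of a direct panel enable: start with minimal clocks
 * and wait for a hot-plug event instead of driving the output blindly. */
void hdmi_sketch_enable_hpd(struct hdmi_ctx *ctx)
{
    ctx->state = HDMI_MIN_MODE;
    ctx->edid_read = false;
}

/* Called from the (hypothetical) hot-plug interrupt handler. */
void hdmi_sketch_hotplug(struct hdmi_ctx *ctx, bool connected)
{
    if (connected) {
        /* A real driver would read the EDID over DDC here. */
        ctx->edid_read = true;
        ctx->state = HDMI_ENABLED;
    } else {
        /* Assumption: fall back to min_mode when the sink goes away. */
        ctx->edid_read = false;
        ctx->state = HDMI_MIN_MODE;
    }
}
```

The point of the design is that the full output path is never driven while no sink is attached, which is the situation that appeared to trigger the IRQ storm.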

This patch cleared the spurious SYNC_LOST_DIGIT errors, but still nothing was displayed on the connected monitor. I assumed this was a related bug and, with Mythri’s help, pursued a working fix. LP Bug #592295 was no longer relevant, so I opened another.

LP Bug #605832
  LG monitor behaving incorrectly when used in conjunction with the Panda board and HDMI

Mythri suggested adding some more arguments to the kernel command-line. The first argument below dictates which timing code the driver should use. The second specifies whether it should use DVI or HDMI code (zero and one respectively). You’ll notice that hdmicode has been set to 4. This tells the driver to use a very low resolution of 640×480. This should display at least something on any monitor. A table of the remaining timing codes has been provided for reference [4].

omapdss.hdmicode=4 omapdss.hdmimode=1
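
As a rough illustration of what a timing code selects, here is a minimal sketch that looks up a code in a small table and derives the refresh rate from the timings. The rows are copied from table [4]; the struct layout, the field order (x_res, y_res, pixel clock in kHz, then hsw, hfp, hbp, vsw, vfp, vbp) and the function names are assumptions made for the example, not the driver's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed field order, inferred from the hsw/hfp/hbp and vsw/vfp/vbp
 * names used in this post; not copied from the driver headers. */
struct video_timing {
    uint16_t x_res, y_res;      /* active pixels and lines */
    uint32_t pixel_clock_khz;   /* pixel clock in kHz */
    uint16_t hsw, hfp, hbp;     /* horizontal sync width, front/back porch */
    uint16_t vsw, vfp, vbp;     /* vertical sync width, front/back porch */
};

struct timing_entry {
    int code;
    struct video_timing t;
};

/* A few rows from table [4]; the full table is longer. */
static const struct timing_entry timing_table[] = {
    { 4,  { 640,  480, 25175,  96, 16,  48, 2, 11, 31 } },
    { 9,  { 800,  600, 40000, 128, 40,  88, 4,  1, 23 } },
    { 16, { 1024, 768, 65000, 136, 24, 160, 6,  3, 29 } },
};

/* Return the timings for a code, or NULL if the code is not in the table. */
const struct video_timing *timing_for_code(int code)
{
    size_t i;

    for (i = 0; i < sizeof(timing_table) / sizeof(timing_table[0]); i++) {
        if (timing_table[i].code == code)
            return &timing_table[i].t;
    }
    return NULL;
}

/* Refresh rate in millihertz: pixel clock / (total pixels per frame). */
uint32_t refresh_millihertz(const struct video_timing *t)
{
    uint64_t htotal = (uint64_t)t->x_res + t->hsw + t->hfp + t->hbp;
    uint64_t vtotal = (uint64_t)t->y_res + t->vsw + t->vfp + t->vbp;

    /* kHz * 1000 gives Hz, and a further * 1000 gives millihertz. */
    return (uint32_t)((uint64_t)t->pixel_clock_khz * 1000000ULL /
                      (htotal * vtotal));
}
```

Under these assumptions, refresh_millihertz(timing_for_code(4)) comes out just over 60Hz, which is why 640×480 ought to be comfortably within range for a 1080p monitor.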

These new kernel command-line arguments provided a very shaky arcane image, along with the warning message “Out of range” displayed by the monitor. As the monitor is rated at 1080p (AKA: 1920×1080 or Full HD), 640×480 should not be a problem and certainly should not be out of range. The issue is believed to be caused by underlying timing issues. Once again I sent Mythri the log to try and debug exactly what was happening.

In the meantime I decided to have a poke around within sysfs to find any related nodes which may prove useful. When the following command was issued the monitor ‘just worked’. It wasn’t a great resolution, but the “Out of range” error disappeared and the font was crystal in clarity.

echo 4 | sudo tee -a /sys/devices/platform/omapdss/display0/custom_edid_timing

Once Mythri was briefed on the findings, she knew almost instantly what was going on. The display was not locking on because the monitor’s vsync and hsync timing values do not match any of the known HDMI values; in other words, the LG W2261VP’s EDID contains non-standard settings. A vsync and hsync check within the driver is necessary to distinguish between 50Hz and 60Hz devices. The fix: instead of reading all EDID blocks and checking each of the horizontal (hsw, hfp, hbp) and vertical (vsw, vfp, vbp) values individually, a checksum will be devised which encompasses all of them. This is the current (borked) implementation:

                        -------------------------
                   Yes |  User enters HDMI code  | No
              ---------|  and mode in boot args  |---------
             |         |                         |         |
             |          -------------------------          |
             |                                             |
 -------------------------                     -------------------------
|  Set code and mode to   |                   |                         |
|  value entered by user  |                   |   Set 1080p as default  |
|                         |                   |                         |
 -------------------------                     -------------------------
             |                                             |
              ---------------------------------------------
                                    |
                        -------------------------
                       |                         |
   EDID data not found |       Read EDID         | EDID data found
              ---------|                         |-----------
             |          -------------------------            |
             |                                               |
 -------------------------                       -------------------------
|                         |                     | Match user entered code |
|  Set 1080p as default   |     Match not found | with block timing data  | Match found
|                         |            ---------|                         |---------
 -------------------------            |          -------------------------          |
                                      |                                             |
                          -------------------------                     -------------------------
                         |                         |                   | Monitor/TV supports user|
                         |   Set 720p as default   |                   |   entered timing value  |
                         |                         |                   |                         |
                          -------------------------                     -------------------------

The reason the monitor refused to display on boot-up yet was happy to display correctly via sysfs is that the sysfs path ignores all EDID values and forcefully applies whichever values are passed in. On boot-up, by contrast, the driver was using very weird, arcane fall-back values because the EDID was unrecognised, resulting in an “EDID TIMING DATA supported NOT FOUND” boot-up message. This was the solution:

                        -------------------------
        Data not found |                         | Data found
              ---------|        Read EDID        |---------
             |         |                         |         |
             |          -------------------------          |
             |                                             |
 -------------------------                     -------------------------
|                         |                   |                         |
|   Set default 640x480   |                   |  Read block timing data | Read next timing block
|                         |                   |                         |-------------------
 -------------------------                     -------------------------                    |
                                                           |                                |
                                               -------------------------                    |
                                              |     Match the timing    |                   |
                                  Match found |        with OMAP4       | Match not found   |
                                     ---------|                         |-----------        |
                                    |          -------------------------            |       |
                                    |                                               |       |
                                    |                                   -------------------------
                                    |                                  |      All 4/8 timing     |
                                    |                                  |        blocks read      |
                                    |                                  |                         |
                                    |                                   -------------------------
                                    |                                               |
                        -------------------------                       -------------------------
                       |  Timing data supported  |                     |  Either timing data not |
                       |   by monitor & OMAP4    |                     |  found or not supported |
                       |      setting values     |                     |   Set to VGA DVI mode   |
                        -------------------------                       -------------------------
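
The revised flow above can be sketched as a small selection function. This is an illustration only: the names mode_timing, pick_result and pick_timing are hypothetical, the EDID timing blocks are assumed to be already decoded into structs, and a plain field-by-field comparison stands in for whatever checksum the real fix uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, mirroring the values listed in table [4]. */
struct mode_timing {
    uint16_t x_res, y_res;
    uint32_t pixel_clock_khz;
    uint16_t hsw, hfp, hbp;
    uint16_t vsw, vfp, vbp;
};

enum pick_result {
    PICK_DEFAULT_VGA,   /* no EDID data: default 640x480 */
    PICK_MATCHED,       /* a block is supported by monitor and OMAP4 */
    PICK_VGA_DVI        /* all blocks read, nothing matched */
};

/* Field-by-field comparison; a stand-in for the checksum in the text. */
static bool same_timing(const struct mode_timing *a,
                        const struct mode_timing *b)
{
    return a->x_res == b->x_res && a->y_res == b->y_res &&
           a->pixel_clock_khz == b->pixel_clock_khz &&
           a->hsw == b->hsw && a->hfp == b->hfp && a->hbp == b->hbp &&
           a->vsw == b->vsw && a->vfp == b->vfp && a->vbp == b->vbp;
}

/* Walk the EDID timing blocks in order and take the first one that the
 * supported table also contains; otherwise fall back to 640x480. */
enum pick_result pick_timing(const struct mode_timing *edid_blocks,
                             size_t n_blocks,
                             const struct mode_timing *supported,
                             size_t n_supported,
                             struct mode_timing *out)
{
    static const struct mode_timing vga =
        { 640, 480, 25175, 96, 16, 48, 2, 11, 31 };
    size_t i, j;

    if (edid_blocks == NULL || n_blocks == 0) {
        *out = vga;
        return PICK_DEFAULT_VGA;
    }
    for (i = 0; i < n_blocks; i++) {
        for (j = 0; j < n_supported; j++) {
            if (same_timing(&edid_blocks[i], &supported[j])) {
                *out = edid_blocks[i];
                return PICK_MATCHED;
            }
        }
    }
    *out = vga;
    return PICK_VGA_DVI;
}
```

Unlike the earlier flow, nothing here depends on a user-supplied code, and an unrecognised EDID ends in a conservative 640×480 mode rather than a 1080p or 720p default the monitor may reject.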

Now, if the monitor is on during start-up, a suitable image is displayed. However, if the monitor is in Power Saving Mode (PSM), it is taken out of sleep during boot-up, then nods off again once the boot-up sequence has finished. After a few seconds it would wake up, then go to sleep again. If left, this behaviour would persist indefinitely. A few settings were edited, but the behaviour did not differ. In order to send Mythri a full log which she could use to debug this issue, HDMI was placed into debug mode using the kernel command-line arguments below.

debug omapdss.debug=1

Oddly, when in debug mode the problem disappeared. This was confirmed to be reproducible by turning debug mode on and off over subsequent boots: whenever debug mode was activated the problem would not occur, and whenever it was deactivated the problem returned. We decided this must be a timing issue, with the extra debug output presumably adding just enough delay to mask it. After some investigation and experimentation with her own monitors, Mythri was able to isolate the problem and insert a delay in the correct place.

diff --git a/drivers/video/omap2/dss/hdmi.c b/drivers/video/omap2/dss/hdmi.c
index a8bedb1..0830bbc 100644
--- a/drivers/video/omap2/dss/hdmi.c
+++ b/drivers/video/omap2/dss/hdmi.c
@@ -459,6 +459,8 @@ static int hdmi_panel_probe(struct omap_dss_device *dssdev)
        DSSDBG("hdmi_panel_probe x_res= %d y_res = %d", dssdev->panel.timings.x_res,
                 dssdev->panel.timings.y_res);

+       mdelay(50);
+
        return 0;
 }

Problem solved.

[1] [PATCH 4/4] DSS2: clear spurious SYNC_LOST_DIGIT interrupts

http://www.mail-archive.com/linux-omap@vger.kernel.org/msg24900.html

[2] Bootargs for enabling display

http://omappedia.org/wiki/Bootargs_for_enabling_display

[3] Wiki - Extended Display Identification Data (EDID)

http://en.wikipedia.org/wiki/Extended_display_identification_data

[4] VESA timing codes
      Code       Timing values {x_res, y_res, pixel clock (kHz), hsw, hfp, hbp, vsw, vfp, vbp}
      4(0x4)     {640,  480,  25175,  96,  16,  48,  2, 11, 31}
      9(0x9)     {800,  600,  40000,  128, 40,  88,  4, 1,  23}
      14(0xE)    {848,  480,  33750,  112, 16,  112, 8, 6,  23}
      23(0x17)   {1280, 768,  71000,  128, 64,  192, 7, 3,  20}
      28(0x1C)   {1280, 800,  83500,  128, 72,  200, 6, 3,  22}
      39(0x27)   {1360, 768,  85500,  112, 64,  256, 6, 3,  18}
      32(0x20)   {1280, 960,  108000, 112, 96,  312, 3, 1,  36}
      35(0x23)   {1280, 1024, 108000, 112, 48,  248, 3, 1,  38}
      16(0x10)   {1024, 768,  65000,  136, 24,  160, 6, 3,  29}
      42(0x2A)   {1400, 1050, 121750, 144, 88,  232, 4, 3,  32}
      47(0x2F)   {1440, 900,  106500, 152, 80,  232, 6, 3,  25}
      58(0x3A)   {1680, 1050, 146250, 176, 104, 280, 6, 3,  30}
      81(0x51)   {1366, 768,  85500,  143, 70,  213, 3, 3,  24}
      82(0x52)   {1920, 1080, 148500, 44,  88,  80,  5, 4,  36}
      22(0x16)   {1280, 768,  68250,  32,  48,  80,  7, 3,  12}
      41(0x29)   {1400, 1050, 101000, 32,  48,  80,  4, 3,  23}
      57(0x39)   {1680, 1050, 119000, 32,  48,  80,  6, 3,  21}