Mirror with some correction

Dependencies:   mbed FastIO FastPWM USBDevice

TLC5940/TLC5940.h

Committer:
arnoz
Date:
2021-10-01
Revision:
116:7a67265d7c19
Parent:
98:4df3c0f7e707

File content as of revision 116:7a67265d7c19:

// Pinscape Controller TLC5940 interface
//
// Based on Spencer Davis's mbed TLC5940 library.  Adapted for the
// KL25Z and simplified (removes dot correction and status input 
// support).

 
#ifndef TLC5940_H
#define TLC5940_H

#include "NewPwm.h"

// --------------------------------------------------------------------------
// Data Transmission Mode.
//
// NOTE!  This section contains a possible workaround to try if you're 
// having data signal stability problems with your TLC5940 chips.  If
// things are working properly, you can ignore this part.
//
// The software has two options for sending data updates to the chips:
//
// Mode 0:  Send data *during* the grayscale cycle.  This is the default,
// and it's the standard method the chips are designed for.  In this mode, 
// we start sending an update just after then blanking interval that starts 
// a new grayscale cycle.  The timing is arranged so that the update is 
// completed well before the end of the grayscale cycle.  At the next 
// blanking interval, we latch the new data, so the new brightness levels 
// will be shown starting on the next cycle.
//
// Mode 1:  Send data *between* grayscale cycles.  In this mode, we send
// each complete update during a blanking period, then latch the update
// and start the next grayscale cycle.  This isn't the way the chips were
// intended to be used, but it works.  The disadvantage is that it requires
// the blanking interval to be extended long enough for the full data 
// update (192 bits * the number of chips in the chain).  Since the
// outputs are turned off throughout the blanking period, this reduces
// the overall brightness/intensity of the outputs by reducing the duty
// cycle.  The TLC5940 chips can't achieve 100% duty cycle to begin with,
// since they require a brief minimum time in the blanking interval
// between grayscale cycles; however, the minimum is so short that the
// duty cycle is close to 100%.  With the full data transmission stuffed
// into the blanking interval, we reduce the duty cycle further below
// 100%.  With four chips in the chain, a 28 MHz data clock, and a
// 500 kHz grayscale clock, the reduction is about 0.3%.
//
// Mode 0 is the method documented in the manufacturer's data sheet.
// It works well empirically with the Pinscape expansion boards.
//
// So what's the point of Mode 1?  In early testing, with a breadboard 
// setup, I saw some problems with data signal stability, which manifested 
// as sporadic flickering in the outputs.  Switching to Mode 1 improved
// the signal stability considerably.  I'm therefore leaving this code
// available as an option in case anyone runs into similar signal problems
// and wants to try the alternative mode as a workaround.
//
#define DATA_UPDATE_INSIDE_BLANKING  0

#include "mbed.h"


// --------------------------------------------------------------------------
// Some notes on the data transmission design
//
// I spent a while working on using DMA to send the data, thinking that
// this would reduce the CPU load.  But I couldn't get this working
// reliably; there was some kind of timing interaction or race condition
// that caused crashes when initiating the DMA transfer from within the
// blanking interrupt.  I spent quite a while trying to debug it and
// couldn't figure out what was going on.  There are some complications
// involved in using DMA with SPI that are documented in the KL25Z
// reference manual, and I was following those carefully, but I suspect
// that the problem was somehow related to that, because it seemed to
// be sporadic and timing-related, and I couldn't find any software race
// conditions or concurrency issues that could explain it.
//
// I finally decided that I wasn't going to crack that and started looking
// for alternatives, so out of curiosity, I measured the time needed for a 
// synchronous (CPU-driven) SPI send, to see how it would fit into various
// places in the code.  This turned out to be faster than I expected: with
// SPI at 28MHz, the measured time for a synchronous send is about 72us for
// 4 chips worth of GS data (192 bits), which I expect to be the typical
// Expansion Board setup.  For an 8-chip setup, which will probably be 
// about the maximum workable setup, the time would be 144us.  We only have
// to send the data once per grayscale cycle, and each cycle is 11.7ms with 
// the grayscale clock at 350kHz (4096 steps per cycle divided by 350,000 
// steps per second = 11.7ms per cycle), so this is only 1% overhead.  The 
// main loop spends most of its time polling anyway, so we have plenty of 
// cycles to reallocate from idle polling to the sending the data.
//
// The easiest place to do the send is in the blanking interval ISR, but
// I wanted to keep this out of the ISR.  It's only ~100us, but even so,
// it's critical to minimize time in ISRs so that we don't miss other 
// interrupts.  So instead, I set it up so that the ISR coordinates with
// the main loop via a flag:
//
//  - In the blanking interrupt, set a flag ("cts" = clear to send),
//    and arm a timeout that fires 2/3 through the next blanking cycle
//
//  - In the main loop, poll "cts" each time through the loop.  When 
//    cts is true, send the data synchronously and clear the flag.
//    Do nothing when cts is false.
//
// The main loop runs on about a 1.5ms cycle, and 2/3 of the grayscale
// cycle is 8ms, so the main loop will poll cts on average 5 times per
// 8ms window.  That makes it all but certain that we'll do a send in
// a timely fashion on every grayscale cycle.
//
// The point of the 2/3 window is to guarantee that the data send is
// finished before the grayscale cycle ends.  The TLC5940 chips require
// this; data transmission has to be entirely between blanking intervals.
// The main loop and interrupt handler are operating asynchronously
// relative to one another, so the exact phase alignment will vary
// randomly.  If we start a transmission within the 2/3 window, we're
// guaranteed to have at least 3.5ms (1/3 of the cycle) left before
// the next blanking interval.  The transmission only takes ~100us,
// so we're leaving tons of margin for error in the timing - we have
// 34x longer than we need.
//
// The main loop can easily absorb the extra ~100us of overhead without
// even noticing.  The loop spends most of its time polling devices, so
// it's really mostly idle time to start with.  So we're effectively
// reallocating some idle time to useful work.  The chunk of time is
// only about 6% of one loop iteration, so we're not even significantly
// extending the occasional iterations that actually do this work.
// (If we had a 2ms chunk of monolithic work to do, that could start
// to add undesirable latency to other polling tasks.  100us won't.)
//
// We could conceivably reduce this overhead slightly by adding DMA, 
// but I'm not sure it would actually do much good.  Setting up the DMA
// transfer would probably take at least 20us in CPU time just to set
// up all of the registers.  And SPI is so fast that the DMA transfer
// would saturate the CPU memory bus for the 30us or so of the transfer.
// (I have my suspicions that this bus saturation effect might be part
// of the problem I was having getting DMA working in the first place.)
// So we'd go from 100us of overhead per cycle to at maybe 50us per 
// cycle.  We'd also have to introduce some concurrency controls to the 
// output "set" operation that we don't need with the current scheme 
// (because it's synchronous).  So overall I think the current
// synchronous approach is almost as good in terms of performance as 
// an asynchronous DMA setup would be, and it's a heck of a lot simpler
// and seems very reliable.
//
// --------------------------------------------------------------------------


/**
  * SPI speed used by the mbed to communicate with the TLC5940
  * The TLC5940 supports up to 30Mhz.  It's best to keep this as
  * high as possible, since a higher SPI speed yields a faster 
  * grayscale data update.  However, I've seen some slight
  * instability in the signal in my breadboard setup using the
  * full 30MHz, so I've reduced this slightly, which seems to
  * yield a solid signal.  The limit will vary according to how
  * clean the signal path is to the chips; you can probably crank
  * this up to full speed if you have a well-designed PCB, good
  * decoupling capacitors near the 5940 VCC/GND pins, and short
  * wires between the KL25Z and the PCB.  A short, clean path to
  * KL25Z ground seems especially important.
  *
  * The SPI clock must be fast enough that the data transmission
  * time for a full update is comfortably less than the blanking 
  * cycle time.  The grayscale refresh requires 192 bits per TLC5940 
  * in the daisy chain, and each bit takes one SPI clock to send.  
  * Our reference setup in the Pinscape controller allows for up to 
  * 4 TLC5940s, so a full refresh cycle on a fully populated system 
  * would be 768 SPI clocks.  The blanking cycle is 4096 GSCLK cycles.  
  *
  *   t(blank) = 4096 * 1/GSCLK_SPEED
  *   t(refresh) = 768 * 1/SPI_SPEED
  *   Therefore:  SPI_SPEED must be > 768/4096 * GSCLK_SPEED
  *
  * Since the SPI speed can be so high, and since we want to keep
  * the GSCLK speed relatively low, the constraint above simply
  * isn't a factor.  E.g., at SPI=30MHz and GSCLK=500kHz, 
  * t(blank) is 8192us and t(refresh) is 25us.
  */
#define SPI_SPEED 28000000

/**
  * The rate at which the GSCLK pin is pulsed.   This also controls 
  * how often the reset function is called.   The reset function call
  * interval in seconds is (4096/GSCLK_SPEED).  The maximum reliable 
  * rate is around 32Mhz.  It's best to keep this rate as low as 
  * possible:  the higher the rate, the higher the refresh() call 
  * frequency, so the higher the CPU load.  Higher frequencies also 
  * make it more challenging to wire the chips for clean signal 
  * transmission.  Lower clock speeds are more forgiving of wiring
  * quality.
  *
  * The lower bound depends on the application.  For driving lights,
  * the limiting factor is flicker: the lower the rate, the more
  * noticeable the flicker.  Incandescents tend to look flicker-free
  * at about 50 Hz (205 kHz grayscale clock).  LEDs need significantly 
  * faster rates than incandescents, since they don't have the thermal
  * lag of incandescents; for flicker-free LEDs, you usually need at
  * least 200Hz (GSCLK_SPEED 819200).
  */
#define GSCLK_SPEED    350000

class TLC5940
{
public:
    /**
      *  Set up the TLC5940
      *
      *  @param SCLK - The SCK pin of the SPI bus
      *  @param MOSI - The MOSI pin of the SPI bus
      *  @param GSCLK - The GSCLK pin of the TLC5940(s)
      *  @param BLANK - The BLANK pin of the TLC5940(s)
      *  @param XLAT - The XLAT pin of the TLC5940(s)
      *  @param nchips - The number of TLC5940s (if you are daisy chaining)
      */
    TLC5940(PinName SCLK, PinName MOSI, PinName GSCLK, PinName BLANK, PinName XLAT, int nchips)
        : spi(MOSI, NC, SCLK),
          gsclk(GSCLK),
          blank(BLANK, 1),
          xlat(XLAT),
          nchips(nchips)
    {
        // start up initially disabled
        enabled = false;
        
        // set XLAT to initially off
        xlat = 0;
        
        // Assert BLANK while starting up, to keep the outputs turned off until
        // everything is stable.  This helps prevent spurious flashes during startup.
        // (That's not particularly important for lights, but it matters more for
        // tactile devices.  It's a bit alarming to fire a replay knocker on every
        // power-on, for example.)
        blank = 1;
        
        // Configure SPI format and speed.  The KL25Z only supports 8-bit mode.
        // We nominally need to write the data in 12-bit chunks for the TLC5940
        // grayscale levels, but SPI is ultimately just a bit-level serial format,
        // so we can reformat the 12-bit blocks into 8-bit bytes to fit the 
        // KL25Z's limits.  This should work equally well on other microcontrollers 
        // that are more flexible.  The TLC5940 requires polarity/phase format 0.
        spi.format(8, 0);
        spi.frequency(SPI_SPEED);
        
        // Send out a full data set to the chips, to clear out any random
        // startup data from the registers.  Include some extra bits - there
        // are some cases (such as after sending dot correct commands) where
        // an extra bit per chip is required, and the initial state is 
        // unpredictable, so send extra bits to make sure we cover all bases.  
        // This does no harm; extra bits just fall off the end of the daisy 
        // chain, and since we want all registers initially set to 0, we can 
        // send arbitrarily many extra 0's.
        for (int i = 0 ; i < nchips*25 ; ++i)
            spi.write(0x00);
            
        // do an initial XLAT to latch all of these "0" values into the
        // grayscale registers
        xlat = 1;
        xlat = 0;

        // Allocate our SPI buffer.  The transfer on each cycle is 192 bits per
        // chip = 24 bytes per chip.
        spilen = nchips*24;
        spibuf = new uint8_t[spilen];
        memset(spibuf, 0x00, spilen);
        
        // Configure the GSCLK output's frequency
        gsclk.getUnit()->period(1.0f/GSCLK_SPEED);
        
        // we're not yet ready to send new data to the chips
        cts = false;
        
        // we don't need an XLAT signal until we send data
        needXlat = false;
    }
     
    // Global enable/disble.  When disabled, we assert the blanking signal
    // continuously to keep all outputs turned off.  This can be used during
    // startup and sleep mode to prevent spurious output signals from
    // uninitialized grayscale registers.  The chips have random values in
    // their internal registers when power is first applied, so we have to 
    // explicitly send the initial zero levels after power cycling the chips.
    // The chips might not have power even when the KL25Z is running, because
    // they might be powered from a separate power supply from the KL25Z
    // (the Pinscape Expansion Boards work this way).  Global blanking helps
    // us start up more cleanly by suppressing all outputs until we can be
    // reasonably sure that the various chip registers are initialized.
    void enable(bool f)
    {
        // note the new setting
        enabled = f;
        
        // If disabled, apply blanking immediately.  If enabled, do nothing
        // extra; we'll drop the blanking signal at the end of the next 
        // blanking interval as normal.
        if (!f)
        {
            // disable interrupts, since the blanking interrupt writes gsclk too
            __disable_irq();
        
            // turn off the GS clock and assert BLANK to turn off all outputs
            gsclk.glitchFreeWrite(0);
            wait_us(3);
            blank = 1;

            // done messing with shared data
            __enable_irq();
        }
        
    }
    
    // Start the clock running
    void start()
    {        
        // Set up the first call to the reset function, which asserts BLANK to
        // end the PWM cycle and handles new grayscale data output and latching.
        // The original version of this library used a timer to call reset
        // periodically, but that approach is somewhat problematic because the
        // reset function itself takes a small amount of time to run, so the
        // *actual* cycle is slightly longer than what we get from counting
        // GS clocks.  Running reset on a timer therefore causes the calls to
        // slip out of phase with the actual full cycles, which causes 
        // premature blanking that shows up as visible flicker.  To get the
        // reset cycle to line up more precisely with a full PWM cycle, it
        // works better to set up a new timer at the end of each cycle.  That
        // organically accounts for the time spent in the interrupt handler.
        // This doesn't result in perfectly uniform timing, since interrupt
        // latency varies slightly on each interrupt, but it does guarantee
        // that the blanking will never be premature - all variation will go
        // into the tail end of the cycle after the 4096 GS clocks.  That
        // might cause some brightness variation, but it won't cause flicker,
        // and in practice any brightness variation from this seems to be too 
        // small to be visible.
        armReset();
    }
    
    // stop the timer
    void stop()
    {
        disarmReset();
    }
    
     /*
      *  Set an output.  'idx' is the output index: 0 is OUT0 on the first
      *  chip, 1 is OUT1 on the first chip, 16 is OUT0 on the second chip
      *  in the daisy chain, etc.  'data' is the brightness value for the
      *  output, 0=off, 4095=full brightness.
      */
    void set(int idx, unsigned short data) 
    {
        // validate the index
        if (idx >= 0 && idx < nchips*16)
        {
#if DATA_UPDATE_INSIDE_BLANKING
            // If we send data within the blanking interval, turn off interrupts while 
            // modifying the buffer, since the send happens in the interrupt handler.
            __disable_irq();
#endif

            // Figure the SPI buffer location of the output we're changing.  The SPI
            // buffer has the packed bit format that we send across the wire, with 12 
            // bits per output, arranged from last output to first output (N = number 
            // of outputs = nchips*16):
            //
            //       byte 0  =  high 8 bits of output N-1
            //            1  =  low 4 bits of output N-1 | high 4 bits of output N-2
            //            2  =  low 8 bits of N-2
            //            3  =  high 8 bits of N-3
            //            4  =  low 4 bits of N-3 | high 4 bits of N-2
            //            5  =  low 8bits of N-4
            //           ...
            //  24*nchips-3  =  high 8 bits of output 1
            //  24*nchips-2  =  low 4 bits of output 1 | high 4 bits of output 0
            //  24*nchips-1  =  low 8 bits of output 0
            //
            // So this update will affect two bytes.  If the output number if even, we're
            // in the high 4 + low 8 pair; if odd, we're in the high 8 + low 4 pair.
            int di = nchips*24 - 3 - (3*(idx/2));
            if (idx & 1)
            {
                // ODD = high 8 | low 4
                spibuf[di]    = uint8_t((data >> 4) & 0xff);
                spibuf[di+1] &= 0x0F;
                spibuf[di+1] |= uint8_t((data << 4) & 0xf0);
            }
            else
            {
                // EVEN = high 4 | low 8
                spibuf[di+1] &= 0xF0;
                spibuf[di+1] |= uint8_t((data >> 8) & 0x0f);
                spibuf[di+2]  = uint8_t(data & 0xff);
            }

#if DATA_UPDATE_INSIDE_BLANKING
            // re-enable interrupts
            __enable_irq();
#endif
        }
    }
    
    // Send updates if ready.  Our top-level program's main loop calls this on
    // every iteration.  This lets us send grayscale updates to the chips in
    // regular application context (rather than in interrupt context), to keep
    // the time in the ISR as short as possible.  We return immediately if
    // we're not within the update window or we've already sent updates for
    // the current cycle.
    void send()
    {
        // if we're in the transmission window, send the data
        if (cts)
        {
            // Write the data to the SPI port.  Note that we go directly
            // to the hardware registers rather than using the mbed SPI
            // class, because this makes the operation about 50% faster.
            // The mbed class checks for input on every byte in case the
            // SPI connection is bidirectional, but for this application
            // it's strictly one-way, so we can skip checking for input 
            // and just blast bits to the output register as fast as 
            // it'll take them.  Before writing the output register 
            // ("D"), we have to check the status register ("S") and see
            // that the Transmit Empty Flag (SPTEF) is set.  The 
            // procedure is: spin until SPTEF is set in "S", write the 
            // next byte to "D", loop until out of bytes.
            uint8_t *p = spibuf;
            for (int i = spilen ; i > 0 ; --i) {
                while (!(SPI0->S & SPI_S_SPTEF_MASK)) ;
                SPI0->D = *p++;
            }
        
            // we've sent new data, so we need an XLAT signal to latch it
            needXlat = true;
            
            // done - we don't need to send again until the next GS cycle
            cts = false;
        }
    }

private:
    // SPI port.  This is master mode, output only, so we only assign the MOSI 
    // and SCK pins.
    SPI spi;

    // SPI transfer buffer.  This contains the live grayscale data, formatted
    // for direct transmission to the TLC5940 chips via SPI.
    uint8_t *volatile spibuf;
    
    // Length of the SPI buffer in bytes.  The native data format of the chips
    // is 12 bits per output = 1.5 bytes.  There are 16 outputs per chip, which
    // comes to 192 bits == 24 bytes per chip.
    uint16_t spilen;
    
    // Dirty: true means that the non-live buffer has new pending data.  False means
    // that the non-live buffer is empty.
    volatile bool dirty;
    
    // Enabled: this enables or disables all outputs.  When this is true, we assert the
    // BLANK signal continuously.
    bool enabled;
    
    // use a PWM out for the grayscale clock - this provides a stable
    // square wave signal without consuming CPU
    NewPwmOut gsclk;

    // Digital out pins used for the TLC5940
    DigitalOut blank;
    DigitalOut xlat;
    
    // number of daisy-chained TLC5940s we're controlling
    int nchips;

    // Timeout to end each PWM cycle.  This is a one-shot timer that we reset
    // on each cycle.
    Timeout resetTimer;
    
    // Timeout to end the data window for the PWM cycle.
    Timeout windowTimer;
    
    // "Clear To Send" flag: 
    volatile bool cts;
    
    // Do we need an XLAT signal on the next blanking interval?
    volatile bool needXlat;
        
    // Reset the grayscale cycle and send the next data update
    void reset()
    {
        // start the blanking cycle
        startBlank();
        
        // we're now clear to send the new GS data
        cts = true;
        
#if DATA_UPDATE_INSIDE_BLANKING
        // We're configured to send the new GS data inline during each 
        // blanking cycle.  Send it now.
        send();
#else
        // We're configured to send GS data during the GS cycle.  This means
        // we can defer the GS data transmission to any point within the next
        // GS cycle, which will last about 12ms (assuming a 350kHz GS clock).
        // That's a ton of time given that our GS transmission only takes about
        // 100us.  With such a leisurely time window to work with, we can move
        // the transmission out of the ISR context and into regular application
        // context, which is good because it greatly reduces the time we spend 
        // in this ISR, which is good in turn because more ISR time means more 
        // latency for other interrupts and more chances to miss interrupts
        // entirely.  
        //
        // The mechanism for deferring the transmission to application context
        // is simple.  The main program loop periodically polls the "cts" flag
        // and transmits the data if it finds "cts" set.  To conform to the
        // hardware spec for the TLC5940 chips, the data transmission has to
        // finish before the next blanking interval.  This means our time 
        // window to do the transmission is the 12ms of the grayscale cycle 
        // minus the ~100us to do the transmission.  So basically 12ms.  
        // Timing is never exact on the KL25Z, though, so we should build in
        // a little margin for error.  To be conservative, we'll say that the 
        // update must begin within the first 2/3 of the grayscale cycle time.
        // That's an 8ms window, and leaves a 4ms margin of error.  It's
        // almost inconceivable that any of the timing factors would be 
        // outside of those bounds.
        //
        // To coordinate this 2/3-of-a-cycle window with the main loop, set
        // up a timeout to clear the "cts" flag 2/3 into the cycle time.  If
        // for some reason the main loop doesn't do the transmission before
        // this timer fires, it'll see the "cts" flag turned off and won't
        // attempt the transmission on this round.  (That should essentially
        // never happen, but it wouldn't be a problem even if it happened with
        // some regularity, because we'd just transmit the data on the next
        // cycle.)
        windowTimer.attach_us(this, &TLC5940::closeSendWindow, 
            uint32_t((1.0f/GSCLK_SPEED)*4096.0f*2.0f/3.0f*1000000.0f));
#endif

        // end the blanking interval
        endBlank();

        // re-arm the reset handler for the next blanking interval
        armReset();
    }
    
    // End the data-send window.  This is a timeout routine that fires halfway
    // through each grayscale cycle.  The TLC5940 chips allow new data to be
    // sent at any time during the grayscale pulse cycle, but the transmission
    // has to fit into this window.  We do these transmissions from the main loop,
    // so that they happen in application context rather than interrupt context,
    // but this means that we have to synchronize the main loop activity to the
    // grayscale timer cycle.  To make sure the transmission is done before the
    // next grayscale cycle ends, we only allow the transmission to start for
    // the first 2/3 of the cycle.  This gives us plenty of time to send the
    // data and plenty of padding to make sure we don't go too late.  Consider
    // the relative time periods: we run the grayscale clock at 350kHz, and each
    // grayscale cycle has 4096 steps, so each cycle takes 11.7ms.  For the
    // typical Expansion Board setup with 4 TLC5940 chips, we have 768 bits 
    // to send via SPI at 28 MHz, which nominally takes 27us.  The actual
    // measured time to send 768 bits via send() is 72us, so there's CPU overhead 
    // of about 2.6x.  The biggest workable Expnasion Board setup would probably 
    // be around 8 TLC chips, so we'd have twice the bits and twice the 
    // transmission time of our 4-chip scenario, so the send time would be
    // about 150us.  2/3 of the grayscale cycle gives us an 8ms window to 
    // perform a 150us operation.  The main loop runs about every 1.5ms, so 
    // we're all but certain to poll CTS more than once during each 8ms window.  
    // Even if we start at the very end of the window, we still have about 3.5ms 
    // to finish a <150us operation, so we're all but certain to finish in time.
    void closeSendWindow() 
    { 
        cts = false; 
    }
    
    // arm the reset handler - this fires at the end of each GS cycle    
    void armReset()
    {
        resetTimer.attach_us(this, &TLC5940::reset, 
            uint32_t((1.0/GSCLK_SPEED)*4096.0*1000000.0f));
    }
    
    void disarmReset()
    {
        resetTimer.detach();
    }

    void startBlank()
    {
        // turn off the grayscale clock
        gsclk.glitchFreeWrite(0);
        
        // Make sure the gsclk cycle ends, since the TLC5940 data sheet
        // says we can't take BLANK high until GSCLK has been low for 20ns.
        // (We don't have to add any padding for the 20ns, since it'll take
        // at least one CPU cycle of 60ns to return from waitEndCycle().
        // That routine won't return until GSCLK is low, so it will have
        // low for at least 60ns by the time we get back from this call.)
        gsclk.waitEndCycle();
        
        // assert BLANK to end the grayscale cycle
        blank = 1;
    }
            
    void endBlank()
    {
        // if we've sent new grayscale data since the last blanking
        // interval, latch it by asserting XLAT
        if (needXlat)
        {
            // latch the new data while we're still blanked
            xlat = 1;
            xlat = 0;
            needXlat = false;
        }

        // End the blanking interval and restart the grayscale clock.  Note
        // that we keep the blanking on if the chips are globally disabled.
        if (enabled)
        {
            blank = 0;
            gsclk.write(.5);
        }
    }
};
 
#endif