USI Serial UART Receive on ATtiny

Atmel USI Block Diagram

Many ATtiny microprocessors don’t include a hardware UART, but do include a Universal Serial Interface, USI. The USI module can be used to implement SPI, TWI (also known as I2C) and UART serial interfaces. This post describes how to implement a simple UART receiver using the USI module.

Arduino Software Serial Library

The Arduino core for ATtiny includes a Software Serial library which implements a serial UART interface. This does not take advantage of the USI module and requires 2K of flash. This is probably your best option if you are using Arduino and have plenty of flash available. Using the USI allows you to use faster baud rates and fewer microcontroller resources; the approach below uses less than 1K of flash.

Atmel app note AVR307

Atmel describe how to use the USI to implement a serial UART for an ATtiny26 in app note AVR307. This provides great information and is worth a read. They also provide the source code at www.atmel.com/images/AVR307.zip.

We will be targeting the ATtiny25, ATtiny45 and ATtiny85 in the code below.

Borrowing Timer/Counter0

The USI module uses Timer/Counter0 which is also used by the hilowtech Arduino core to keep track of time. So if you are using Arduino you will need to borrow Timer/Counter0 and give it back. I’ve written a separate post describing how to do this: Borrowing an Arduino timer.

Calibrating the internal oscillator

You can use an external oscillator or the internal oscillator. In my experience the internal oscillator is factory calibrated accurately enough that you probably won’t need to do any user tuning.

But if you need to tune your internal oscillator, then take a look at the following post: Tuning ATtiny internal oscillator.

UART input signal

A serial UART packet consists of a start bit, 5 to 9 bits of data, an optional parity bit and one or two stop bits. But we will just consider the typical configuration of 8 data bits and no parity bit.

When the input line is idle and no data is being transmitted it is held high. You will want to include a pull up resistor in your circuit or enable pull up on the pin so that a floating input doesn’t trigger a false start.

The start of a byte is indicated by pulling the input line low for one bit width. This is immediately followed by each data bit with low representing zero and high representing one with the least significant bit transmitted first. This is then followed by one or two stop bits which are high.
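To make the frame layout concrete, here is a minimal sketch that lists the line level for each bit period of one frame with 8 data bits, no parity and one stop bit. The helper name uart_frame_8n1 is my own invention for illustration and is not part of the receiver code.

```c
#include <stdint.h>

/* Hypothetical helper: fill out[] with the line level (0 = low, 1 = high)
   for each bit period of one 8N1 frame, in transmission order. */
void uart_frame_8n1(uint8_t byte, uint8_t out[10])
{
    out[0] = 0;                           /* Start bit: line pulled low */
    for (uint8_t i = 0; i < 8; i++)
        out[1 + i] = (byte >> i) & 1;     /* Data bits, least significant first */
    out[9] = 1;                           /* One stop bit: line high */
}
```

For the byte 0x41 (binary 01000001) the levels on the wire are 0 1 0 0 0 0 0 1 0 1, reading left to right in time.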

UART Oscilloscope trace

Note how the bits appear in the reverse order in the oscilloscope trace above.

Timing a bit width

We are going to use Timer/Counter0 to time a bit width. This is an 8 bit timer, so its maximum value is 255. We will need to use the Prescaler to adjust the clock input to Timer/Counter0 so that the number of ticks is 255 or less.

We need to choose the right Prescaler value for the regular ATtiny clock speeds of 1MHz, 8MHz and 16MHz and a range of common UART baud rates such as 9600, 14400, 28800, 57600, 115200 and 230400. For example 9600 baud is about 1667 CPU cycles per bit at 16MHz, which is 208.33 timer ticks when divided by 8. Rounding down to 208 whole ticks gives a drift of about 0.16%.

When setting the baud rate and CPU clock speed you will want the timer to be as accurate as possible, keeping drift below 5%. So higher baud rates won’t be available at lower CPU speeds. For example 230400 baud with a 1MHz clock is 4.34 CPU cycles (1000000/230400), which is just 4 whole CPU cycles. Each bit is then timed 0.34 cycles short, a drift of about 7.8% per bit. And that’s before we take code execution time into account.
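If you want to check a clock speed and baud rate combination before committing to it, the arithmetic is easy to run on a PC. The following is a rough sketch; the function names ticks_per_bit and bit_drift_percent are hypothetical, and the truncation mirrors what integer division does in the macros used later.

```c
#include <stdint.h>

/* Hypothetical helper: whole timer ticks per bit for a given CPU clock,
   baud rate and prescaler divisor, truncated like integer division. */
uint32_t ticks_per_bit(uint32_t f_cpu, uint32_t baud, uint32_t divisor)
{
    return (f_cpu / baud) / divisor;
}

/* Percentage by which the truncated tick count falls short of the ideal
   (fractional) bit width. */
double bit_drift_percent(uint32_t f_cpu, uint32_t baud, uint32_t divisor)
{
    double ideal  = (double)f_cpu / (double)baud / (double)divisor;
    double actual = (double)ticks_per_bit(f_cpu, baud, divisor);
    return (ideal - actual) / ideal * 100.0;
}
```

This only accounts for rounding in the timer; it says nothing about oscillator accuracy or code execution time.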

We define the clock speed, F_CPU, and the baud rate. Note that F_CPU is already defined in Arduino and may already be defined in your development environment, where it’s normally defined as a build symbol.

#define F_CPU               8000000
#define BAUDRATE            9600

We can use these to calculate the number of CPU clock cycles per bit width.

#define CYCLES_PER_BIT      ( F_CPU / BAUDRATE )

If this number is 255 or less then we set the clock source to be the CPU clock, otherwise we will use the prescaler to divide the clock input to Timer/Counter0 by 8.

We use the bottom three Clock Select bits of Timer/Counter0 Control Register B, TCCR0B, to configure the prescaler. A Clock Select value of 1 is for the CPU clock and a value of 2 is for CPU clock divided by 8.

#if (CYCLES_PER_BIT > 255)
#define DIVISOR             8
#define PRESCALE            2
#else
#define DIVISOR             1
#define PRESCALE            1
#endif
#define FULL_BIT_TICKS      ( CYCLES_PER_BIT / DIVISOR )
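The two-way choice above covers the common cases, but a slow baud rate on a fast clock can still exceed 255 ticks even after dividing by 8. As a sketch of how the idea generalizes, the function below walks the Timer/Counter0 prescaler options and picks the smallest one that fits. The name pick_clock_select is hypothetical, and the divisor table reflects my reading of the Clock Select table in the datasheet, so check it against your part.

```c
#include <stdint.h>

/* Divisors selected by Timer/Counter0 Clock Select values 1 to 5
   (assumed from the ATtiny25/45/85 datasheet). */
static const uint16_t kDivisors[] = { 1, 8, 64, 256, 1024 };

/* Hypothetical helper: return the Clock Select value (1..5) of the smallest
   prescaler for which one bit width fits in the 8-bit timer, or 0 if none
   does. The matching divisor is written to *divisor. */
uint8_t pick_clock_select(uint32_t cycles_per_bit, uint16_t *divisor)
{
    for (uint8_t i = 0; i < 5; i++) {
        if (cycles_per_bit / kDivisors[i] <= 255) {
            *divisor = kDivisors[i];
            return (uint8_t)(i + 1);
        }
    }
    return 0;   /* Bit width too long for any prescaler setting */
}
```

Keep in mind that a larger divisor gives coarser timing, so the rounding drift per bit tends to grow as the prescaler increases.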

Using the USI

The USI module is optimized for use in either Two Wire I2C or Three Wire SPI mode; it doesn’t have a dedicated mode for UART. However the USI module is flexible enough that we can use an internal clock to trigger the USI to left shift the data bits into the USI data register.

We can use a pin change interrupt to detect the beginning of the start bit and then configure the USI to sample the value of the input pin in the center of each of the eight bits. We can then read back the eight bit values that the USI shifted into the USI register and reverse them to get the original byte value.

Getting setup

ATtiny85 pinout

The first thing we need to do is ensure that the USI data input pin, DI, is enabled for input. This is pin 0 of port B, PB0 in the diagram above.

DDRB &= ~(1 << DDB0);          // Set pin PB0 to input
PORTB |= 1 << PB0;             // Enable internal pull-up on pin PB0

This is equivalent to the following Arduino function call:

pinMode(0, INPUT_PULLUP);

We will also ensure that the USI is disabled at this point.

USICR = 0;                     // Disable USI

Finally we want to enable pin change interrupts on pin PB0 so we can trigger at the beginning of the start bit.

GIMSK |= 1<<PCIE;               // Enable pin change interrupts
PCMSK |= 1<<PCINT0;             // Enable pin change on pin PB0

UART start bit

The pin change interrupt, PCINT0_vect, will fire when any pin on port B with pin change enabled in PCMSK changes from low to high or high to low. We are only interested in a high to low transition, a falling edge, on pin PB0. We will disable pin change on the data input pin while we are reading the byte, so we just need to check whether PB0 is low.

ISR (PCINT0_vect)
{
  uint8_t pinbVal = PINB;
  if (!(pinbVal & 1<<PINB0))   // Trigger if DI is Low
  {
    onSerialPinChange();
  }
}

Move to the middle

To be resilient to any clock drift we will want the USI to take a sample in the middle of each bit. So we need to delay for half a bit width before starting the USI module.

#define HALF_BIT_TICKS      ( FULL_BIT_TICKS / 2 )

We should also account for the number of CPU cycles the interrupt vector and our code introduce between the beginning of the start bit and us starting the USI. I used the Atmel Studio simulator to get numbers for my code.

#define START_DELAY         ( 65 + 42 )
#define TIMER_START_DELAY   ( START_DELAY  / DIVISOR )

First we need to disable pin change interrupts as we don’t want to trigger our start bit interrupt vector while we are reading data bits.

void onSerialPinChange() {
  GIMSK &= ~(1<<PCIE);            // Disable pin change interrupts

We configure Timer/Counter0 to Clear Timer on Compare Match (CTC) mode. We do this by setting the three bits of the Waveform Generation Mode, WGM0, flag to 2 (binary 010). Just for fun, bit 0 and bit 1 are in TCCR0A and bit 2 is in TCCR0B, but as bit 2 is being set to zero you won’t see it explicitly set in the code. The other bits of TCCR0A should be set to zero to indicate normal port operation. The least significant three bits of TCCR0B are used for the Clock Select mode and the rest of the bits (including WGM0 bit 2) should be set to zero.

  TCCR0A = 2<<WGM00;              // CTC mode
  TCCR0B = PRESCALE;              // Set prescaler to cpu clk or clk/8

Then we can reset the prescaler to indicate that we changed its configuration and start Timer/Counter0 at zero.

  GTCCR |= 1 << PSR0;             // Reset prescaler
  TCNT0 = 0;                      // Count up from 0

We store the number of timer ticks to count in Output Compare Register A, OCR0A. There is no code here to check that HALF_BIT_TICKS is actually greater than TIMER_START_DELAY, so be careful with choosing values for F_CPU, BAUDRATE and TIMER_START_DELAY.

  OCR0A = HALF_BIT_TICKS - TIMER_START_DELAY;

Finally we enable the output compare interrupt.

  TIFR = 1 << OCF0A;              // Clear output compare interrupt flag
  TIMSK |= 1<<OCIE0A;             // Enable output compare interrupt
}
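Since nothing in the function checks that the half bit width is longer than the start delay, one option is to let the preprocessor catch a bad combination at build time. This is only a sketch: the values are fixed for 8MHz and 9600 baud with the divide by 8 prescaler, and the #error guard is my addition rather than part of the original code.

```c
#define DIVISOR             8
#define FULL_BIT_TICKS      104     /* 8000000 / 9600 / 8, truncated */
#define HALF_BIT_TICKS      ( FULL_BIT_TICKS / 2 )
#define START_DELAY         ( 65 + 42 )
#define TIMER_START_DELAY   ( START_DELAY / DIVISOR )

/* Added guard: refuse to build if the code overhead is as long as, or
   longer than, half a bit, because OCR0A would then be zero or wrap. */
#if (HALF_BIT_TICKS <= TIMER_START_DELAY)
#error "Start delay exceeds half a bit width: lower BAUDRATE or raise F_CPU"
#endif
```

With these numbers the first compare value works out to 52 - 13 = 39 ticks.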

Start the USI

When we reach the middle of the start bit, the Timer/Counter0 Compare Match A interrupt vector will fire.

ISR (TIMER0_COMPA_vect) {

First we disable the Compare Match A interrupt.

  TIMSK &= ~(1<<OCIE0A);          // Disable COMPA interrupt

We are going to configure the USI to sample every time Timer/Counter0 has counted up from zero to the value in Output Compare Register A, so we initialize the counter to zero and OCR0A to the full bit width.

  TCNT0 = 0;                      // Count up from 0
  OCR0A = FULL_BIT_TICKS;         // Shift every bit width

We configure the USI using the USI Control Register, USICR.

We want to be notified when a byte has been read, so we set Counter Overflow Interrupt Enable, USIOIE, so that the USI_OVF interrupt will be called when the 8 bits have been shifted into the USI register.

We are not using either of the wire modes directly so we set the Wire Mode, USIWM, to zero indicating the automatic wire mode functionality is disabled.

To indicate we are using Timer/Counter0 Compare Match as the clock source we set Clock Source Select, USICS to 1.

  USICR = 1<<USIOIE | 0<<USIWM0 | 1<<USICS0;

Finally we start the USI using the USI Status Register, USISR.

The bottom four bits hold the USI Counter Value (0 to 15) and we want the overflow interrupt to be called after 8 bits have been read. The USI will still shift data into the USI Data Register on the Timer/Counter0 Compare Match trigger that results in an overflow, so we preset USICNT to 8.

We write 1 to the USI Overflow Interrupt Flag, USIOIF,  to clear the flag and ensure the interrupt doesn’t trigger immediately.

  USISR = 1<<USIOIF | 8;
}
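If the preset of 8 seems arbitrary, it helps to model the 4 bit counter on a PC. The sketch below counts how many shift clocks it takes for the counter to wrap from 15 back to 0. Both function names are hypothetical and the model ignores everything about the USI except the counter.

```c
#include <stdint.h>

/* Model of the 4-bit USI counter: number of shift clocks before it wraps
   from 15 to 0 when started at preset. */
uint8_t shifts_until_overflow(uint8_t preset)
{
    uint8_t counter = preset & 0x0F;
    uint8_t shifts = 0;
    do {
        counter = (uint8_t)((counter + 1) & 0x0F);
        shifts++;
    } while (counter != 0);
    return shifts;
}

/* Preset that makes the counter overflow after bits shifts (1 to 16). */
uint8_t usi_counter_preset(uint8_t bits)
{
    return (uint8_t)((16 - bits) & 0x0F);
}
```

For 8 data bits the preset is 16 - 8 = 8, which matches the value written to USISR above.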

Reversing a byte

The bits are going to be in the USI Data Register in reverse order, so we are going to need some code to reverse the bits. Needing to reverse a byte is a common problem so people have spent time producing the most efficient code to do this, so rather than write our own, we just borrow a common approach:

uint8_t ReverseByte (uint8_t x) {
    x = ((x >> 1) & 0x55) | ((x << 1) & 0xaa);
    x = ((x >> 2) & 0x33) | ((x << 2) & 0xcc);
    x = ((x >> 4) & 0x0f) | ((x << 4) & 0xf0);
    return x;
}
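To see why the reversal is needed, we can model the receive path on a PC: the sender emits the least significant bit first, and each sample is shifted into the register from the right. This is a simplified sketch with hypothetical names; reverse_byte repeats the routine above so that the block stands alone.

```c
#include <stdint.h>

/* Model of the receive path: bit i of the sent byte arrives i-th, and the
   register shifts left, taking each sample into bit 0. */
uint8_t usi_shift_in_lsb_first(uint8_t sent)
{
    uint8_t reg = 0;
    for (uint8_t i = 0; i < 8; i++) {
        uint8_t sample = (sent >> i) & 1;
        reg = (uint8_t)((reg << 1) | sample);
    }
    return reg;
}

/* Same swap-adjacent-bits, then pairs, then nibbles reversal as above. */
uint8_t reverse_byte(uint8_t x)
{
    x = (uint8_t)(((x >> 1) & 0x55) | ((x << 1) & 0xaa));
    x = (uint8_t)(((x >> 2) & 0x33) | ((x << 2) & 0xcc));
    x = (uint8_t)(((x >> 4) & 0x0f) | ((x << 4) & 0xf0));
    return x;
}
```

Sending 0x41 leaves 0x82 in the modelled register, and reversing 0x82 gives back 0x41.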

Reading the result

When the USI has sampled 8 bits the USI counter will overflow and the USI Overflow interrupt, USI_OVF, will be called. The first thing we do is grab the received byte from the USI Data Register, USIDR, and disable the USI.

ISR (USI_OVF_vect) {
  uint8_t usiByte = USIDR;
  USICR  =  0;                    // Disable USI

Then you will need to do something with the result. As we are still in an interrupt, you will probably want to store the result in a volatile variable and then read and act on it in your main loop. Below I just call a serialReceived() function with the received data.

  serialReceived(ReverseByte(usiByte));

Now we are done we clear the Pin Change Interrupt Flag, by writing 1 to PCIF in the General Interrupt Flag register, GIFR, so the interrupt doesn’t trigger immediately. And we re-enable pin change interrupts so we are ready to receive the next byte by setting the Pin Change Interrupt Enable bit in the General Interrupt Mask Register, GIMSK.

  GIFR = 1<<PCIF;                 // Clear pin change interrupt flag
  GIMSK |= 1<<PCIE;               // Enable pin change interrupts again
}

We are actually in the middle of the last bit and if this is zero then a pin change interrupt will be called at the beginning of the stop bit, which is high. However we only call onSerialPinChange() if the state of the pin is low, so it will be ignored.

A one byte buffer

A really simple implementation of serialReceived() is to store a single byte, which is read by the main loop. If we receive a new byte before the main loop has read the last one then we just overwrite the value. A simple implementation like this might be fine if we were only using the serial input to set the brightness of an LED, say.

volatile bool serialDataReady = false;
volatile uint8_t serialInput;

void serialReceived(uint8_t data)
{
    serialDataReady = true;
    serialInput = data;
}

bool readSerialData(uint8_t* pData)
{
    if (serialDataReady)
    {
        *pData = serialInput;
        serialDataReady = false;
        return true;
    }
    return false;
}
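If losing bytes is not acceptable, the same two functions could sit on top of a small ring buffer instead. The following is a sketch under a few assumptions of mine: a buffer size of 16 (it must be a power of two for the masking to work), one writer in the interrupt and one reader in the main loop, and new bytes being dropped rather than overwriting old ones when the buffer is full.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUFFER_SIZE 16u   /* Assumed size; must be a power of two */

static volatile uint8_t rxBuffer[RX_BUFFER_SIZE];
static volatile uint8_t rxHead = 0;   /* Written only by the interrupt side */
static volatile uint8_t rxTail = 0;   /* Written only by the main loop */

void serialReceived(uint8_t data)
{
    uint8_t next = (uint8_t)((rxHead + 1u) & (RX_BUFFER_SIZE - 1u));
    if (next != rxTail)               /* Drop the byte if the buffer is full */
    {
        rxBuffer[rxHead] = data;
        rxHead = next;
    }
}

bool readSerialData(uint8_t* pData)
{
    if (rxHead == rxTail)             /* Nothing waiting */
    {
        return false;
    }
    *pData = rxBuffer[rxTail];
    rxTail = (uint8_t)((rxTail + 1u) & (RX_BUFFER_SIZE - 1u));
    return true;
}
```

One slot is always left empty to tell a full buffer from an empty one, so this holds up to 15 bytes.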

The main loop calls readSerialData() whenever it is ready to receive data. If the return value is true then new data is available in the location passed in the pData argument. For example:

unsigned char serialInput;
if (readSerialData(&serialInput))
{
  ledUpdate(serialInput);
}

Get the example code

I have uploaded an Arduino sketch to GitHub at:

https://github.com/MarkOsborne/becomingmaker/tree/master/USISerial

This example sets the PWM duty cycle of PB4 to the value of the byte read from the UART input pin PB0 (DI, which is also the SDA pin). Connect your favorite serial device’s Tx pin to PB0 and an LED (plus current limiting resistor) to PB4 and you can set the brightness of the LED by sending bytes over serial.

USI Serial Send

If you want to transmit serial data from an ATtiny using the USI then take a look at my blog post USI Serial UART Send on ATtiny.

Sources of inspiration:

Thanks to @Atmel for publishing app note AVR307 and the source code at www.atmel.com/images/AVR307.zip.

Thanks to @technoblogy for the post Simple ATtiny USI UART which inspired me to write my own based on AVR307.

And of course the ATtiny25/45/85 Datasheet is an essential resource.

Tuning ATtiny internal oscillator

The internal oscillator of an ATtiny can be inaccurate and might require tuning.

This post discusses how to do that if you have access to an oscilloscope or a frequency counter.

An accurate clock speed is important if you are doing timing critical operations such as serial UART communication.

The internal oscillator

ATtiny microprocessors can use an internal RC oscillator or an external crystal oscillator. External crystal oscillators are more accurate, but require two pins. For these low pin count devices it can be beneficial to use the internal oscillator.

Atmel claim the internal oscillator is factory calibrated to +/-10% at 25 degrees centigrade and three volts. In my experience they are actually calibrated to +/- 1%.

Higher temperatures will increase the clock rate and higher voltages will decrease the clock rate. My ATtiny85 microcontrollers run about 1.1% slower at 5V.

For serial UART you’ll probably need to be within 5% (less than half a bit width drift over one start and 8 data bits), so even a factory calibrated ATtiny running at 9600 baud will work fine even at 5V. However there are times when you might want the internal oscillator more finely tuned.
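The 5% figure comes from a simple ratio: the receiver re-synchronizes on the start edge, so the accumulated error at the last sample point must stay under half a bit. Here is a small sketch of that calculation; the function name is hypothetical, and it ignores other error sources such as the sender’s own clock error and the timer rounding.

```c
/* Largest clock mismatch, in percent, that keeps the final sample within
   half a bit of its ideal position. span_bits is the distance in bit
   widths from the start edge to the final sample point. */
double max_clock_error_percent(double span_bits)
{
    return 0.5 / span_bits * 100.0;
}
```

The middle of the last data bit is 8.5 bit widths after the start edge, which allows roughly 5.9%. If you also want to land inside the stop bit, 9.5 bit widths away, the limit is roughly 5.3%.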

The oscillator calibration register, OSCCAL

Tuning of the internal oscillator is controlled by the oscillator calibration register, OSCCAL. When the microcontroller starts the factory calibration value is automatically loaded into the OSCCAL register. You cannot modify the factory calibration value, but your program can change the OSCCAL register at runtime.

So tuning the oscillator is as simple as adjusting OSCCAL at the beginning of your program. For example I added the following to my code for an ATtiny85 running at 5V.

OSCCAL += 3;

Using a delta adjustment of the factory calibrated value can be better than baking in the actual OSCCAL value if you don’t want to tune each chip individually.

For example I know that Atmel did a good job calibrating the oscillator at 3V, but I will be running at 5V so the oscillator will run slightly slower at that voltage. If adding 3 to OSCCAL works for one chip, it will probably work reasonably well for every chip.
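One thing to watch with a delta is that OSCCAL is an 8 bit register, so adding to a factory value near the top of the range would wrap around to a very low value. A possible guard is to saturate the result, as in this sketch. The function name osccal_with_delta is hypothetical, and I am assuming the full 0 to 255 range is usable on your part.

```c
#include <stdint.h>

/* Hypothetical helper: apply a signed delta to the factory calibration
   value, clamping to 0..255 instead of wrapping around. */
uint8_t osccal_with_delta(uint8_t factory, int8_t delta)
{
    int16_t value = (int16_t)factory + delta;
    if (value < 0)   value = 0;
    if (value > 255) value = 255;
    return (uint8_t)value;
}

/* On the ATtiny this would be used as:
   OSCCAL = osccal_with_delta(OSCCAL, 3);   */
```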

Using EEPROM

For most hobbyists it is sufficient to add an OSCCAL adjustment to our code. Even if you have to change that line of code from chip to chip, you are only working with a handful of chips.

However this would be impractical in a production environment where we could be working with hundreds or thousands of chips.

For a production run, we would want to write the value into EEPROM during calibration and copy it from EEPROM to the OSCCAL register at the beginning of the program.

In Arduino we can use the EEPROM library:

#include <EEPROM.h>
OSCCAL = EEPROM.read(0);

In avr-libc we can use the avr/eeprom.h routines:

#include <avr/eeprom.h>
OSCCAL = eeprom_read_byte((uint8_t*)0x00);
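An erased EEPROM cell reads back as 0xFF, so a chip that has not been through calibration would load 0xFF into OSCCAL with the code above. One way to handle that, sketched below, is to fall back to the factory value. The name pick_osccal is hypothetical, and treating 0xFF as "not calibrated" is my assumption; it means 0xFF itself cannot be stored as a calibration value.

```c
#include <stdint.h>

#define EEPROM_ERASED 0xFF   /* Value read from a cell that was never written */

/* Hypothetical helper: use the stored calibration value unless the EEPROM
   cell still looks erased, in which case keep the factory value. */
uint8_t pick_osccal(uint8_t factory, uint8_t stored)
{
    return (stored == EEPROM_ERASED) ? factory : stored;
}

/* On the ATtiny with avr-libc this would be used as:
   OSCCAL = pick_osccal(OSCCAL, eeprom_read_byte((uint8_t*)0x00));   */
```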

Measuring oscillator accuracy

XMEGA Xprotolab

There are approaches that involve comparing the clock speed to an external clock source, but if you have access to an oscilloscope or a frequency counter then it’s simpler to just measure the clock speed directly.

I have a tiny 1″ $49 Gabotronic XMEGA Xprotolab oscilloscope that does the job nicely.

So then we just need to generate a specific frequency on one of the pins, measure it, adjust OSCCAL up or down accordingly and try again.
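The measure and adjust loop is simple enough to do by hand, but the decision at each step can be written down as a short sketch. The function names and the tolerance parameter are hypothetical, and the sketch assumes that raising OSCCAL raises the frequency, which should be checked against the datasheet for your part because the register may be split into overlapping ranges.

```c
/* Signed clock error in percent: positive means the clock runs fast. */
double clock_error_percent(double measured_hz, double target_hz)
{
    return (measured_hz - target_hz) / target_hz * 100.0;
}

/* Suggested change to OSCCAL: +1 to speed up, -1 to slow down,
   0 when the error is already within tolerance_percent. */
int osccal_step(double measured_hz, double target_hz, double tolerance_percent)
{
    double error = clock_error_percent(measured_hz, target_hz);
    if (error > tolerance_percent)  return -1;
    if (error < -tolerance_percent) return +1;
    return 0;
}
```

For example, a reading of 9890Hz against a 10000Hz target is 1.1% slow, so the suggestion is to raise OSCCAL by one and measure again.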

Using the CKOUT fuse

If you are comfortable setting fuses, then the simplest approach is to set the CKOUT fuse (bit 6 of the LOW fuse register) which will send the clock signal to PB4. Then you can use your oscilloscope or frequency counter to see how close you are to the selected frequency.

I use Atmel Studio and an Atmel-ICE Programmer/Debugger to set this fuse.

Atmel Studio CKOUT fuse

WARNING: You can “brick” your ATtiny by setting the fuses incorrectly, for example by selecting an external crystal when you don’t have one connected. In the worst case you could make it impossible to reprogram your ATtiny without a special high voltage programmer.

A safer approach

If you don’t want to mess with fuses, then an alternative is to load a small program that oscillates a pin at a measurable frequency. I like to use 10kHz as it’s a nice round number and fast enough for calibration.

We can do this with a PWM output or with a Timer/Counter compare match. The PWM approach is slightly less code, but I am going to use the Timer/Counter approach as I think the code is slightly easier to understand.

The following program uses Timer/Counter0 Comparator A to generate a 10kHz square wave with a 50% duty cycle on pin PB4 for an ATtiny25/45/85. It generates 10kHz whether the ATtiny is running directly off the internal oscillator at 8MHz or is running at 1MHz because the CKDIV8 fuse is set.

Using Arduino:

// Timer/Counter0 Compare Match A interrupt handler 
ISR (TIMER0_COMPA_vect) {
   PORTB ^= 1 << PINB4;        // Invert pin PB4
}
 
void setup() {
    OSCCAL += 3;                // User calibration
    pinMode(4,OUTPUT);          // Set PB4 to output
    TCNT0 = 0;                  // Count up from 0
    TCCR0A = 2 << WGM00;        // CTC mode
    if (CLKPR == 3)             // If clock set to 1MHz
        TCCR0B = (1<<CS00);     // Set prescaler to /1 (1uS at 1Mhz)
    else                        // Otherwise clock set to 8MHz
        TCCR0B = (2<<CS00);     // Set prescaler to /8 (1uS at 8Mhz)
    GTCCR |= 1 << PSR0;         // Reset prescaler
    OCR0A = 49;                 // 49 + 1 = 50 microseconds (10KHz)
    TIFR = 1 << OCF0A;          // Clear output compare interrupt flag
    TIMSK |= 1 << OCIE0A;       // Enable output compare interrupt
}
 
void loop() {}

Using avr-libc:

 #include <avr/interrupt.h>
 
 // Timer/Counter0 Comparator A interrupt vector
 ISR (TIMER0_COMPA_vect) {
     PORTB ^= 1<< PINB4;         // Invert pin PB4
 }
 
 int main(void)
 {
     OSCCAL += 3;                // User calibration
     DDRB = 1 << PINB4;          // Set PB4 to output
     TCNT0 = 0;                  // Count up from 0
     TCCR0A = 2<<WGM00;          // CTC mode
     if (CLKPR == 3)             // If clock set to 1MHz
         TCCR0B = (1<<CS00);     // Set prescaler to /1 (1uS at 1Mhz)
     else                        // Otherwise clock set to 8MHz
         TCCR0B = (2<<CS00);     // Set prescaler to /8 (1uS at 8Mhz)
     GTCCR |= 1 << PSR0;         // Reset prescaler
     OCR0A = 49;                 // 49 + 1 = 50 microseconds (10KHz)
     TIFR = 1 << OCF0A;          // Clear output compare interrupt flag
     TIMSK |= 1<<OCIE0A;         // Enable output compare interrupt
     sei();                      // Enable global interrupts
     
     while (1) {}
}
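The compare value of 49 in both programs follows from the fact that the pin is toggled once per compare match, so one full period of the square wave takes two matches. If you would rather calibrate against a different frequency, the value can be worked out as in this sketch. The function name is hypothetical and it assumes the timer clock divides evenly by twice the output frequency.

```c
#include <stdint.h>

/* Hypothetical helper: OCR0A value for a CTC timer that toggles a pin on
   every compare match, given the timer clock and the wanted output
   frequency. The timer counts OCR0A + 1 ticks per match. */
uint32_t ctc_compare_for_square_wave(uint32_t timer_hz, uint32_t out_hz)
{
    return timer_hz / (2u * out_hz) - 1u;
}
```

With the timer running at 1MHz, a 10kHz output gives 1000000 / 20000 - 1 = 49, and a 5kHz output would need 99.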