USI Serial UART Receive on ATtiny

Atmel USI Block Diagram

Many ATtiny microprocessors don’t include a hardware UART, but do include a Universal Serial Interface, USI. The USI module can be used to implement SPI, TWI (also known as I2C) and UART serial interfaces. This post describes how to implement a simple UART receiver using the USI module.

Arduino Software Serial Library

The Arduino core for ATtiny includes a Software Serial library which implements a serial UART interface. It does not take advantage of the USI module and requires 2K of flash. It is probably your best option if you are using Arduino and have plenty of flash available. Using the USI allows faster baud rates and uses fewer microcontroller resources; the approach below uses less than 1K of flash.

Atmel app note AVR307

Atmel describes how to use the USI to implement a serial UART for an ATtiny26 in app note AVR307. This provides great information and is worth a read. They also provide the source code at www.atmel.com/images/AVR307.zip.

We will be targeting the ATtiny25, ATtiny45 and ATtiny85 in the code below.

Borrowing Timer/Counter0

The USI module uses Timer/Counter0 which is also used by the hilowtech Arduino core to keep track of time. So if you are using Arduino you will need to borrow Timer/Counter0 and give it back. I’ve written a separate post describing how to do this: Borrowing an Arduino timer.

Calibrating the internal oscillator

You can use an external oscillator or the internal oscillator. In my experience the internal oscillator is factory calibrated accurately enough that you probably won’t need to do any user tuning.

But if you need to tune your internal oscillator, then take a look at the following post: Tuning ATtiny internal oscillator.

UART input signal

A serial UART packet consists of a start bit, 5 to 9 bits of data, an optional parity bit and one or two stop bits. But we will just consider the typical configuration of 8 data bits and no parity bit.

When the input line is idle and no data is being transmitted, it is held high. You will want to include a pull-up resistor in your circuit or enable the pull-up on the pin so that a floating input doesn’t trigger a false start.

The start of a byte is indicated by pulling the input line low for one bit width. This is immediately followed by each data bit with low representing zero and high representing one with the least significant bit transmitted first. This is then followed by one or two stop bits which are high.
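The frame layout described above can be sketched as a small host-side helper. This is only an illustration of the 8 data bits, no parity, one stop bit case; the name `uart_frame_8n1` and the `FRAME_BITS` constant are made up for this example and are not part of the receiver code.

```c
#include <stdint.h>

/* Bit periods in one frame: 1 start + 8 data + 1 stop (assumed 8N1). */
#define FRAME_BITS 10

/* Fill levels[] with the line level (0 = low, 1 = high) for each bit
   period of a frame carrying `data`. Hypothetical helper for illustration. */
void uart_frame_8n1(uint8_t data, uint8_t levels[FRAME_BITS])
{
    levels[0] = 0;                          /* start bit pulls the line low */
    for (uint8_t i = 0; i < 8; i++) {
        levels[1 + i] = (data >> i) & 1;    /* data bits, least significant first */
    }
    levels[9] = 1;                          /* stop bit returns the line high */
}
```

Reading `levels[]` from index 0 upwards gives the order in which the levels appear on the wire.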

UART Oscilloscope trace

Note how the bits appear in the reverse order in the oscilloscope trace above.

Timing a bit width

We are going to use Timer/Counter0 to time a bit width. This is an 8 bit timer, so its maximum value is 255. We will need to use the Prescaler to adjust the clock input to Timer/Counter0 so that the number of ticks is 255 or less.

We need to choose the right Prescaler value for the regular ATtiny clock speeds of 1MHz, 8MHz and 16MHz and a range of common UART baud rates such as 9600, 14400, 28800, 57600, 115200 and 230400. For example, 9600 baud is 1666.67 CPU cycles at 16MHz, which is 208.33 timer ticks when divided by 8. Rounding down to 208 whole ticks gives a drift of about 0.16%.

When setting the baud rate and CPU clock speed you will want the timer to be as accurate as possible, keeping drift below 5%. So higher baud rates won’t be available at lower CPU speeds. For example, 230400 baud with a 1MHz clock is 4.34 CPU cycles (1000000/230400), which is just 4 whole CPU cycles with a drift of about 8%. And that’s before we take code execution time into account.
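To check a clock and baud rate combination before building, the arithmetic above can be sketched as two host-side functions. The names `bit_ticks` and `bit_drift_percent` are hypothetical, and drift here is taken to mean the fraction of a bit period lost by rounding down to whole timer ticks.

```c
#include <stdint.h>

/* Whole timer ticks per bit for a given CPU clock, baud rate and
   prescaler divisor (integer division, so the result is rounded down). */
uint32_t bit_ticks(uint32_t f_cpu, uint32_t baud, uint32_t divisor)
{
    return f_cpu / baud / divisor;
}

/* Per-bit timing error, in percent of the exact bit width, caused by
   using a whole number of ticks. Code execution time is not included. */
double bit_drift_percent(uint32_t f_cpu, uint32_t baud, uint32_t divisor)
{
    double exact = (double)f_cpu / (double)baud / (double)divisor;
    double whole = (double)bit_ticks(f_cpu, baud, divisor);
    return (exact - whole) / exact * 100.0;
}
```

Calling `bit_drift_percent(16000000, 9600, 8)` or `bit_drift_percent(1000000, 230400, 1)` reproduces the two cases discussed above.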

We define the clock speed, F_CPU, and the baud rate. Note that F_CPU is already defined in Arduino and may already be defined in your development environment; it’s normally defined as a build symbol.

#define F_CPU               8000000
#define BAUDRATE            9600

We can use these to calculate the number of CPU clock cycles per bit width.

#define CYCLES_PER_BIT      ( F_CPU / BAUDRATE )

If this number is 255 or less then we set the clock source to be the CPU clock, otherwise we will use the prescaler to divide the clock input to Timer/Counter0 by 8.

We use the bottom three Clock Select bits of Timer/Counter0 Control Register B, TCCR0B, to configure the prescaler. A Clock Select value of 1 is for the CPU clock and a value of 2 is for CPU clock divided by 8.

#if (CYCLES_PER_BIT > 255)
#define DIVISOR             8
#define PRESCALE            2
#else
#define DIVISOR             1
#define PRESCALE            1
#endif
#define FULL_BIT_TICKS      ( CYCLES_PER_BIT / DIVISOR )
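The macros above only choose between the CPU clock and the CPU clock divided by 8. As a sketch of how the same idea extends, the function below walks through all five Timer/Counter0 prescaler settings listed in the ATtiny25/45/85 datasheet (Clock Select values 1 to 5 for divisors 1, 8, 64, 256 and 1024). The struct and function names are hypothetical and this runtime version is not what the sketch in this post uses.

```c
#include <stdint.h>

typedef struct {
    uint8_t  clock_select;  /* value for the Clock Select bits of TCCR0B */
    uint16_t divisor;       /* prescaler divisor that value selects */
    uint8_t  ticks;         /* whole timer ticks per bit width */
} timer0_config;

/* Pick the smallest divisor for which one bit width fits in the 8 bit
   timer. Returns 1 and fills *cfg on success, 0 if nothing fits. */
int timer0_choose(uint32_t f_cpu, uint32_t baud, timer0_config *cfg)
{
    static const uint16_t divisors[5] = { 1, 8, 64, 256, 1024 };
    uint32_t cycles = f_cpu / baud;         /* CPU cycles per bit */

    for (uint8_t i = 0; i < 5; i++) {
        uint32_t ticks = cycles / divisors[i];
        if (ticks <= 255) {
            cfg->clock_select = (uint8_t)(i + 1);
            cfg->divisor = divisors[i];
            cfg->ticks = (uint8_t)ticks;
            return 1;
        }
    }
    return 0;
}
```

Choosing the smallest divisor that fits keeps the timer resolution as fine as possible, which limits the rounding error per bit.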

Using the USI

The USI module is optimized for use in either Two Wire I2C or Three Wire SPI mode; it doesn’t have a dedicated mode for UART. However, the USI module is flexible enough that we can use an internal clock to trigger the USI to left shift the data bits into the USI data register.

We can use a pin change interrupt to detect the beginning of the start bit and then configure the USI to sample the value of the input pin in the center of each of the eight bits. We can then read back the eight bit values that the USI shifted into the USI register and reverse them to get the original byte value.
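To see why the received bits end up reversed, here is a minimal host-side model of the shift register, assuming each USI clock shifts the register left by one and places the sampled pin level in bit 0. The function names are invented for this illustration and nothing here touches real hardware registers.

```c
#include <stdint.h>

/* One USI clock: shift left and take the sampled level into bit 0. */
uint8_t usi_shift_in(uint8_t usidr, uint8_t sampled_level)
{
    return (uint8_t)((usidr << 1) | (sampled_level & 1));
}

/* Feed the eight data bits of `sent` in on-the-wire order
   (least significant bit first) and return the register contents. */
uint8_t usi_receive_model(uint8_t sent)
{
    uint8_t usidr = 0;
    for (uint8_t i = 0; i < 8; i++) {
        usidr = usi_shift_in(usidr, (uint8_t)((sent >> i) & 1));
    }
    return usidr;
}
```

The first bit on the wire is pushed furthest to the left, so the least significant bit of the sent byte ends up as the most significant bit of the register.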

Getting set up

ATtiny85 pinout

The first thing we need to do is ensure that the USI data input pin, DI, is enabled for input. This is pin 0 of port B, PB0 in the diagram above.

DDRB &= ~(1 << DDB0);          // Set pin PB0 to input
PORTB |= 1 << PB0;             // Enable internal pull-up on pin PB0

This is equivalent to the following Arduino function call:

pinMode(0, INPUT_PULLUP);

We will also ensure that the USI is disabled at this point.

USICR = 0;                     // Disable USI

Finally we want to enable pin change interrupts on pin PB0 so we can trigger at the beginning of the start bit.

GIMSK |= 1<<PCIE;               // Enable pin change interrupts
PCMSK |= 1<<PCINT0;             // Enable pin change on pin PB0

UART start bit

The pin change interrupt, PCINT0_vect, will fire when any pin on port B with pin change enabled in PCMSK changes from low to high or high to low. We are only interested in a high to low, falling edge, on pin PB0. We will disable pin change on the data input pin while we are reading the byte, so we just need to check whether PB0 is low.

ISR (PCINT0_vect)
{
  uint8_t pinbVal = PINB;
  if (!(pinbVal & 1<<PINB0))   // Trigger if DI is Low
  {
    onSerialPinChange();
  }
}

Move to the middle

To be resilient to any clock drift we will want the USI to take a sample in the middle of each bit. So we need to delay for half a bit width before starting the USI module.

#define HALF_BIT_TICKS      ( FULL_BIT_TICKS / 2 )

We should also account for the number of CPU cycles the interrupt vector and our code introduce between the beginning of the start bit and us starting the USI. I used the Atmel Studio simulator to get numbers for my code.

#define START_DELAY         ( 65 + 42 )
#define TIMER_START_DELAY   ( START_DELAY  / DIVISOR )

First we need to disable pin change interrupts as we don’t want to trigger our start bit interrupt vector while we are reading data bits.

void onSerialPinChange() {
  GIMSK &= ~(1<<PCIE);            // Disable pin change interrupts

We configure Timer/Counter0 to Clear Timer on Compare Match (CTC) mode. We do this by setting the three bits of the Waveform Generation Mode, WGM0, flag to 2 (binary 010). Just for fun, bit 0 and bit 1 are in TCCR0A and bit 2 is in TCCR0B, but as bit 2 is being set to zero you won’t see it explicitly set in the code. The other bits of TCCR0A should be set to zero to indicate normal port operation. The least significant three bits of TCCR0B are used for the Clock Select mode and the rest of the bits (including WGM0 bit 2) should be set to zero.

  TCCR0A = 2<<WGM00;              // CTC mode
  TCCR0B = PRESCALE;              // Set prescaler to cpu clk or clk/8

Then we can reset the prescaler to indicate that we changed its configuration and start Timer/Counter0 at zero.

  GTCCR |= 1 << PSR0;             // Reset prescaler
  TCNT0 = 0;                      // Count up from 0

We store the number of ticks to count in Output Compare Register A, OCR0A. There is no code here to check that HALF_BIT_TICKS is actually greater than TIMER_START_DELAY, so be careful when choosing values for F_CPU, BAUDRATE and TIMER_START_DELAY.

  OCR0A = HALF_BIT_TICKS - TIMER_START_DELAY;

Finally we enable the output compare interrupt.

  TIFR = 1 << OCF0A;              // Clear output compare interrupt flag
  TIMSK |= 1<<OCIE0A;             // Enable output compare interrupt
}
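The range check that the code above leaves out could be done at compile time. The following is a sketch only: it repeats the timing macros from earlier in this post with the example values of 8MHz and 9600 baud so that it stands alone, and the `#error` messages are my own wording.

```c
/* Example values; F_CPU is normally supplied by the build environment. */
#ifndef F_CPU
#define F_CPU               8000000
#endif
#define BAUDRATE            9600

#define CYCLES_PER_BIT      ( F_CPU / BAUDRATE )
#if (CYCLES_PER_BIT > 255)
#define DIVISOR             8
#else
#define DIVISOR             1
#endif
#define FULL_BIT_TICKS      ( CYCLES_PER_BIT / DIVISOR )
#define HALF_BIT_TICKS      ( FULL_BIT_TICKS / 2 )
#define START_DELAY         ( 65 + 42 )
#define TIMER_START_DELAY   ( START_DELAY / DIVISOR )

/* Stop the build if a bit width cannot be timed by the 8 bit timer. */
#if (FULL_BIT_TICKS > 255)
#error "Bit width does not fit in Timer/Counter0, lower F_CPU or raise BAUDRATE"
#endif

/* Stop the build if the start-up code already uses the half bit delay. */
#if (HALF_BIT_TICKS <= TIMER_START_DELAY)
#error "Start delay is longer than half a bit, raise F_CPU or lower BAUDRATE"
#endif
```

With these example values the half bit delay left to load into OCR0A is 39 ticks, so both checks pass.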

Start the USI

When we reach the middle of the start bit, the Timer/Counter0 Compare Match A interrupt vector will fire.

ISR (TIMER0_COMPA_vect) {

First we disable the Compare Match A interrupt.

  TIMSK &= ~(1<<OCIE0A);          // Disable COMPA interrupt

We are going to configure the USI to sample every time Timer/Counter0 has counted up from zero to the value in Output Compare Register A, so we initialize the counter to zero and OCR0A to the full bit width.

  TCNT0 = 0;                      // Count up from 0
  OCR0A = FULL_BIT_TICKS;         // Shift every bit width

We configure the USI using the USI Control Register, USICR.

We want to be notified when a byte has been read, so we set Counter Overflow Interrupt Enable, USIOIE, so that the USI_OVF interrupt will be called when the 8 bits have been shifted into the USI register.

We are not using either of the wire modes directly so we set the Wire Mode, USIWM, to zero indicating the automatic wire mode functionality is disabled.

To indicate we are using Timer/Counter0 Compare Match as the clock source we set Clock Source Select, USICS to 1.

  USICR = 1<<USIOIE | 0<<USIWM0 | 1<<USICS0;

Finally we start the USI using the USI Status Register, USISR.

The bottom four bits hold the USI Counter Value (0 to 15) and we want the overflow interrupt to be called after 8 bits have been read. The USI will still shift data into the USI Data Register on the Timer/Counter0 Compare Match trigger that results in an overflow, so we preset USICNT to 8.

We write 1 to the USI Overflow Interrupt Flag, USIOIF,  to clear the flag and ensure the interrupt doesn’t trigger immediately.

  USISR = 1<<USIOIF | 8;
}
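The choice of 8 as the counter preset follows from the counter being 4 bits wide. A small host-side model, with hypothetical function names, makes the relationship explicit; it assumes the counter increments once per USI clock and overflows when it wraps from 15 to 0.

```c
#include <stdint.h>

/* USI clocks needed for the 4 bit counter to wrap from 15 to 0
   when it starts at `preset` (0 to 15). */
uint8_t usi_clocks_until_overflow(uint8_t preset)
{
    return (uint8_t)(16 - (preset & 0x0F));
}

/* Counter preset that makes the overflow occur after `bits`
   shifts, for bits in the range 1 to 16. */
uint8_t usi_counter_preset(uint8_t bits)
{
    return (uint8_t)((16 - bits) & 0x0F);
}
```

For 8 data bits the preset is 8, which matches the value written to USISR above.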

Reversing a byte

The bits are going to be in the USI Data Register in reverse order, so we are going to need some code to reverse the bits. Needing to reverse a byte is a common problem so people have spent time producing the most efficient code to do this, so rather than write our own, we just borrow a common approach:

uint8_t ReverseByte (uint8_t x) {
    x = ((x >> 1) & 0x55) | ((x << 1) & 0xaa);
    x = ((x >> 2) & 0x33) | ((x << 2) & 0xcc);
    x = ((x >> 4) & 0x0f) | ((x << 4) & 0xf0);
    return x;
}

Reading the result

When the USI has sampled 8 bits the USI counter will overflow and the USI Overflow interrupt, USI_OVF, will be called. The first thing we do is grab the received byte from the USI Data Register, USIDR, and disable the USI.

ISR (USI_OVF_vect) {
  uint8_t usiByte = USIDR;
  USICR  =  0;                    // Disable USI

Then you will need to do something with the result. As we are still in an interrupt, you will probably want to store the result in a volatile variable and then read and act on it in your main loop. Below I just call the serialReceived function with the received data.

  serialReceived(ReverseByte(usiByte));

Now we are done we clear the Pin Change Interrupt Flag, by writing 1 to PCIF in the General Interrupt Flag register, GIFR, so the interrupt doesn’t trigger immediately. And we re-enable pin change interrupts so we are ready to receive the next byte by setting the Pin Change Interrupt Enable bit in the General Interrupt Mask Register, GIMSK.

  GIFR = 1<<PCIF;                 // Clear pin change interrupt flag
  GIMSK |= 1<<PCIE;               // Enable pin change interrupts again
}

We are actually in the middle of the last bit, and if this is zero then a pin change interrupt will be called at the beginning of the stop bit, which is high. However, we only call onSerialPinChange() if the state of the pin is low, so it will be ignored.

A one byte buffer

A really simple implementation of serialReceived() is to store a single byte, which is read by the main loop. If we receive a new byte before the main loop has read the last one then we just overwrite the value. A simple implementation like this might be fine if we were only using the serial input to set the brightness of an LED, say.

volatile bool serialDataReady = false;
volatile uint8_t serialInput;

void serialReceived(uint8_t data)
{
    serialDataReady = true;
    serialInput = data;
}

bool readSerialData(uint8_t* pData)
{
    if (serialDataReady)
    {
        *pData = serialInput;
        serialDataReady = false;
        return true;
    }
    return false;
}

The main loop calls readSerialData() whenever it is ready to receive data. If the return value is true then new data is available in the location passed in the pData argument. For example:

unsigned char serialInput;
if (readSerialData(&serialInput))
{
  ledUpdate(serialInput);
}
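If overwriting unread bytes is not acceptable, one alternative is a small ring buffer behind the same two functions. This is a sketch rather than the code from the example sketch: the buffer size and the `rx` variable names are assumptions, and a byte is dropped when the buffer is full.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUFFER_SIZE 16   /* assumed size, must be a power of two */

static volatile uint8_t rxBuffer[RX_BUFFER_SIZE];
static volatile uint8_t rxHead = 0;   /* next write slot, changed only by the ISR */
static volatile uint8_t rxTail = 0;   /* next read slot, changed only by the main loop */

/* Called from the USI overflow ISR. Drops the byte if the buffer is full. */
void serialReceived(uint8_t data)
{
    uint8_t next = (uint8_t)((rxHead + 1) & (RX_BUFFER_SIZE - 1));
    if (next != rxTail) {
        rxBuffer[rxHead] = data;
        rxHead = next;
    }
}

/* Called from the main loop. Returns true and stores the oldest
   unread byte in *pData, or returns false if nothing is waiting. */
bool readSerialData(uint8_t* pData)
{
    if (rxHead == rxTail) {
        return false;
    }
    *pData = rxBuffer[rxTail];
    rxTail = (uint8_t)((rxTail + 1) & (RX_BUFFER_SIZE - 1));
    return true;
}
```

Because one slot is always left empty to tell a full buffer from an empty one, this holds up to 15 unread bytes. The head and tail indices are single bytes, so each is read and written in one step on an 8 bit AVR.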

Get the example code

I have uploaded an Arduino sketch to GitHub at:

https://github.com/MarkOsborne/becomingmaker/tree/master/USISerial

This example sets the PWM duty cycle of PB4 to the value of the byte read from the UART input on SDA (PB0). Connect your favorite serial device’s Tx pin to SDA and an LED (plus current limiting resistor) to PB4 and you can set the brightness of the LED by sending bytes over serial.

USI Serial Send

If you want to transmit serial data from an ATtiny using the USI then take a look at my blog post USI Serial UART Send on ATtiny.

Sources of inspiration:

Thanks to @Atmel for publishing app note AVR307 and the source code at www.atmel.com/images/AVR307.zip.

Thanks to @technoblogy for the post Simple ATtiny USI UART which inspired me to write my own based on AVR307.

And of course the ATtiny25/45/85 Datasheet is an essential resource.

5 thoughts on “USI Serial UART Receive on ATtiny”

  1. I did not find much information about how to calculate timing using Timer1 on the ATtiny85.
    All tutorials deal with Timer0.
    Timer1 is a particular one and is not compatible with the others.
    It has prescalers from 1/2 to 1/16384 in Synchronous
    Clocking Mode.
    The problem is that with any prescaler I choose, say on the internal 8MHz clock, using the normal formula
    (1/clock)*prescaler = timer tick, I get totally different timing than expected. I load the time into OCR1A and toggle some pins, and the time is much less.
    Before that I cleared
    TCCR1=0 so it does not have another prescaler set.
    Can you give me a hint on what is going on? I did not find any info on how to calculate timing on Timer1. It seems different from the normal Timer0.
    I can manually tweak the value I need by trial and error, but this is not professional.

  2. All the information you need is in the Atmel datasheet available here: http://www.atmel.com/Images/Atmel-2586-AVR-8-bit-Microcontroller-ATtiny25-ATtiny45-ATtiny85_Datasheet.pdf

    Timer1 supports asynchronous mode (using the fast peripheral clock PCK) and synchronous mode (using CPU clock CK). So the first step is to make sure that you are in synchronous mode. This is the default mode, so unless you have changed CKSEL to select ATtiny15 compatibility mode, or enabled PCK with the PCKE in PLLCSR, that shouldn’t be an issue.

    The Clear Timer/Counter on Compare Match, CTC, for Timer1 is based on OCR1C, but there is no interrupt vector for OCR1C, so the approach for using this timer is different than Timer/Counter0. If we want to trigger an interrupt every 100 ticks, then we can use OCR1C to reset the counter every 100 ticks and then use OCR1A to trigger an interrupt on any counter value between 0 and 99.

    Here is some code that toggles PB4 every 100 ticks of Timer/Counter1. This was written in Atmel Studio rather than Arduino so I could verify timing in the simulator. Just copy the Setup section and the ISR for an Arduino sketch.

    #include <avr/io.h>
    #include <avr/interrupt.h>
    
    ISR(TIMER1_COMPA_vect){
     PORTB ^= 1 << PB4;
    }
    int main(void)
    {
     // Setup
     DDRB |= (1 << PB4); // Set PB4 as output
     TCCR1 = 1 << CTC1; // CTC mode, counter resets on OCR1C match
     TCCR1 |= 1; // set prescaler to CLK/1
     GTCCR |= 1 << PSR1; // Reset prescaler
     OCR1A = 99; // trigger on counter value 99
     OCR1C = 99; // reset counter every 100 ticks
     TIFR = 1 << OCF1A; // Clear output compare interrupt flag
     TIMSK |= 1 << OCIE1A; // Enable output compare interrupt
     TCNT1 = 0; // Count up from 0
     sei(); // Enable global interrupts
    
     // Loop
     while (1) {}
    }
    
  3. I used a similar code for compare match but without automatic clearing.
    #include <avr/io.h>
    #include <avr/interrupt.h>

    ISR(TIMER1_COMPA_vect){
    TCNT1=0;
    OCR1A=50;
    PORTB ^= 1 << PB4;
    }
    int main(void)
    {
    // Setup
    DDRB |= (1 << PB4); // Set PB4 as output
    // TCCR1 = 1 << CTC1; // CTC mode OCRAC
    TCCR1 = 0; //reset timer;
    TCCR1 |= 1; // set prescaler to CLK/1
    GTCCR |= 1 << PSR1; // Reset prescaler
    OCR1A = 99; // trigger on counter value 99
    //OCR1C = 99; // reset counter every 100 ticks
    TIFR = 1 << OCF1A; // Clear output compare interrupt flag
    TIMSK |= 1 << OCIE1A; // Enable output compare interrupt
    TCNT1 = 0; // Count up from 0
    sei(); // Enable global interrupts

    // Loop
    while (1) {}
    }
    and it was not working with the calculated timing.
    In your example I have to set both OCR1A and OCR1C. I will test that. I need to change the OCR1A value in the interrupt for different timings. Using both adds more code in the interrupt, which is not very good.

    1. If you want accurate timing, say for sampling a serial UART input, then you are much better off using the hardware to reset the counter as there will be no lag. If you reset the counter inside your interrupt service routine then you are going to have to account for the drift introduced by running the code. The ISR preamble takes about 42 CPU cycles to execute; there is overhead to execute the interrupt, disable global interrupts and save registers. So in your example the counter reset is approximately 43 CPU cycles after the compare match. After the initial two toggles, the pin is therefore being toggled every 93 (50 + 43) CPU clock cycles. Code lag on executing a timed action can be fine, e.g. toggling the pin 45 CPU cycles after a compare match that triggers an interrupt every 100 cycles would still give you a pin that toggled every 100 cycles. However, introducing code lag into the reset of your counter changes the actual timing.

  4. Thanks Mark,
    42 cycles at 8Mhz is 5.25 us.It seems a lot.But Like you said is a timed action the initial lag is not noticed because the difference is the same time 43+100=43+100 ..diff=100 is the same.
    But I think I’ll follow your advice and change it .So I have to double the OCR1A with OCR1C,
    I will use OCR1C=OCR1A.

    Thanks again.
