HD44780 library for STM32

How to use an HD44780-based display effectively with an STM32 MCU?

1602 LCD displays are very common and inexpensive. They are based on a controller compatible with the venerable HD44780, which is very simple to use.
How can an HD44780-based display be used effectively with an STM32 MCU, such as the one on a Blue Pill or a Black Pill board?

A display based on an HD44780 has some advantages:

  • Very cheap
  • Very easy to use

It also has some disadvantages:

  • Uses many pins: 7 to 11
  • It is slow

How to minimize these disadvantages using the special features of STM32 MCUs?


The instructions executed by the controller have a variable duration, and the controller must not be addressed while one is running. Many implementations account for this by adding a fixed delay after each command.
However, the controller exposes a BUSY flag on pin D7, which can be read to keep the waiting time to a minimum. Since the STM32 makes it easy to switch a pin between input and output mode, we will read the BUSY flag of the controller after each command to minimize the time between two commands.

Direct access to the GPIO registers is simple on the STM32, so the library accesses the MCU registers directly. This avoids heavy and slow libraries such as the HAL.

The pins used by the display

First the number of pins

The controller allows the use of 6, 7 or 11 pins. Using 11 pins (with 8-bit data bus) is more efficient than using a 4-bit data bus. However, the gain is minimal compared to the processing time of each command by the controller: approximately 37 µs. In addition the number of pins available on the MCU is limited. So we will use 7 pins:

  • RS : Data / Command
  • RW : Read / Write, needed to read the BUSY flag
  • E : Strobe
  • D4-D7 : The data bus

Which pins of the STM32 to use?

For the control signals (RS, RW, E) there is no problem: they are driven independently and can therefore be assigned to any GPIO of the MCU.

For the data bus, it can be tempting to use 4 contiguous pins on the same GPIO port. But this is very restrictive:
1) All the pins of the MCU have several functions, and it is rare to have 4 contiguous free pins.
2) You must be able to set the data on these pins atomically.

This last point matters for multitasking applications and interrupt routines: a read / modify / write sequence on a GPIO port can be interrupted between the read and the write. This is why the GPIO ports have the BSRR and BRR registers, which set a pin to 1 or to 0 atomically.

Since each pin is set to 0 or 1 independently of the others through the BSRR and BRR registers, there is no longer any constraint on how the data bits are arranged on the GPIOs.

All pins on the display can therefore be connected to any unused pins on the MCU.
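
To make the atomicity argument concrete, here is a small host-side model in plain C (no hardware needed). The function names and the simulated odr variable are illustrative assumptions and are not part of the library. The rule modelled by applyBsrr () is the one documented for the STM32F1: the lower 16 bits of a BSRR write set pins, the upper 16 bits reset pins, and a set request wins over a reset request.

```c
#include <stdint.h>

// Host-side model of a 16-bit GPIO output register (illustrative only).
// applyBsrr () mimics one write to BSRR: bits 0..15 set pins,
// bits 16..31 reset pins, and a set request wins over a reset request.
static uint32_t applyBsrr (uint32_t odr, uint32_t word)
{
	uint32_t set   = word & 0xFFFFu ;
	uint32_t reset = (word >> 16) & 0xFFFFu ;

	return ((odr & ~reset) | set) & 0xFFFFu ;
}

// Read / modify / write interrupted between the read and the write:
// the task wants to set pin 5, an interrupt sets pin 3 in between.
static uint32_t lostUpdateDemo (void)
{
	uint32_t odr = 0u ;
	uint32_t snapshot = odr ;			// Task reads the register

	odr |= (1u << 3) ;					// Interrupt sets pin 3
	odr = snapshot | (1u << 5) ;		// Task writes back: pin 3 is lost
	return odr ;
}

// Same scenario with BSRR-style writes: each write touches only its own pin.
static uint32_t bsrrDemo (void)
{
	uint32_t odr = 0u ;

	odr = applyBsrr (odr, 1u << 3) ;	// Interrupt sets pin 3
	odr = applyBsrr (odr, 1u << 5) ;	// Task sets pin 5
	return odr ;
}
```

In the first scenario the update made by the interrupt disappears, in the second both pins end up set, whatever the order of the two writes.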

HD44780 LCD wiring

How to do this?

You must declare the pins used, for example:

// Define pins, modify these definitions according to your wiring

#define	RS_PORT		GPIOA
#define	RS_PIN		10

#define	RW_PORT		GPIOC
#define	RW_PIN		4

#define	E_PORT		GPIOB
#define	E_PIN		5

#define	D4_PORT		GPIOB
#define	D4_PIN		13

#define	D5_PORT		GPIOB
#define	D5_PIN		15

#define	D6_PORT		GPIOB
#define	D6_PIN		14

#define	D7_PORT		GPIOB
#define	D7_PIN		10

Next, declare the macros that drive each pin to 0 or 1.
Macros are used to generate inline code: the operations are too simple to justify inline functions. These declarations are almost independent of the STM32 model used.

The exception is the STM32F4, where the declarations are a little more complex because that family has no BRR register. The real story: the set half and the reset half behave like two 16-bit registers, but stm32f4xx.h defines them as a single 32-bit BSRR register, so a pin is cleared by writing to the upper 16 bits of BSRR. Sigh…
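
As a sketch of what the F4 variant could look like, the clear macro below writes the reset bit in the upper half of BSRR. FakeGpio and fakePortA are host-side stand-ins invented here so that the snippet compiles without the CMSIS headers; on a real target RS_PORT would be GPIOA from stm32f4xx.h. This is an assumption about the shape of the macros, not a copy of the library.

```c
#include <stdint.h>

// Host-side stand-in for a GPIO port (assumption: only BSRR is modelled)
typedef struct
{
	volatile uint32_t BSRR ;
} FakeGpio ;

static FakeGpio fakePortA ;

#define	RS_PORT		(&fakePortA)	// On target: GPIOA
#define	RS_PIN		10

// Bits 0..15 of BSRR set a pin, bits 16..31 reset it
#define	rsSet()		RS_PORT->BSRR = (1u << RS_PIN)
#define	rsClear()	RS_PORT->BSRR = (1u << (RS_PIN + 16))
```

The same form also works on the F1, whose BSRR has the same layout, so a single set of macros could serve both families.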

#define	rsSet()		RS_PORT->BSRR = (1u << RS_PIN)
#define	rsClear()	RS_PORT->BRR  = (1u << RS_PIN)

#define	rwSet()		RW_PORT->BSRR = (1u << RW_PIN)
#define	rwClear()	RW_PORT->BRR  = (1u << RW_PIN)

#define	eSet()		E_PORT->BSRR  = (1u << E_PIN)
#define	eClear()	E_PORT->BRR   = (1u << E_PIN)

#define	d4Set()		D4_PORT->BSRR = (1u << D4_PIN)
#define	d4Clear()	D4_PORT->BRR  = (1u << D4_PIN)

#define	d5Set()		D5_PORT->BSRR = (1u << D5_PIN)
#define	d5Clear()	D5_PORT->BRR  = (1u << D5_PIN)

#define	d6Set()		D6_PORT->BSRR = (1u << D6_PIN)
#define	d6Clear()	D6_PORT->BRR  = (1u << D6_PIN)

#define	d7Set()		D7_PORT->BSRR = (1u << D7_PIN)
#define	d7Clear()	D7_PORT->BRR  = (1u << D7_PIN)

With these definitions it is easy to manage the pin states. For example, sending a command to the controller requires generating a pulse on pin E:

// Generate E pulse

static void ePulse (void)
{
	bspDelayUs (1) ;		// For data setup
	eSet () ;
	bspDelayUs (1) ;
	eClear () ;
}

// Send command to LCD

static	void hdCmd (uint8_t cc)
{
	setDataPort (cc >> 4) ;		// Send upper nibble
	rsClear () ;				// Set LCD to instruction mode
	ePulse () ;					// E strobe

	setDataPort (cc) ;			// Send lower nibble
	ePulse () ;					// E strobe

	hdWaitForBusy () ;
}

The setDataPort () function, which writes a nibble to the data port D7-D4, is hardly more complicated. It is not that slow: around 1 µs on an STM32F103 at 72 MHz.

// Set the low nibble of the data to the data output port

static void setDataPort (uint32_t data)
{
	if ((data & 0x08u) == 0u)
			d7Clear () ;
	else	d7Set () ;

	if ((data & 0x04u) == 0u)
			d6Clear () ;
	else	d6Set () ;

	if ((data & 0x02u) == 0u)
			d5Clear () ;
	else	d5Set () ;

	if ((data & 0x01u) == 0u)
			d4Clear () ;
	else	d4Set () ;
}
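
A possible variation, which is not what the library does: when the four data pins happen to be on the same port, as in the example wiring (all on GPIOB), the whole nibble can be written with a single BSRR access, because one write can set some pins and reset others even if they are not contiguous. The helper names pinWord and nibbleToBsrr are hypothetical; the pin numbers are those of the example wiring.

```c
#include <stdint.h>

#define	D4_PIN		13
#define	D5_PIN		15
#define	D6_PIN		14
#define	D7_PIN		10

// BSRR bit for one pin: the set bit (0..15) if 'on' is non zero,
// otherwise the matching reset bit in the upper half (16..31)
static uint32_t pinWord (uint32_t pin, uint32_t on)
{
	return (on != 0u) ? (1u << pin) : (1u << (pin + 16u)) ;
}

// Build the BSRR word that puts the low nibble of 'data' on D7..D4
static uint32_t nibbleToBsrr (uint32_t data)
{
	return	pinWord (D7_PIN, data & 0x08u) |
			pinWord (D6_PIN, data & 0x04u) |
			pinWord (D5_PIN, data & 0x02u) |
			pinWord (D4_PIN, data & 0x01u) ;
}
```

On the target the write would then be something like D7_PORT->BSRR = nibbleToBsrr (data). The price is that the four data pins must stay on one port, which gives back part of the wiring freedom described earlier.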

The last complex function is the initialization of the GPIO pins and of the LCD:

// Initialize the LCD device

void hdInit (void)
{
	// Initialize GPIO ports
	hdGpioInit (RS_PORT, RS_PIN) ;
	hdGpioInit (RW_PORT, RW_PIN) ;
	hdGpioInit (E_PORT,  E_PIN) ;
	hdGpioInit (D4_PORT, D4_PIN) ;
	hdGpioInit (D5_PORT, D5_PIN) ;
	hdGpioInit (D6_PORT, D6_PIN) ;
	hdGpioInit (D7_PORT, D7_PIN) ;

	// Reset HD44780
	rsClear () ;
	rwClear () ;
	bspDelayUs (40000) ;	// Min 40ms

	setDataPort (0x03u) ;
	ePulse () ;
	bspDelayUs (6000) ;		// Min 5ms

	setDataPort (0x03u) ;
	ePulse () ;
	bspDelayUs (200) ;		// Min 100 us

	setDataPort (0x03u) ;
	ePulse () ;
	bspDelayUs (200) ;

	setDataPort (0x02u) ;	// Enable 4-Bit Mode
	ePulse () ;
	hdWaitForBusy () ;

	hdCmd (0x28) ;		// Set Interface Length: 4 bits, 2 lines, 5x8 font
	hdCmd (0x08) ;		// Display off
	hdCmd (0x01) ;		// Clear display
	hdCtrl = 0x0Cu ;	// Display on, cursor off, no blink
	hdCmd (hdCtrl) ;
	hdCmd (0x06) ;		// Entry mode: Move cursor after each data
}

The functions shown so far rely on a few functions that are specific to the STM32 model used. In particular, you must be able to configure D7 both as an output and as an input, in order to read the BUSY flag of the HD44780. For an STM32F103 this gives:

// Must be adapted to your MCU

// For STM32F1xx MCU

#define		MODE_IN		0x04u
#define		MODE_OUT	0x02u
#define		MODE_MASK	0x0Fu

// Configure GPIO pins to output

static void hdGpioInit (GPIO_TypeDef * pGpio, uint32_t pin)
{
	// Enable port clock (without pointer arithmetic...)
	// On the F1 the GPIO port clocks are enabled in APB2ENR
	RCC->APB2ENR |= 1u << ((((uint32_t) pGpio - (uint32_t) GPIOA) / ((uint32_t) GPIOB - (uint32_t) GPIOA)) + RCC_APB2ENR_IOPAEN_Pos) ;

	// Initialize GPIO as output
	if (pin < 8)
	{
		pGpio->CRL = (pGpio->CRL & ~(MODE_MASK << (pin * 4))) | (MODE_OUT << (pin * 4)) ;
	}
	else
	{
		pGpio->CRH = (pGpio->CRH & ~(MODE_MASK << ((pin - 8) * 4))) | (MODE_OUT << ((pin - 8) * 4)) ;
	}
}

// To switch D7 to input, allows to check the HD44780 busy flag.

__ALWAYS_STATIC_INLINE	void	d7IsInput (void)
{
	#if (D7_PIN < 8)
		D7_PORT->CRL = (D7_PORT->CRL & ~(MODE_MASK << (D7_PIN * 4))) | (MODE_IN << (D7_PIN * 4)) ;
	#else
		D7_PORT->CRH = (D7_PORT->CRH & ~(MODE_MASK << ((D7_PIN - 8) * 4))) | (MODE_IN << ((D7_PIN - 8) * 4)) ;
	#endif
}

// To switch D7 to output

__ALWAYS_STATIC_INLINE	void	d7IsOutput (void)
{
	#if (D7_PIN < 8)
		D7_PORT->CRL = (D7_PORT->CRL & ~(MODE_MASK << (D7_PIN * 4))) | (MODE_OUT << (D7_PIN * 4)) ;
	#else
		D7_PORT->CRH = (D7_PORT->CRH & ~(MODE_MASK << ((D7_PIN - 8) * 4))) | (MODE_OUT << ((D7_PIN - 8) * 4)) ;
	#endif
}
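
The body of hdWaitForBusy () is not shown in this article. The following is only a sketch of how it could be written in 4-bit mode, under several assumptions: FakeGpio, fakePort, d7SetMode, eStrobeReadD7 and BUSY_MAX_POLLS are invented here so that the code compiles and runs on a PC, all the pins are placed on one fake port, and the function returns a status where the library version appears to return nothing. As in the text above, only D7 is switched to input. Since the controller also drives D4 to D6 during a read, releasing those pins as well may be advisable depending on your wiring.

```c
#include <stdint.h>
#include <stdbool.h>

// Host-side stand-in for the STM32F1 GPIO registers (assumption: on the
// target, use GPIO_TypeDef and the pin definitions from the article)
typedef struct
{
	volatile uint32_t CRL, CRH, IDR, ODR, BSRR, BRR ;
} FakeGpio ;

static FakeGpio fakePort ;

#define	RS_PORT		(&fakePort)
#define	RS_PIN		0
#define	RW_PORT		(&fakePort)
#define	RW_PIN		4
#define	E_PORT		(&fakePort)
#define	E_PIN		5
#define	D7_PORT		(&fakePort)
#define	D7_PIN		10

#define	rsClear()	RS_PORT->BRR  = (1u << RS_PIN)
#define	rwSet()		RW_PORT->BSRR = (1u << RW_PIN)
#define	rwClear()	RW_PORT->BRR  = (1u << RW_PIN)
#define	eSet()		E_PORT->BSRR  = (1u << E_PIN)
#define	eClear()	E_PORT->BRR   = (1u << E_PIN)

#define	MODE_IN		0x04u
#define	MODE_OUT	0x02u
#define	MODE_MASK	0x0Fu
#define	BUSY_MAX_POLLS	1000u	// Hypothetical limit: a missing display cannot hang the MCU

static void bspDelayUs (uint32_t us) { (void) us ; }	// Host stub, no real delay

// Set the mode of D7 (a pin in the range 8..15 is assumed here, hence CRH)
static void d7SetMode (uint32_t mode)
{
	D7_PORT->CRH = (D7_PORT->CRH & ~(MODE_MASK << ((D7_PIN - 8) * 4))) | (mode << ((D7_PIN - 8) * 4)) ;
}

// Generate one E strobe and return the level of D7 sampled while E is high
static bool eStrobeReadD7 (void)
{
	bool d7 ;

	bspDelayUs (1) ;
	eSet () ;
	bspDelayUs (1) ;
	d7 = (D7_PORT->IDR & (1u << D7_PIN)) != 0u ;
	eClear () ;
	return d7 ;
}

// Returns true when the controller is ready, false on timeout
static bool hdWaitForBusy (void)
{
	bool     busy = true ;
	uint32_t polls ;

	d7SetMode (MODE_IN) ;		// Release D7 so the controller can drive it
	rsClear () ;				// Instruction register
	rwSet () ;					// Read mode

	for (polls = 0u ; busy && (polls < BUSY_MAX_POLLS) ; polls++)
	{
		busy = eStrobeReadD7 () ;	// High nibble: D7 carries BUSY
		(void) eStrobeReadD7 () ;	// Low nibble: ignored
	}

	rwClear () ;				// Back to write mode
	d7SetMode (MODE_OUT) ;		// The MCU drives D7 again
	return ! busy ;
}
```

RW is brought back to 0 before D7 becomes an output again, so that the MCU and the controller never drive D7 at the same time.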

The GPIO initialization is a bit more complicated on an F4, which uses 5 registers instead of 1.
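
For comparison, here is a hedged sketch of an F4 pin configuration. FakeGpioF4, hdGpioInitF4 and pinSetModeF4 are names invented for this example, the structure only models four of the configuration registers so that the code runs on a PC, and the port clock enable is left out. The field encodings used (2 bits per pin in MODER, OSPEEDR and PUPDR, 1 bit per pin in OTYPER) are those of the STM32F4 reference manual.

```c
#include <stdint.h>

// Host-side stand-in for the STM32F4 GPIO configuration registers
// (assumption: on the target, use GPIO_TypeDef from stm32f4xx.h)
typedef struct
{
	volatile uint32_t MODER, OTYPER, OSPEEDR, PUPDR ;
} FakeGpioF4 ;

#define	F4_MODE_IN		0x0u	// MODER field: input
#define	F4_MODE_OUT		0x1u	// MODER field: general purpose output

// Configure a pin as push-pull output, low speed, no pull resistor
static void hdGpioInitF4 (FakeGpioF4 * pGpio, uint32_t pin)
{
	pGpio->MODER    = (pGpio->MODER & ~(3u << (pin * 2))) | (F4_MODE_OUT << (pin * 2)) ;
	pGpio->OTYPER  &= ~(1u << pin) ;			// 0: push-pull
	pGpio->OSPEEDR &= ~(3u << (pin * 2)) ;		// 00: low speed
	pGpio->PUPDR   &= ~(3u << (pin * 2)) ;		// 00: no pull-up, no pull-down
}

// Switching the direction afterwards only touches MODER
static void pinSetModeF4 (FakeGpioF4 * pGpio, uint32_t pin, uint32_t mode)
{
	pGpio->MODER = (pGpio->MODER & ~(3u << (pin * 2))) | (mode << (pin * 2)) ;
}
```

Once a pin is initialized, the F4 equivalent of d7IsInput () and d7IsOutput () reduces to a single MODER update.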

Once all of this is defined it is very easy to implement the user API, for example:

// LCD clear

void    hdClear (void)
{
	hdCmd (1) ;
}
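
Another typical API function positions the cursor. The sketch below only builds the command byte, so it can be checked on a PC. hdGotoCmd and the two constants are hypothetical names and may not match the library; the encoding itself comes from the HD44780 datasheet: the "Set DDRAM address" instruction is 0x80 plus the address, and on a 2-line display the second line starts at address 0x40.

```c
#include <stdint.h>

#define	HD_CMD_SET_DDRAM	0x80u	// "Set DDRAM address" instruction
#define	HD_LINE2_OFFSET		0x40u	// DDRAM address of the second line (2-line display)

// Build the command byte that moves the cursor to (line, column), both 0-based
static uint8_t hdGotoCmd (uint32_t line, uint32_t column)
{
	uint32_t addr = column + ((line == 0u) ? 0u : HD_LINE2_OFFSET) ;

	return (uint8_t) (HD_CMD_SET_DDRAM | (addr & 0x7Fu)) ;
}
```

A cursor move to the start of the second line would then be hdCmd (hdGotoCmd (1, 0)).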

The delays

Some of these functions use delays of a few microseconds. The AdAstra-RTK BSP function that generates such delays is:

//	Lets busy wait for delays of tens of microseconds
//	Force optimization level to be independent of the general setting
//	BEWARE: At low core frequency the 1us delay may be impossible

#if defined(__CC_ARM)
	#pragma push
	#pragma Otime
#else
	#pragma GCC push_options
	#pragma GCC optimize ("O3")
#endif

void bspDelayUs (uint32_t us)
{
	volatile uint32_t start = DWT->CYCCNT ;
	volatile uint32_t cycles = (bspTsMhzDiv * us) - 42u ;	// Remove some cycles for instructions

	while ((DWT->CYCCNT - start) < cycles) ;
}

#if defined(__CC_ARM)
	#pragma pop
#else
	#pragma GCC pop_options
#endif
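
The BEWARE comment deserves a closer look. Assuming that bspTsMhzDiv holds the core frequency in MHz (the article does not define it), the expression (bspTsMhzDiv * us) - 42u wraps around to a huge value as soon as the product is below 42, for example for 1 µs at 8 MHz. The sketch below shows one way to guard against this, and why the unsigned subtraction in the wait loop survives a wrap of the cycle counter. delayCycles, elapsedCycles and DELAY_OVERHEAD_CYCLES are names invented for this example.

```c
#include <stdint.h>

#define	DELAY_OVERHEAD_CYCLES	42u		// Same correction as in bspDelayUs ()

// Cycles to wait for a delay of 'us' microseconds at 'mhz' core MHz.
// Clamps to 0 instead of wrapping when the overhead exceeds the delay.
static uint32_t delayCycles (uint32_t mhz, uint32_t us)
{
	uint32_t total = mhz * us ;

	return (total > DELAY_OVERHEAD_CYCLES) ? (total - DELAY_OVERHEAD_CYCLES) : 0u ;
}

// Elapsed cycles between two readings of a free-running 32-bit counter.
// Unsigned subtraction stays correct across one wrap of the counter.
static uint32_t elapsedCycles (uint32_t start, uint32_t now)
{
	return now - start ;
}
```

At 72 MHz a 1 µs request gives 30 cycles, while at 8 MHz the clamped version returns 0 where the original expression would ask for several billion cycles.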


Here is an efficient implementation of a 1602 library for STM32 MCUs and boards such as the Blue Pill. These LCD displays are based on an HD44780 controller, and the library offers:

  • Few constraints on which MCU pins to use
  • Timing optimized by direct access to the registers and by the use of the BUSY flag of the controller
  • Easy porting between STM32 families

On an STM32F103 at 72 MHz, approximately 40 µs per character was measured when displaying a string.

The hd44780 library for use with AdAstra-RTK is available here. It supports F1xx/L1xx and other families (F4, F7, H7…)

The library is also available in the distribution packages of the “Download” page.

Some links

The HD44780 datasheet

Beware: The displays are available in 3.3V and 5V.
With STM32 it is easier to use 3.3V displays: See here how to hack your 1602 display.

Strangely on my 1602 in 3.3V the J3 strap is open …
