Union in C Programming

Introduction



When working with user-defined data types in C, developers encounter a fundamental design choice between structures and unions. While both allow grouping multiple variables of different types, their memory allocation strategies differ significantly. A union allocates memory only for its largest member, with all members sharing the same memory location. This approach presents both opportunities and constraints for system-level programming.


Understanding unions requires examining how memory is organized, what happens when multiple members are accessed, and why this data type remains relevant despite modern memory abundance. This article covers union syntax, memory allocation behavior, practical applications, and scenarios where unions offer advantages over structures.



What Is a Union in C?


A union is a user-defined data type in C that allows different data types to occupy the same memory location. Syntactically, unions resemble structures—they contain members of varying types and support similar access patterns. The critical distinction lies in memory allocation: structures allocate separate storage for each member, while unions allocate a single shared memory block sized to accommodate the largest member.


Syntax of Union Declaration


c

union union_name {
    data_type member1;
    data_type member2;
    // more members
};


For example, a union containing an integer, character, and floating-point value:


c

union Data {
    int integerValue;
    char characterValue;
    float floatValue;
};


A variable declaration follows the same pattern as structures:


c

union Data sample;  // variable declaration


Memory Allocation Model


The fundamental difference between structures and unions appears in memory layout. Consider a structure with three members:


c

struct Record {
    int id;      // 4 bytes
    char grade;  // 1 byte
    float score; // 4 bytes
};


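A variable of type struct Record typically consumes 12 bytes of memory (assuming 4-byte integers and floats): the members themselves total 9 bytes, and the compiler usually inserts 3 bytes of padding after grade so that score starts on a 4-byte boundary. Each member has its own distinct memory address.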


By contrast, a union with identical members:


c

union CompactRecord {
    int id;      // 4 bytes
    char grade;  // 1 byte
    float score; // 4 bytes
};


A variable of type union CompactRecord consumes only 4 bytes—the size of the largest member. All three members share this same memory location.




To verify memory sizes programmatically:


c

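#include <stdio.h>

/* struct Record and union CompactRecord are defined as shown above. */

int main(void) {
    /* sizeof yields a size_t, which is printed with %zu */
    printf("Structure size: %zu bytes\n", sizeof(struct Record));
    printf("Union size: %zu bytes\n", sizeof(union CompactRecord));
    return 0;
}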


Accessing Union Members


Union members are accessed using the dot operator (.) for variables and the arrow operator (->) for pointers:


c

union CompactRecord data;
data.id = 42;
printf("ID: %d\n", data.id);

union CompactRecord *ptr = &data;
ptr->grade = 'A';
printf("Grade: %c\n", ptr->grade);


The Overwrite Problem


Because all members share the same memory, writing to one member overwrites previously stored data in other members. The union retains only the most recently assigned value.


Consider this example:


c

union CompactRecord item;
item.id = 100;        // Memory holds the int 100
item.grade = 'B';     // First byte now holds 'B' (ASCII 66); id is no longer valid
item.score = 85.5f;   // Memory now holds the float 85.5; grade is no longer valid

printf("%d\n", item.id);      // Bytes of 85.5f reinterpreted as an int, not 100
printf("%c\n", item.grade);   // One byte of 85.5f's representation, not 'B'
printf("%.1f\n", item.score); // 85.5 (last value stored)


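The first two printf statements do not print 100 and 'B'. The memory now holds the floating-point representation of 85.5, and reading it through id or grade reinterprets those bytes as an integer or a character. C permits this reinterpretation, but the result depends on the platform's representation of float and int, and it has no relationship to the values stored earlier. Only the most recent assignment (score) yields the value that was written.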


When to Use Unions


Although modern systems provide abundant memory, unions remain valuable in specific scenarios:


Embedded Systems and Resource-Constrained Environments


Microcontrollers with limited RAM (e.g., 2 KB or less) benefit from the memory efficiency of unions. When processing sensor data where only one measurement type is needed at a time, a union needs storage for the largest measurement only, rather than for every measurement type at once.


Protocol Parsing and Data Serialization


Network protocols often define packet formats where fields change meaning based on a type discriminator. Unions enable efficient representation of variant data:


c

#include <stdint.h>  // uint8_t, uint32_t

struct Packet {
    uint8_t type;  // discriminator: selects which payload member is valid
    union {
        struct { int x; int y; } coordinates;
        struct { char buffer[64]; } text;
        struct { uint32_t status; } response;
    } payload;
};


Type Punning


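Unions provide a method for examining the binary representation of one data type as another: a value is written through one member and the same bytes are read back through a member of a different type. C permits this, whereas C++ does not, and the result depends on the sizes of the two types, on endianness, and on alignment, so this usage requires careful attention to the target platform.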


Union vs Structure Comparison


| Feature | Structure | Union |
|---|---|---|
| Memory allocation | Sum of all members, plus any padding | Size of largest member, plus any padding |
| Member storage | Independent, simultaneous | Shared, one member at a time |
| Access behavior | All members retain values | Only last assigned member valid |
| Memory efficiency | Lower (more bytes) | Higher (fewer bytes) |
| Use case | When all fields needed together | When one field needed at a time |
| Practical usage today | Very common | Specialized situations |

Practical Example: Temperature Data Logger


A temperature monitoring system might record readings from three sensor types, but only one sensor is active at any moment:


c

union SensorReading {
    int thermocouple;    // raw ADC value, range 0-4095
    float thermistor;    // resistance converted to Celsius
    unsigned char ds18b20[8]; // raw 1-Wire data
};

struct LogEntry {
    unsigned long timestamp;
    unsigned char sensorType;  // 0=thermocouple, 1=thermistor, 2=DS18B20
    union SensorReading value;
};


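The union occupies 8 bytes, the size of its largest member (the DS18B20 data array), whereas a structure holding the three readings as separate fields would need 16 bytes on a typical platform with 4-byte int and float. Each log entry adds the timestamp and sensorType fields on top of that, and their combined size, including padding, depends on how wide unsigned long is on the target.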


Limitations of Unions


The primary constraint of unions is the inability to store multiple member values simultaneously. A program cannot, for example, track both an integer identifier and a floating-point measurement in the same union variable without losing one value. This limitation makes unions unsuitable for records requiring multiple active fields.


Additionally, unions provide no built-in mechanism to track which member was most recently assigned. Application code must maintain separate state variables to know which interpretation of the union memory is currently valid.


Outlook


Modern C development continues to use unions in systems programming, embedded firmware, and network stack implementations. While general application programming rarely requires unions due to abundant memory and higher-level abstractions, the type remains essential for developers working close to hardware or implementing space-efficient data structures. Memory optimization techniques, including unions, persist as valuable tools for performance-critical systems where every byte counts.


FAQs


Can a union contain another union or structure?

Yes, unions can nest other unions or structures as members, enabling complex hierarchical data representations.



What happens when you read a union member different from the one you last wrote?

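In C, reading a different member reinterprets the stored bytes as the type of the member being read. The result depends on the platform's data representations and is usually meaningless unless the reinterpretation is intentional, as in type punning. In C++, the same access is undefined behavior.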



Does the order of member declaration in a union matter?

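Not for memory layout: all members share the same starting address, so declaration order does not affect layout or access behavior. It does matter for initialization, because a brace initializer without a member name initializes the first declared member.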



Can I initialize a union variable when declaring it?

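Yes, you can initialize a union with a value for its first member, such as `union Data d = {42};`, which initializes the integer member. Since C99, a designated initializer such as `union Data d = {.floatValue = 1.5f};` can initialize any member.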


