Attributes - Learning Module


Simple Attributes

The Building Blocks of Entity Description

Every entity in a database must be described. A Customer isn't just an abstract concept—it has a name, an email address, a date of birth, a credit limit. A Product has a price, a weight, a SKU number. These descriptive properties are what we call attributes, and they form the vocabulary through which we capture real-world information in our data models.

But not all attributes are created equal. Some are simple and atomic; others are complex and composite. Some can hold multiple values; others are derived from calculations. Understanding these distinctions isn't merely academic—it directly affects how your database schema is designed, how queries perform, and how data integrity is maintained.

We begin with the most fundamental type: simple attributes.

What You Will Learn

By the end of this page, you will understand what makes an attribute 'simple,' why atomicity is the defining characteristic, how to identify simple attributes in real-world scenarios, and how they map to implementation in relational databases. You'll develop the discernment to distinguish simple attributes from their more complex counterparts.

What is a Simple Attribute?

A simple attribute (also called an atomic attribute) is an attribute that cannot be meaningfully subdivided into smaller components. It represents a single, indivisible unit of information about an entity.

The Atomicity Principle:

The word 'atomic' comes from the Greek atomos meaning 'uncuttable.' In database modeling, an atomic attribute is one where breaking it into smaller parts would lose meaning or create unnecessary complexity for the domain being modeled.

Consider a person's age. The value '35' is atomic—there's no meaningful way to subdivide it. You can't split '35' into '3' and '5' and derive any useful information. The value stands alone, complete and indivisible.

Contrast this with a full name like 'John Michael Smith.' This could be subdivided into first name, middle name, and last name—each component carrying independent meaning. Whether you treat this as simple or composite depends on your requirements, but the possibility of meaningful subdivision is the key distinction.

The Context Dependency of Simplicity

Whether an attribute is 'simple' often depends on the application domain. A phone number might be simple in one system (just a string to store) but composite in a telecommunications system (country code, area code, subscriber number, extension). Always model attributes based on how they'll actually be used, not on theoretical divisibility.
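To make the context dependency concrete, here is a minimal sketch of the two modeling choices as PostgreSQL tables. The table names, column names, and column widths (crm_contact, telecom_subscriber, and so on) are hypothetical and chosen only for illustration.

```sql
-- Hypothetical CRM: the phone number is only stored and displayed,
-- so it is modeled as one simple attribute (one column).
CREATE TABLE crm_contact (
    contact_id  INTEGER PRIMARY KEY,
    full_name   VARCHAR(100) NOT NULL,
    phone       VARCHAR(20)
);

-- Hypothetical telecom system: the parts are queried on their own,
-- so the same real-world value is modeled as a composite made of
-- four simple attributes.
CREATE TABLE telecom_subscriber (
    subscriber_id      INTEGER PRIMARY KEY,
    country_code       VARCHAR(3)  NOT NULL,
    area_code          VARCHAR(5)  NOT NULL,
    subscriber_number  VARCHAR(12) NOT NULL,
    extension          VARCHAR(6)
);

-- Analysis by area code is then a plain equality filter.
SELECT COUNT(*) FROM telecom_subscriber WHERE area_code = '212';
```

Neither table is more correct than the other; each fits the way its system uses the value.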

Formal Definition:

In Entity-Relationship modeling, a simple attribute is defined as:

An attribute whose values are atomic and cannot be further decomposed into meaningful sub-components within the context of the data model.

Key characteristics of simple attributes:

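Typically single-valued — In the examples on this page, each instance holds exactly one value (though it may be NULL). Strictly speaking, single-valued versus multivalued is a separate classification from simple versus composite.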
Not divisible — No meaningful substructure exists within the value
Directly storable — Maps cleanly to a single column in a relational database
Self-contained — Doesn't require parsing or extraction to use
Domain-constrained — Values come from a defined domain (numbers, strings, dates, etc.)

Identifying Simple Attributes in Practice

Recognizing simple attributes requires asking the right questions about your data. Here's a systematic approach to identification:

The Atomicity Test

•Can this value be meaningfully subdivided? If breaking the value into parts yields components that have independent meaning and utility in your system, it's not simple.
•Will I ever need to query or sort by a sub-part? If you need to search by just the area code of a phone number, treating the whole number as atomic won't serve you well.
•Are the 'parts' ever manipulated independently? If first name and last name are displayed in different places or sorted independently, they shouldn't be a single attribute.
•Does the value have internal structure that matters? An IPv4 address has four octets; if network operations need those octets separately, treat it as composite.
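The second and fourth questions can be illustrated with a short PostgreSQL sketch. The device table and its rows are hypothetical; split_part is a built-in PostgreSQL string function that returns the n-th field of a delimited string.

```sql
-- Hypothetical table: the address is stored as one atomic string.
CREATE TABLE device (
    device_id   INTEGER PRIMARY KEY,
    ip_address  VARCHAR(15) NOT NULL
);

INSERT INTO device (device_id, ip_address) VALUES
    (1, '10.0.0.7'),
    (2, '10.0.1.9'),
    (3, '192.168.1.20');

-- Asking about a sub-part forces the value to be parsed on every row.
SELECT device_id
FROM device
WHERE split_part(ip_address, '.', 1) = '10';
-- matches devices 1 and 2
```

If queries like this are rare, the single column is fine. If they are routine, the parsing step is a sign that the attribute is not simple for this system.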

Common Examples of Simple Attributes:

Let's examine attributes that are almost universally treated as simple:

Simple Attributes Across Domains
Entity	Attribute	Domain	Why It's Simple
Employee	Salary	Decimal/Currency	A single monetary value; no meaningful internal subdivision
Product	Weight	Decimal (fixed unit)	A single measurement; the unit is standardized across the system rather than stored in the value
Order	OrderDate	Date	A single point in time; date components (year, month, day) are rarely queried independently in an order-processing context
Student	GPA	Decimal	A calculated but atomic result; you don't query by the 'first digit' of a GPA
Book	ISBN	String (formatted)	A standardized identifier; though structured, it's used as a whole unit
Sensor	Temperature	Decimal	A single reading value at a point in time
Account	Balance	Decimal/Currency	A single monetary value representing account state
Vehicle	Year	Integer	The manufacturing year; atomic in automotive contexts

Beware of False Simplicity

Some attributes appear simple but carry hidden complexity. A 'ZipCode' seems atomic, but in the US, ZIP+4 codes have two components (the five-digit ZIP code and the four-digit add-on) that are sometimes used independently. An 'SSN' (Social Security Number) has three segments (area, group, and serial number); for numbers issued before 2011, the area number reflected where the number was assigned. Model based on actual usage requirements, not surface appearance.

Representing Simple Attributes in ER Diagrams

In Entity-Relationship diagrams, simple attributes are depicted using standard notation that has remained consistent since Peter Chen's original 1976 paper. Understanding this notation is essential for communicating database designs.

Standard (Chen) Notation:

In Chen notation—the original and still widely-taught ER notation—simple attributes are represented as:

Oval (ellipse) containing the attribute name
Single line connecting the oval to its entity rectangle
Solid border (as opposed to dashed for derived attributes)

┌─────────────────┐
│    EMPLOYEE     │
└────────┬────────┘
         │
    ┌────┴────┐
   ╱          ╲
  (  Salary    )
   ╲          ╱
    └─────────┘

This simple oval representation immediately communicates that the attribute is atomic and single-valued. No additional markers are needed—simplicity is the default assumption.

Crow's Foot (IE) Notation:

In Information Engineering (Crow's Foot) notation, which is more common in industry tools, entities are represented as rectangles with attribute lists:

┌──────────────────┐
│     EMPLOYEE     │
├──────────────────┤
│ employee_id (PK) │
│ first_name       │
│ last_name        │
│ salary           │
│ hire_date        │
└──────────────────┘

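Simple attributes appear as plain entries in the attribute list and need no special marker. Models drawn in this notation are usually at the logical level, where composite and multivalued attributes have typically already been resolved into separate columns or tables, which makes the plain entry the 'default' presentation.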

Notation Consistency

Regardless of notation style, simple attributes share a common characteristic: they are presented in their most basic form without special markers, connecting lines to sub-elements, or multiple value indicators. The absence of complexity indicators signals simplicity.

UML Class Diagram Style:

When using UML for data modeling (increasingly common with object-relational mapping), simple attributes appear as:

┌────────────────────────────┐
│         Employee           │
├────────────────────────────┤
│ - employeeId: Integer      │
│ - salary: Decimal          │
│ - hireDate: Date           │
│ - isActive: Boolean        │
├────────────────────────────┤
│ + getSalary(): Decimal     │
└────────────────────────────┘

The notation shows each attribute's visibility (- for private, + for public), name, and data type. Again, simple attributes are the baseline, with no special annotations needed.

Data Types and Domains for Simple Attributes

Simple attributes draw their values from defined domains—sets of acceptable values with associated data types. Understanding the relationship between conceptual attributes and implementation types is crucial for effective database design.

What is a Domain?

A domain is the set of all possible legal values for an attribute. For a simple attribute, the domain is:

Atomic — Each value in the domain is indivisible
Homogeneous — All values share the same data type
Constrained — Business rules may further limit acceptable values

Common Simple Attribute Domains
Conceptual Type	SQL Data Types	Domain Constraints	Examples
Identifier	INTEGER, BIGINT, UUID	Unique, non-null	employee_id, order_number
Monetary	DECIMAL(p,s), NUMERIC	Precision rules; non-negative where the business rule requires it	salary, price, balance
Quantity	INTEGER, SMALLINT	Non-negative, range limits	stock_level, seat_count
Measurement	FLOAT, DOUBLE, DECIMAL	Range constraints, precision	temperature, weight, distance
Text (short)	VARCHAR(n), CHAR(n)	Length limits, character sets	name, code, abbreviation
Text (long)	TEXT, CLOB	Size limits (if any)	description, notes
Date/Time	DATE, TIME, TIMESTAMP	Valid ranges, timezone rules	birth_date, created_at
Boolean	BOOLEAN, BIT	True/False/Unknown	is_active, has_paid
Enumeration	ENUM, CHECK constraint	Fixed set of values	status, priority, gender

Domain Clarity Prevents Errors

Explicitly defining domains during conceptual modeling prevents implementation errors. If 'status' can only be 'active', 'suspended', or 'closed', document this constraint. Don't leave it to be 'discovered' during development—it belongs in the data model.
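One way to carry such a documented rule into the schema is a domain with a CHECK constraint. The sketch below assumes PostgreSQL; the names account_status and accounts are hypothetical.

```sql
-- Hypothetical domain capturing the documented rule for 'status'.
CREATE DOMAIN account_status AS VARCHAR(10)
    CHECK (VALUE IN ('active', 'suspended', 'closed'));

CREATE TABLE accounts (
    account_id  INTEGER PRIMARY KEY,
    status      account_status NOT NULL DEFAULT 'active'
);

-- Accepted: status falls back to the default 'active'.
INSERT INTO accounts (account_id) VALUES (1);

-- Accepted: 'closed' is in the allowed set.
INSERT INTO accounts (account_id, status) VALUES (2, 'closed');

-- Rejected: 'frozen' is not in the domain, so the CHECK fails.
INSERT INTO accounts (account_id, status) VALUES (3, 'frozen');
```

The last statement is expected to raise an error. The rule now lives in the database instead of only in application code.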

Type-Attribute Alignment:

Choosing the right data type for a simple attribute involves several considerations:

Precision requirements — Will you need exact decimal values (money) or are floating-point approximations acceptable (scientific measurements)?
Storage efficiency — An INTEGER (4 bytes) might suffice where BIGINT (8 bytes) would waste space for millions of rows.
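Value range — An unsigned one-byte integer (0 to 255, such as TINYINT in SQL Server or TINYINT UNSIGNED in MySQL) works for age, though PostgreSQL has no TINYINT and SMALLINT is its smallest integer type; a SMALLINT (-32,768 to 32,767) won't hold a world population count.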
Null semantics — Can the attribute be unknown/missing? If not, enforce NOT NULL.
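Indexing behavior — Some types index more efficiently than others, and the details are system-specific; in MySQL, for example, indexing a TEXT column requires a prefix length, while a VARCHAR(255) column can usually be indexed in full.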
Comparison semantics — Case-sensitive vs. case-insensitive string comparison; timezone-aware vs. naive timestamps.
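The precision and value-range considerations can be checked directly in PostgreSQL. The statements below are a small sketch using only literals, so no tables are assumed.

```sql
-- Binary floating point (float8 = double precision) cannot
-- represent 0.1 or 0.2 exactly, so the sum is not exactly 0.3.
SELECT 0.1::float8 + 0.2::float8 = 0.3::float8 AS float_sum_matches;       -- false

-- NUMERIC (DECIMAL) stores exact decimal digits.
SELECT 0.1::numeric + 0.2::numeric = 0.3::numeric AS numeric_sum_matches;  -- true

-- Range check: SMALLINT tops out at 32,767.
SELECT 40000::smallint;   -- fails with "smallint out of range"
```

This is why monetary attributes are normally given an exact decimal type, while approximate types are kept for measurements that tolerate rounding.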

domain-definitions.sql
SQL
-- Defining domains for simple attributes in PostgreSQL
CREATE DOMAIN positive_money AS DECIMAL(15,2)
    CHECK (VALUE >= 0);
 
CREATE DOMAIN email_address AS VARCHAR(254)
    CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');
 
CREATE DOMAIN percentage AS DECIMAL(5,2)
    CHECK (VALUE >= 0 AND VALUE <= 100);
 
CREATE DOMAIN us_phone AS VARCHAR(14)
    CHECK (VALUE ~ '^\(\d{3}\) \d{3}-\d{4}$');
 
-- Using domains in table definitions
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email email_address UNIQUE NOT NULL,
    salary positive_money NOT NULL,
    commission_rate percentage,
    phone us_phone
);
 
-- Each attribute maps to a single column
-- Each column has explicit domain constraints
-- Simple attributes = simple storage
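As a usage sketch, the following statements assume the domains and the employees table defined above already exist. The sample names, email addresses, and phone number are made up for illustration.

```sql
-- Accepted: every value falls inside its domain.
INSERT INTO employees (first_name, last_name, email, salary, commission_rate, phone)
VALUES ('Alice', 'Johnson', 'alice.johnson@example.com', 75000.00, 5.00, '(555) 010-1234');

-- Rejected: -1 violates the CHECK on positive_money.
INSERT INTO employees (first_name, last_name, email, salary)
VALUES ('Bob', 'Williams', 'bob.williams@example.com', -1);

-- Rejected: 150 is outside the 0 to 100 range of percentage.
INSERT INTO employees (first_name, last_name, email, salary, commission_rate)
VALUES ('Carol', 'Brown', 'carol.brown@example.com', 88000.00, 150);

-- Rejected: the phone value does not match the (NNN) NNN-NNNN pattern.
INSERT INTO employees (first_name, last_name, email, salary, phone)
VALUES ('David', 'Miller', 'david.miller@example.com', 55000.00, '555-0101');
```

The three rejected statements are expected to raise errors. Each failure is reported against the domain, so the rule is enforced the same way in every table that uses it.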

Simple Attributes in Entity Instances

Understanding how simple attributes manifest in actual entity instances helps cement the conceptual model. Each instance of an entity has specific values for its simple attributes—this is where abstraction meets data.

From Schema to Instance:

Consider an Employee entity with simple attributes:

Employee Entity Instances
employee_id	first_name	last_name	salary	hire_date	is_active
1001	Alice	Johnson	75000.00	2020-03-15	true
1002	Bob	Williams	62500.00	2021-07-22	true
1003	Carol	Brown	88000.00	2019-01-10	true
1004	David	Miller	55000.00	2022-11-01	false
1005	Eve	Davis	NULL	2023-06-15	true

Observations about these instances:

Each attribute holds exactly one value per row — There's no notion of multiple salaries or multiple hire dates for a single employee (simple + single-valued).
Values are atomic — 'Alice' isn't subdivided; '75000.00' stands as one value; '2020-03-15' is treated as a complete date.
NULL is possible — Eve's salary is NULL (unknown/not yet assigned), which is different from having a value of zero. NULLs are valid for simple attributes unless constrained otherwise.
Each value comes from its domain — Salaries are decimals, dates are dates, booleans are booleans. There's no type mixing within an attribute.
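The difference between NULL and zero shows up as soon as the values are aggregated. The sketch below copies the salary column from the sample rows into a hypothetical employee_sample table so that it can be run on its own in PostgreSQL.

```sql
-- Self-contained copy of the sample rows (salary column only).
CREATE TABLE employee_sample (
    employee_id  INTEGER PRIMARY KEY,
    salary       DECIMAL(10,2)          -- nullable on purpose
);

INSERT INTO employee_sample (employee_id, salary) VALUES
    (1001, 75000.00),
    (1002, 62500.00),
    (1003, 88000.00),
    (1004, 55000.00),
    (1005, NULL);

-- Aggregates skip NULL: the average is taken over four salaries.
SELECT AVG(salary) FROM employee_sample;                -- 70125

-- Treating the unknown salary as zero changes the answer.
SELECT AVG(COALESCE(salary, 0)) FROM employee_sample;   -- 56100

-- NULL is never equal to anything, so it is found with IS NULL.
SELECT employee_id FROM employee_sample WHERE salary IS NULL;   -- 1005
```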

The Single Column Rule

A reliable indicator that you've correctly identified a simple attribute: it maps naturally to exactly one column in a relational table. If you find yourself wanting multiple columns for one 'attribute,' it's probably composite. If you need multiple rows for one entity's 'attribute,' it's probably multivalued.

Value Constraints on Instances:

Beyond data type constraints, simple attribute values may be constrained by:

NOT NULL — The attribute must always have a value
UNIQUE — No two entities can have the same value for this attribute
CHECK constraints — Values must satisfy a boolean condition
FOREIGN KEY (for references) — Values must exist in a referenced table
DEFAULT values — A value to use if none is explicitly provided

These constraints ensure that even simple attributes maintain data integrity and reflect real-world business rules.
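A compact sketch of all five constraint kinds on simple attributes, assuming PostgreSQL. The departments and staff tables are hypothetical.

```sql
-- Referenced table for the foreign key below.
CREATE TABLE departments (
    department_id  INTEGER PRIMARY KEY,
    name           VARCHAR(50) NOT NULL UNIQUE
);

CREATE TABLE staff (
    staff_id       INTEGER PRIMARY KEY,
    -- NOT NULL and UNIQUE: always present, never shared
    email          VARCHAR(254) NOT NULL UNIQUE,
    -- CHECK: the value must satisfy a boolean condition
    salary         DECIMAL(10,2) CHECK (salary >= 0),
    -- FOREIGN KEY: the value must exist in departments
    department_id  INTEGER REFERENCES departments (department_id),
    -- DEFAULT: used when no value is supplied
    is_active      BOOLEAN NOT NULL DEFAULT TRUE
);
```

Every constraint attaches to a single column, which is possible because each attribute is simple.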

Practical Examples Across Domains

Let's examine simple attributes in several real-world entity designs, reinforcing proper identification and representation.

Product Entity in an E-Commerce System:

Consider modeling products for an online store:

Product Simple Attributes
Attribute	Type	Constraints	Notes
product_id	INTEGER	PK, NOT NULL	Unique identifier
sku	VARCHAR(20)	UNIQUE, NOT NULL	Stock keeping unit
name	VARCHAR(200)	NOT NULL	Display name
unit_price	DECIMAL(10,2)	CHECK >= 0	Price per unit
weight_kg	DECIMAL(8,3)	CHECK > 0	Shipping weight
stock_quantity	INTEGER	CHECK >= 0, DEFAULT 0	Current inventory
is_active	BOOLEAN	DEFAULT true	Available for sale
created_at	TIMESTAMP	DEFAULT NOW()	Record creation time

Why these are simple:

sku — Although formatted (ABC-12345), it's used as a single identifier
unit_price — Single monetary value; no separate cents column
weight_kg — Single measurement; unit is standardized
stock_quantity — Single integer; no subdivision needed
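The table above translates almost line for line into DDL. This is one possible PostgreSQL rendering; the table name products and the sample row are assumptions.

```sql
CREATE TABLE products (
    product_id      INTEGER PRIMARY KEY,                 -- PK implies NOT NULL
    sku             VARCHAR(20) UNIQUE NOT NULL,
    name            VARCHAR(200) NOT NULL,
    unit_price      DECIMAL(10,2) CHECK (unit_price >= 0),
    weight_kg       DECIMAL(8,3) CHECK (weight_kg > 0),
    stock_quantity  INTEGER DEFAULT 0 CHECK (stock_quantity >= 0),
    is_active       BOOLEAN DEFAULT TRUE,
    created_at      TIMESTAMP DEFAULT NOW()
);

-- Only some columns are supplied; the defaults fill in
-- stock_quantity, is_active, and created_at.
INSERT INTO products (product_id, sku, name, unit_price, weight_kg)
VALUES (1, 'ABC-12345', 'Example Widget', 19.99, 0.250);
```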

Common Mistakes with Simple Attributes

Modeling simple attributes seems straightforward, but several common mistakes can lead to poor designs. Recognizing these pitfalls helps you avoid them.

Mistakes to Avoid

•Treating composite as simple — Storing 'John Smith' instead of separate first_name and last_name when you need to sort by last name or personalize with first name only. This is often described as conflicting with the atomicity requirement of First Normal Form, and it makes queries harder to write.
•Over-atomizing — Splitting a date into year, month, day columns when you never query them individually. Sometimes what seems composite should stay together for usability.
•Ignoring domain constraints — Declaring an attribute as VARCHAR without considering valid patterns (email, phone, postal code). Simple doesn't mean unconstrained.
•Embedding units in values — Storing '150 kg' instead of storing 150 with an implicit or separate unit. The value '150 kg' isn't truly atomic—it's a value plus metadata.
•Using wrong data types — Storing numeric values as strings (prices as VARCHAR) because 'they might have dollar signs.' Clean the data; store it correctly.
•Nullable by default — Not thinking about whether NULL is valid for each attribute. Every simple attribute should have a conscious nullability decision.
•Precision mismatches — Using FLOAT for money (leading to rounding errors) or INTEGER for quantities that could be fractional.
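The 'wrong data types' mistake is easy to demonstrate. The sketch below assumes PostgreSQL and uses inline VALUES lists in place of real tables; COLLATE "C" is specified so that the text comparison is byte by byte regardless of the database locale.

```sql
-- Prices stored as text compare character by character.
SELECT price_text
FROM (VALUES ('9.99'), ('10.50'), ('100.00')) AS t(price_text)
ORDER BY price_text COLLATE "C";
-- '9.99' comes last, because the character '9' sorts after '1'

-- The same values stored as NUMERIC sort by magnitude.
SELECT price
FROM (VALUES (9.99), (10.50), (100.00)) AS t(price)
ORDER BY price;
-- 9.99, 10.50, 100.00
```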

The 'Phone Number' Test Case

Phone numbers are a classic gray area. In most business applications, storing the complete phone number as a single VARCHAR is correct—it's used for dialing and display. But in telecommunications analysis, you might need country code, area code, and subscriber number separately. Neither approach is universally 'right'—context determines correct modeling.

The 'Will I Need It' Heuristic:

When deciding whether to treat something as simple or subdivide it, ask:

Will I ever need to query by just part of this value?
Will I ever need to update just part of this value?
Will I ever need to validate parts independently?
Will I ever need to display parts separately?

If the answer to all four is 'no,' keep it simple. If any is 'yes,' consider making it composite, a choice covered on the next page.
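One nuance is worth a sketch: for types with built-in part access, a 'yes' to the first question does not by itself force a split. In PostgreSQL, a DATE column stays a single simple attribute while its year remains queryable. The orders table and its rows below are hypothetical.

```sql
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_date  DATE NOT NULL
);

INSERT INTO orders (order_id, order_date) VALUES
    (1, '2023-06-15'),
    (2, '2024-01-10'),
    (3, '2024-03-02');

-- The date is one column, yet the year is still reachable.
SELECT order_id
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = 2024;
-- orders 2 and 3

-- A range predicate returns the same rows and can use an
-- ordinary index on order_date.
SELECT order_id
FROM orders
WHERE order_date >= DATE '2024-01-01'
  AND order_date <  DATE '2025-01-01';
```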

Summary: Simple Attributes

Simple attributes form the foundation of entity description. They represent atomic, indivisible pieces of information that characterize entities in your data model. Let's consolidate our understanding:

Key Takeaways

•Atomicity is the defining characteristic — Simple attributes cannot be meaningfully subdivided; their values are complete and indivisible.
•Simplicity is context-dependent — What's simple in one application may be composite in another. Model based on how data will actually be used.
•Simple maps to simple — A simple attribute becomes exactly one column in a relational table, with one value per row.
•Domains constrain values — Every simple attribute has a domain defining acceptable values, data type, and constraints.
•Notation is consistent — In ER diagrams, simple attributes are the default—no special markers needed.
•Ask the right questions — Use the 'will I need to query/update/validate/display parts?' test to confirm simplicity.

What's Next:

Not all attributes are atomic. On the next page, we explore composite attributes: attributes that have meaningful internal structure, consisting of sub-components that may need to be accessed independently. Understanding when to use composite versus simple attributes is a critical skill in effective ER modeling.

Page Complete

You now understand simple attributes—the atomic building blocks of entity description. You can identify them, represent them in various notations, define their domains, and avoid common modeling mistakes. Next, we'll extend this foundation to handle attributes with internal structure: composite attributes.