Every entity in a database must be described. A Customer isn't just an abstract concept—it has a name, an email address, a date of birth, a credit limit. A Product has a price, a weight, a SKU number. These descriptive properties are what we call attributes, and they form the vocabulary through which we capture real-world information in our data models.
But not all attributes are created equal. Some are simple and atomic; others are complex and composite. Some can hold multiple values; others are derived from calculations. Understanding these distinctions isn't merely academic—it directly affects how your database schema is designed, how queries perform, and how data integrity is maintained.
We begin with the most fundamental type: simple attributes.
By the end of this page, you will understand what makes an attribute 'simple,' why atomicity is the defining characteristic, how to identify simple attributes in real-world scenarios, and how they map to implementation in relational databases. You'll develop the discernment to distinguish simple attributes from their more complex counterparts.
A simple attribute (also called an atomic attribute) is an attribute that cannot be meaningfully subdivided into smaller components. It represents a single, indivisible unit of information about an entity.
The Atomicity Principle:
The word 'atomic' comes from the Greek atomos meaning 'uncuttable.' In database modeling, an atomic attribute is one where breaking it into smaller parts would lose meaning or create unnecessary complexity for the domain being modeled.
Consider a person's age. The value '35' is atomic—there's no meaningful way to subdivide it. You can't split '35' into '3' and '5' and derive any useful information. The value stands alone, complete and indivisible.
Contrast this with a full name like 'John Michael Smith.' This could be subdivided into first name, middle name, and last name—each component carrying independent meaning. Whether you treat this as simple or composite depends on your requirements, but the possibility of meaningful subdivision is the key distinction.
Whether an attribute is 'simple' often depends on the application domain. A phone number might be simple in one system (just a string to store) but composite in a telecommunications system (country code, area code, subscriber number, extension). Always model attributes based on how they'll actually be used, not on theoretical divisibility.
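The phone-number case can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module and an in-memory database; the table names, column names, and sample number are hypothetical and not part of any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simple: the phone number is one opaque string stored in one column.
conn.execute("CREATE TABLE contact_simple (name TEXT, phone TEXT)")
conn.execute("INSERT INTO contact_simple VALUES ('Alice', '+1 212 5550147')")

# Composite: the same number split into the parts the application queries by.
conn.execute(
    "CREATE TABLE contact_composite ("
    " name TEXT, country_code TEXT, area_code TEXT, subscriber TEXT)"
)
conn.execute(
    "INSERT INTO contact_composite VALUES ('Alice', '1', '212', '5550147')"
)

# With the simple model the value is stored and read back whole.
simple_phone = conn.execute(
    "SELECT phone FROM contact_simple WHERE name = 'Alice'"
).fetchone()[0]

# With the composite model a single part can be filtered on directly.
names_in_212 = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM contact_composite WHERE area_code = '212'"
    )
]

print(simple_phone)
print(names_in_212)
```

Both tables hold the same real-world fact. Which one is appropriate depends on whether the application ever needs the parts on their own.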
Formal Definition:
In Entity-Relationship modeling, a simple attribute is defined as:
An attribute whose values are atomic and cannot be further decomposed into meaningful sub-components within the context of the data model.
Key characteristics of simple attributes:
Recognizing simple attributes requires asking the right questions about your data. Here's a systematic approach to identification:
Common Examples of Simple Attributes:
Let's examine attributes that are almost universally treated as simple:
| Entity | Attribute | Domain | Why It's Simple |
|---|---|---|---|
| Employee | Salary | Decimal/Currency | A single monetary value; no meaningful internal subdivision |
| Product | Weight | Decimal + Unit | A single measurement; the unit is typically standardized |
| Order | OrderDate | Date | A single point in time; date components (year, month, day) are rarely queried independently in ordering context |
| Student | GPA | Decimal | A calculated but atomic result; you don't query by the 'first digit' of a GPA |
| Book | ISBN | String (formatted) | A standardized identifier; though structured, it's used as a whole unit |
| Sensor | Temperature | Decimal | A single reading value at a point in time |
| Account | Balance | Decimal/Currency | A single monetary value representing account state |
| Vehicle | Year | Integer | The manufacturing year; atomic in automotive contexts |
Some attributes appear simple but carry hidden complexity. A 'ZipCode' seems atomic, but in the US, ZIP+4 codes have two components (ZIP-5 and ZIP-4) sometimes used independently. An 'SSN' (Social Security Number) has three segments historically used to encode location and chronology. Model based on actual usage requirements, not surface appearance.
In Entity-Relationship diagrams, simple attributes are depicted using standard notation that has remained consistent since Peter Chen's original 1976 paper. Understanding this notation is essential for communicating database designs.
Standard (Chen) Notation:
In Chen notation—the original and still widely-taught ER notation—simple attributes are represented as:
    ┌─────────────────┐
    │    EMPLOYEE     │
    └────────┬────────┘
             │
        ┌────┴────┐
       ╱           ╲
      (   Salary    )
       ╲           ╱
        └─────────┘
This simple oval representation immediately communicates that the attribute is atomic and single-valued. No additional markers are needed—simplicity is the default assumption.
Crow's Foot (IE) Notation:
In Information Engineering (Crow's Foot) notation, which is more common in industry tools, entities are represented as rectangles with attribute lists:
    ┌──────────────────┐
    │     EMPLOYEE     │
    ├──────────────────┤
    │ employee_id (PK) │
    │ first_name       │
    │ last_name        │
    │ salary           │
    │ hire_date        │
    └──────────────────┘
Simple attributes appear as plain entries in the attribute list. They don't require any special marker—composite and multivalued attributes are marked differently, making simple the 'default' presentation.
Regardless of notation style, simple attributes share a common characteristic: they are presented in their most basic form without special markers, connecting lines to sub-elements, or multiple value indicators. The absence of complexity indicators signals simplicity.
UML Class Diagram Style:
When using UML for data modeling (increasingly common with object-relational mapping), simple attributes appear as:
    ┌────────────────────────────┐
    │          Employee          │
    ├────────────────────────────┤
    │ - employeeId: Integer      │
    │ - salary: Decimal          │
    │ - hireDate: Date           │
    │ - isActive: Boolean        │
    ├────────────────────────────┤
    │ + getSalary(): Decimal     │
    └────────────────────────────┘
The notation shows attribute name, data type, and visibility (public/private). Again, simple attributes are the baseline—no special annotations needed.
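As a rough illustration of how such a class might look in code, here is a sketch using a Python dataclass. The mapping is an assumption made for illustration: Python does not enforce the private visibility shown in the diagram, and the snake_case names are simply the Python spelling of the UML attribute names.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal


@dataclass
class Employee:
    # Each simple attribute becomes exactly one typed field.
    employee_id: int
    salary: Decimal
    hire_date: date
    is_active: bool

    def get_salary(self) -> Decimal:
        # Mirrors the getSalary() operation from the diagram.
        return self.salary


emp = Employee(1001, Decimal("75000.00"), date(2020, 3, 15), True)
field_names = [f.name for f in fields(Employee)]

print(field_names)
print(emp.get_salary())
```

Each field holds a single scalar value, which is the code-level counterpart of an attribute with no sub-structure and no repetition.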
Simple attributes draw their values from defined domains—sets of acceptable values with associated data types. Understanding the relationship between conceptual attributes and implementation types is crucial for effective database design.
What is a Domain?
A domain is the set of all possible legal values for an attribute. For a simple attribute, the domain is:
| Conceptual Type | SQL Data Types | Domain Constraints | Examples |
|---|---|---|---|
| Identifier | INTEGER, BIGINT, UUID | Unique, non-null | employee_id, order_number |
| Monetary | DECIMAL(p,s), NUMERIC | Non-negative, precision rules | salary, price, balance |
| Quantity | INTEGER, SMALLINT | Non-negative, range limits | stock_level, seat_count |
| Measurement | FLOAT, DOUBLE, DECIMAL | Range constraints, precision | temperature, weight, distance |
| Text (short) | VARCHAR(n), CHAR(n) | Length limits, character sets | name, code, abbreviation |
| Text (long) | TEXT, CLOB | Size limits (if any) | description, notes |
| Date/Time | DATE, TIME, TIMESTAMP | Valid ranges, timezone rules | birth_date, created_at |
| Boolean | BOOLEAN, BIT | True/False/Unknown | is_active, has_paid |
| Enumeration | ENUM, CHECK constraint | Fixed set of values | status, priority, gender |
Explicitly defining domains during conceptual modeling prevents implementation errors. If 'status' can only be 'active', 'suspended', or 'closed', document this constraint. Don't leave it to be 'discovered' during development—it belongs in the data model.
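A documented domain like this can be carried straight into the schema. The sketch below assumes Python's built-in sqlite3 module and a hypothetical account table; the rejected value 'pending' is made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The documented domain of 'status' becomes a CHECK constraint.
conn.execute(
    "CREATE TABLE account ("
    " account_id INTEGER PRIMARY KEY,"
    " status TEXT NOT NULL"
    "   CHECK (status IN ('active', 'suspended', 'closed')))"
)

# A value inside the domain is accepted.
conn.execute("INSERT INTO account (status) VALUES ('active')")

try:
    conn.execute("INSERT INTO account (status) VALUES ('pending')")
    rejected = False
except sqlite3.IntegrityError:
    # 'pending' is outside the domain, so the database refuses the row.
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, row_count)
```

Because the rule lives in the table definition, every application that writes to the table is held to the same set of values.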
Type-Attribute Alignment:
Choosing the right data type for a simple attribute involves several considerations:
Precision requirements: Will you need exact decimal values (money), or are floating-point approximations acceptable (scientific measurements)?
Storage efficiency: A BIGINT (8 bytes) wastes space across millions of rows when an INTEGER (4 bytes) would suffice.
Value range: A TINYINT holding 0 to 255 (SQL Server's TINYINT or MySQL's TINYINT UNSIGNED; PostgreSQL has no TINYINT type) works for age. A SMALLINT (-32,768 to 32,767) cannot hold a world population count, and neither can a 32-bit INTEGER (maximum 2,147,483,647); that value needs BIGINT.
Null semantics: Can the attribute be unknown or missing? If not, enforce NOT NULL.
Indexing behavior: Some types index more efficiently than others. In MySQL, for example, a TEXT column can be indexed only with an explicit prefix length, while a VARCHAR column can usually be indexed directly; PostgreSQL treats the two alike.
Comparison semantics: Case-sensitive vs. case-insensitive string comparison; timezone-aware vs. naive timestamps.
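The precision and range considerations can be checked with plain arithmetic. This sketch uses Python's float and decimal.Decimal as stand-ins for SQL FLOAT and DECIMAL; the 8 billion figure is an illustrative round number, not a measured value.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 exactly, so repeated sums drift.
float_total = sum([0.10] * 3)

# Decimal keeps exact base-10 digits, much like SQL DECIMAL/NUMERIC.
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

float_is_exact = float_total == 0.30
decimal_is_exact = decimal_total == Decimal("0.30")

# A signed 32-bit INTEGER tops out at 2**31 - 1.
INT32_MAX = 2**31 - 1
illustrative_count = 8_000_000_000  # assumed round figure for a very large count
fits_in_integer = illustrative_count <= INT32_MAX

print(float_is_exact, decimal_is_exact, fits_in_integer)
```

The float comparison fails while the decimal one holds, which is the usual argument for storing money in an exact decimal type.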
    -- Defining domains for simple attributes in PostgreSQL
    CREATE DOMAIN positive_money AS DECIMAL(15,2)
        CHECK (VALUE >= 0);

    CREATE DOMAIN email_address AS VARCHAR(254)
        CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

    CREATE DOMAIN percentage AS DECIMAL(5,2)
        CHECK (VALUE >= 0 AND VALUE <= 100);

    CREATE DOMAIN us_phone AS VARCHAR(14)
        CHECK (VALUE ~ '^\(\d{3}\) \d{3}-\d{4}$');

    -- Using domains in table definitions
    CREATE TABLE employees (
        employee_id     SERIAL PRIMARY KEY,
        first_name      VARCHAR(50) NOT NULL,
        last_name       VARCHAR(50) NOT NULL,
        email           email_address UNIQUE NOT NULL,
        salary          positive_money NOT NULL,
        commission_rate percentage,
        phone           us_phone
    );

    -- Each attribute maps to a single column
    -- Each column has explicit domain constraints
    -- Simple attributes = simple storage

Understanding how simple attributes manifest in actual entity instances helps cement the conceptual model. Each instance of an entity has specific values for its simple attributes; this is where abstraction meets data.
From Schema to Instance:
Consider an Employee entity with simple attributes:
| employee_id | first_name | last_name | salary | hire_date | is_active |
|---|---|---|---|---|---|
| 1001 | Alice | Johnson | 75000.00 | 2020-03-15 | true |
| 1002 | Bob | Williams | 62500.00 | 2021-07-22 | true |
| 1003 | Carol | Brown | 88000.00 | 2019-01-10 | true |
| 1004 | David | Miller | 55000.00 | 2022-11-01 | false |
| 1005 | Eve | Davis | NULL | 2023-06-15 | true |
Observations about these instances:
Each attribute holds exactly one value per row — There's no notion of multiple salaries or multiple hire dates for a single employee (simple + single-valued).
Values are atomic — 'Alice' isn't subdivided; '75000.00' stands as one value; '2020-03-15' is treated as a complete date.
NULL is possible — Eve's salary is NULL (unknown/not yet assigned), which is different from having a value of zero. NULLs are valid for simple attributes unless constrained otherwise.
Each value comes from its domain — Salaries are decimals, dates are dates, booleans are booleans. There's no type mixing within an attribute.
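The difference between NULL and zero shows up as soon as the rows are aggregated. The sketch below loads the salary column from the table above into an in-memory SQLite database (an assumption made for convenience; any SQL engine with standard aggregate semantics behaves the same way on this point).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY, first_name TEXT, salary NUMERIC)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        (1001, "Alice", 75000.00),
        (1002, "Bob", 62500.00),
        (1003, "Carol", 88000.00),
        (1004, "David", 55000.00),
        (1005, "Eve", None),  # None is stored as SQL NULL
    ],
)

# COUNT(*) counts rows; COUNT(salary) and AVG(salary) skip NULLs.
total_rows, known_salaries, avg_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), AVG(salary) FROM employee"
).fetchone()

# Treating the unknown salary as zero would drag the average down.
avg_if_zero = conn.execute(
    "SELECT AVG(COALESCE(salary, 0)) FROM employee"
).fetchone()[0]

print(total_rows, known_salaries, avg_salary, avg_if_zero)
```

Eve's row is counted as an employee but contributes nothing to the salary statistics, which is the behavior an "unknown" value should have.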
A reliable indicator that you've correctly identified a simple attribute: it maps naturally to exactly one column in a relational table. If you find yourself wanting multiple columns for one 'attribute,' it's probably composite. If you need multiple rows for one entity's 'attribute,' it's probably multivalued.
Value Constraints on Instances:
Beyond data type constraints, simple attribute values may be constrained by:
These constraints ensure that even simple attributes maintain data integrity and reflect real-world business rules.
Let's examine simple attributes in several real-world entity designs, reinforcing proper identification and representation.
Product Entity in an E-Commerce System:
Consider modeling products for an online store:
| Attribute | Type | Constraints | Notes |
|---|---|---|---|
| product_id | INTEGER | PK, NOT NULL | Unique identifier |
| sku | VARCHAR(20) | UNIQUE, NOT NULL | Stock keeping unit |
| name | VARCHAR(200) | NOT NULL | Display name |
| unit_price | DECIMAL(10,2) | CHECK >= 0 | Price per unit |
| weight_kg | DECIMAL(8,3) | CHECK > 0 | Shipping weight |
| stock_quantity | INTEGER | CHECK >= 0, DEFAULT 0 | Current inventory |
| is_active | BOOLEAN | DEFAULT true | Available for sale |
| created_at | TIMESTAMP | DEFAULT NOW() | Record creation time |
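The table above translates almost line for line into a table definition. The following is a sketch using Python's sqlite3 module, with two SQLite substitutions that are assumptions of this example: CURRENT_TIMESTAMP stands in for NOW(), and 1 stands in for true. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        product_id     INTEGER PRIMARY KEY,
        sku            VARCHAR(20) NOT NULL UNIQUE,
        name           VARCHAR(200) NOT NULL,
        unit_price     DECIMAL(10,2) CHECK (unit_price >= 0),
        weight_kg      DECIMAL(8,3) CHECK (weight_kg > 0),
        stock_quantity INTEGER DEFAULT 0 CHECK (stock_quantity >= 0),
        is_active      BOOLEAN DEFAULT 1,
        created_at     TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# Omitted columns fall back to their declared defaults.
conn.execute(
    "INSERT INTO product (sku, name, unit_price, weight_kg)"
    " VALUES ('ABC-12345', 'Desk Lamp', 24.99, 1.250)"
)
stock, active = conn.execute(
    "SELECT stock_quantity, is_active FROM product WHERE sku = 'ABC-12345'"
).fetchone()


def is_rejected(sql):
    """Return True if the statement violates a table constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


negative_price_rejected = is_rejected(
    "INSERT INTO product (sku, name, unit_price, weight_kg)"
    " VALUES ('XYZ-00001', 'Bad Row', -1, 1.0)"
)
duplicate_sku_rejected = is_rejected(
    "INSERT INTO product (sku, name, unit_price, weight_kg)"
    " VALUES ('ABC-12345', 'Copy', 5.00, 1.0)"
)

print(stock, active, negative_price_rejected, duplicate_sku_rejected)
```

Every attribute occupies exactly one column, and each constraint from the table is enforced by the database rather than by application code.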
Why these are simple:
sku: Although formatted (ABC-12345), it's used as a single identifier
unit_price: Single monetary value; no separate cents column
weight_kg: Single measurement; unit is standardized
stock_quantity: Single integer; no subdivision needed
Modeling simple attributes seems straightforward, but several common mistakes can lead to poor designs. Recognizing these pitfalls helps you avoid them.
Phone numbers are a classic gray area. In most business applications, storing the complete phone number as a single VARCHAR is correct—it's used for dialing and display. But in telecommunications analysis, you might need country code, area code, and subscriber number separately. Neither approach is universally 'right'—context determines correct modeling.
The 'Will I Need It' Heuristic:
When deciding whether to treat something as simple or subdivide it, ask:
If the answer to all four is 'no,' keep it simple. If any is 'yes,' consider making it composite—which we'll cover in the next page.
Simple attributes form the foundation of entity description. They represent atomic, indivisible pieces of information that characterize entities in your data model. Let's consolidate our understanding:
What's Next:
Not all attributes are atomic. In the next page, we explore composite attributes—attributes that have meaningful internal structure, consisting of sub-components that may need to be accessed independently. Understanding when to use composite versus simple attributes is a critical skill in effective ER modeling.
You now understand simple attributes—the atomic building blocks of entity description. You can identify them, represent them in various notations, define their domains, and avoid common modeling mistakes. Next, we'll extend this foundation to handle attributes with internal structure: composite attributes.