Lesson 3	Entity identifiers
Objective	Explain the Purpose of Entity Identifiers

Purpose of Entity Identifiers in Database Design

The purpose of entity identifiers in database design is to uniquely distinguish each record (or instance) within a table or entity. These identifiers, often called primary keys, play a crucial role in ensuring the integrity and structure of the database. Here's why entity identifiers are important:

Uniqueness: Entity identifiers ensure that each record is unique. This prevents duplicate entries and maintains data integrity by allowing the system to accurately identify and retrieve a specific record without ambiguity.
Efficient Data Retrieval: By using entity identifiers, databases can quickly search and retrieve data. Indexes are often built around these identifiers, improving the efficiency of queries and lookups.
Relationship Mapping: In relational databases, entity identifiers are essential for establishing relationships between tables. For example, a foreign key in one table will reference the primary key (entity identifier) in another, allowing the system to link related data across multiple entities.
Consistency: Entity identifiers help maintain data consistency. By enforcing the uniqueness of records, the database ensures that all related records refer to the correct instance of an entity, preventing orphaned or inconsistent data.
Data Integrity: In complex systems, entity identifiers play a role in enforcing referential integrity. This ensures that relationships between tables remain consistent, preventing data anomalies such as broken links or invalid references.
Scalability: As systems grow, entity identifiers allow for scalable design by supporting efficient management of large datasets, making it easier to maintain, query, and update data.

In summary, entity identifiers form the backbone of database organization by uniquely identifying records, facilitating efficient data retrieval, enforcing relationships, and ensuring data integrity throughout the system.

Relational Database Design Implementation

Entites are represented as Tables in a Database

The reason for having a database is to store data values about entities, and then to retrieve the data values regarding those entities as needed. In order to accomplish this, there must be some way to distinguish one entity from another. Entity identifiers perform this function. Entity identifiers are attributes, specifically, key attributes that uniquely identify each entity. An entity identifier is not an optional attribute; every entity must have a key attribute to uniquely identify it.
Entity identifiers (key attributes) become primary keys in a table.
Composite Entity
A composite entity is also known as a bridge entity. This bridge is used to handle the many-to-many relationships that the traditional entity could not handle. This entity lies between the two entities that are of interest and this composite entity shares the primary keys from both the connecting tables. This composite entity is also known as a gerund because it has the characteristics of an entity and a relationship. You will learn about a composite entity later in the course. The composite entity exists only to link two other entities together. A composite entity has no entity identifier of its own. Instead, it receives the entity identifiers from each of the two entities it serves to link, and combines them to form a composite entity identifier (usually called a composite key attribute).

Composite or Associative Entity) is introduced:

In relational database design, when you have two entities with a many-to-many relationship, directly implementing this relationship can lead to complications. For example, if you have entities like Students and Courses, where each student can enroll in multiple courses and each course can have multiple students, a direct many-to-many relationship would require a complex join table or could lead to data redundancy.
To resolve this, a composite entity (also known as an associative or junction entity) is introduced:

Purpose: It serves to break down the many-to-many relationship into two one-to-many relationships.
Structure: The composite entity will typically have:
- Foreign keys from both related entities.
- Optionally, additional attributes that describe the relationship itself (like enrollment date, grade, etc., in the case of Students and Courses).

Here's a simple example in SQL to illustrate:

-- Table for Students
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(100)
);

-- Table for Courses
CREATE TABLE Courses (
    CourseID INT PRIMARY KEY,
    CourseName VARCHAR(100)
);

-- Composite entity table for Enrollment
CREATE TABLE Enrollment (
    EnrollmentID INT PRIMARY KEY,
    StudentID INT,
    CourseID INT,
    EnrollmentDate DATE,
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

In this model:

Enrollment is the composite entity linking Students and Courses.
Each row in Enrollment represents an individual relationship between a student and a course, allowing for additional data like EnrollmentDate.

This approach not only normalizes the data but also makes querying and maintaining data integrity much simpler.

The composite entity is an entity that is created to represent a many-to-many relationship between two or more entities in a database. It is also known as a

bridge entity,
associative entity, or
junction entity.

In a many-to-many relationship, one instance of an entity can be related to multiple instances of another entity, and vice versa. For example, in a database for a university, a many-to-many relationship exists between students and courses, as a student can enroll in multiple courses, and a course can have multiple students enrolled in it.
To represent this many-to-many relationship, a composite entity can be created. The composite entity has its own attributes and primary key, and it serves as a connector between the two entities it relates. In the case of the student and course entities, a composite entity called "enrollment" could be created, which would have attributes such as "semester," "grade," and "attendance." The enrollment entity would have foreign keys that reference the primary keys of the student and course entities, allowing it to establish a many-to-many relationship between them. In this way, the enrollment entity acts as a bridge or junction between the student and course entities, allowing them to be connected in a meaningful way. Composite entities are important in data modeling because they allow for complex relationships between entities to be represented in a structured and organized way. By breaking down many-to-many relationships into smaller, more manageable relationships, composite entities can help make a database more efficient and easier to understand.

Entity Identifier Attribute

An entity is something about which we store data. For example, a customer is an entity, as is a merchandise item stocked by Distributed Networks. Entities are not necessarily tangible. For example, a concert is an entity.
Entities have data that describe them (their attributes). For example, a customer entity is usually described by a

customer number,
first name,
last name,
street,
city,
state,
zip code, and
phone number.

A concert entity might be described using the following attributes: 1) title, 2) date, 3) location, and 4) name of the performer.

The image represents an "unstructured representation of data", where different attributes related to customers are scattered in irregular shapes.
This visualizationillustratea the concept of unnormalized data in database design or data modeling.
**Analysis of Image Elements:**

**Clusters of Data (Irregular Shapes)**

Each shape contains a set of attributes related to a **customer**.

The scattered nature indicates a **lack of structure or normalization**, making it difficult to retrieve or analyze efficiently.

**Customer Data Fields Identified:**
Each customer entity contains:

**Customer ID** (e.g., #0985, #1212, #0081)

**Name** (e.g., Jane Doe, John Doe, Sam Smith)

**Address** (e.g., 185 Main Street, 89 West Ave, 2288 North Road)

**City and State** (e.g., This Town, ST)

**Zip Code** (e.g., 11111)

**Credit Card Number** (e.g., 5555 1111 2222 3333)

**Phone Number** (e.g., (555) 555-1111)

**Date (Possibly Date of Joining or Last Transaction)** (e.g., 12/02, 11/01, 06/02)

**Data Representation Issue:**

The **irregular arrangement of attributes** suggests a **lack of proper entity structuring**.

In a **well-normalized relational database**, this data would be **organized into structured tables with clear relationships** rather than scattered blocks.

**Possible Interpretation in Database Design:**
This could be illustrating:

Unstructured or poorly organized data storage, which is inefficient.

The need for normalization by structuring attributes into a database table.

Redundancy issues where the same fields are formatted inconsistently across different entities.

Figure 5-3 Instances of a customer entity in a database, where each customer has a unique number.

When we represent entities in a database, we actually store only the attributes. Each group of attributes that describes a single real-world occurrence of an entity acts to represent an instance of an entity. For example, in Figure 5-3, you can see three instances of a customer entity stored in a database. If we have 1000 customers in our database, there will be 1000 collections of customer attributes.
The next lesson lists two rules for creating entity identifiers.