Lesson 3	Key differences between relational and flat file databases
Objective	Understand the key differences between relational and flat file databases.

Differences between Relational and Flat File Databases

The key points to understand about relational and flat file databases are the following:

Relational databases differ from older, non-relational databases because information is stored at random in the relational model, then organized as it is retrieved.
Older databases typically use more proprietary query approaches, often hard-coded applications, as compared with dynamic queries, submitted by both the user and applications.
SQL relies on the foundation of a relational database. The engine you are working with must support the SQL characteristics to querying. Compatibility with SQL is not an assumed capability of older database engines.

Evolution of Database Modeling

The evolution of database modeling occurred when each database model improved upon the previous one. The initial solution was no database model at all using a file system (also known as flat files). You can examine files in the file system of the operating system by

running a dir command in DOS,
an ls command in UNIX, or
searching through the Windows Explorer in Microsoft Windows.

The problem that using a file system presents is no database structure at all. Figure 3.3 below shows the evolutionary process over time from the late 1940s through the year 2000, approximately 50 years later. It is very unlikely that network and hierarchical databases are still in use with the exceptions of IBM's IMS.

Figure 3-3: The evolution of database modeling techniques.
1. File Systems, 2. Hierarchical, 3. Network, 4. Relational, 5. Object, 6. Object-Relational

File System Database Model

Using a file system database model^[1] implies that no modeling techniques are applied and that the database is stored in flat files in a file system, utilizing the structure of the operating system alone. The term flat file is a way of describing a simple text file, containing no structure whatsoever and data is simply dumped in a file. By definition, a comma-delimited file (CSV file) contains structure because it contains commas. By definition, a comma-delimited file is a flat file. However, flat file databases in the past tended to use huge strings, with no commas and no new lines and data items were found based on a position in the file. In this respect, a comma-delimited CSV file used with Excel is not a flat file. Any searching through flat files for data has to be explicitly programmed. The advantage of the various database models is that they provide some of this programming for you. For a file system database, data can be stored in individual files or multiple files. Similar to searching through flat files, any relationships and validation between different flat files would have to be programmed and likely be of limited capability.

Ad Relational Database Design

Flat File Database

A flat file database is a database which is stored on its host computer system as an ordinary, non-indexed flat file. To access the structure of the data and manipulate it, the file must be read in its entirety into the memory of the computer. Upon completion of the database operations, the file is again written out in its entirety to the file system of the host. In this stored mode, the database has no structure which can be seen external to the file and thus is referred to as a flat file. The term has generally implied a small, simple database. As computer memory has become cheaper, larger and more sophisticated databases can now be held in memory in their entirety for faster access, but these would not generally be referred to as flat-file databases.
You may hear about xBase, dBase, Btrieve^[2], or other databases as you work with your applications. These databases are prime examples of the older technologies that traditionally employ flat file approaches. Since the "early days," a whole 17 or so years ago, many of these databases have employed engines that let you query their database with SQL. XBase uses an approach of creating a file for the data, a set of header information that describes how the data is formatted in the database, and one or more separate index files, typically updated by the system. It is not unusual to have an xBase system that has tens or hundreds of database files. With SQL, these files map to tables, which you will learn more about in the coming lessons. With a relational database system, these xBase databases are rolled together inside the overall application database.

Network Data Model

Core Concepts

Nodes (or Objects): Represent the entities in your data. These could be people, products, locations, or almost anything you want to model.
Relationships (or Edges): The connections between nodes. Crucially, relationships are first-class citizens in this model, carrying their own properties and types (e.g., "Friend of", "Purchased", "Located in").
Graph Structure: The overall organization is a graph. Unlike strict hierarchies, nodes can have multiple parents and children, forming complex webs of connections.

Key Strengths of the Network Data Model

Modeling Complex Relationships: Networks models excel at representing interconnected data where the relationships between entities are as important as the entities themselves. Think social networks, supply chains, or recommendation systems.
Relationship-Centric Queries: You can easily query things like:
- Find all friends of a friend
- What's the best route between two locations, considering constraints?
- Are there patterns in how products are purchased together?
Schema Flexibility: Adding new relationship types or modifying node properties doesn't usually require extensive restructuring, unlike in rigid relational models.

Use Cases

Social Networks: Representing friend connections, interests, interactions, and content sharing.
Recommendation Engines: Analyzing user-product relationships to suggest new items or services.
Knowledge Graphs: Interlinking concepts, facts, and ontologies for better information retrieval.
Fraud Detection: Identifying suspicious patterns of relationships between transactions, accounts, or devices.
Network Infrastructure Management: Mapping physical and logical connections between network components.
Logistics and Route Planning: Optimizing delivery routes, taking into account distances and restrictions

Example: A Tiny Social Network Fragment

Nodes:
- Person (Name, Age)
- City (Name)
Relationships:
- LIVES_IN (between Person and City)
- FRIEND_OF (between Person and Person)

Popular Implementations

Neo4: A mature and widely used graph database.
Apache TinkerPop (Gremlin): Provides a graph computing framework and query language.
Many relational databases now support graph-like features or extensions

Considerations

Query Language: Specialized query languages like Cypher (Neo4j) or Gremlin are tailored for graph traversal.
Not for All Scenarios: If your data is highly structured and transactional, traditional relational databases might be more suitable.

Database Flat File - Quiz

Take a brief quiz to make sure to understand these flat file and relational database concepts.
Database Flat File - Quiz
In the next lesson, we will use an analogy to drive the relational database point home.

[1]file system database model: A DBMS creates and defines the constraints for a database. A file system allows access to a single file at a time and addresses each file individually. Because of this, functions such as redundancy are performed on an individual level, not by the file system itself.

[2]Btrieve: The original developer, Pervasive Software, integrated Btrieve technology into their flagship product, Pervasive.SQL. This modern database retains some elements of Btrieve but offers additional features and functionality.