Return to Root Glossary

Database Analysis - Glossary


ABCDE
FGHIJ
KLMNO
PQRST
UVWXY
Z

DBLC The Database Life Cycle

Data analysis is concerned with the NATURE and USE of data. It involves the identification of the data elements which are needed to support the data processing system of the organization, the placing of these elements into logical groups and the definition of the relationships between the resulting groups. Other approaches, for example D.F.Ds and Flowcharts, have been concerned with the flow of data-dataflow methodologies. Data analysis is one of several data structure based methodologies.

Access permission
A rule associated with a shared resource that determines which users can access the resource, and at what permissions level.
all-key relation
A relation in which every field is a member of the relation's primary key.
Apriori algorithm
The Apriori algorithm is an algorithm that attempts to operate on database records, particularly transactional records, or records including certain numbers of fields or items. It is one of a number of algorithms using a "bottom-up approach" to incrementally contrast complex records, and it is useful in today's complex machine learning and artificial intelligence projects.
Atomic
An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs. A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client.
attribute
A characteristic of an entity; data that identifies or describes an entity. Usually represented as a column in a table, attributes store data values.
base table
A table stored in a database.
BLOB domain type
Short for “binary large object” a domain type used to store binary objects such as graphics in a relational database.
block
The smallest area that can be addressed on a magnetic disk.
Boolean expression
An expression that results in a value of either TRUE or FALSE. Boolean expressions are also called comparison expressions, conditional expressions, and relational expressions.
Business analytics
Business Analytics is the process by which businesses use statistical methods and technologies for analyzing historical data in order to gain new insight and improve strategic decision-making.
Business intelligence
Business intelligence (BI) combines business analytics, data mining, data visualization, data tools and infrastructure, and best practices to help organizations to make more data-driven decisions.
business objects
Items in a business environment that are related, and about which data need to be stored (e.g., customers, products, orders, etc.).
Business rules
A set of rules or conditions describing the business polices that apply to the data stored on a company databases.
candidate key
A column or set of columns that could can used as a primary key for a table.
Cartesian product
A type of table join: the Cartesian product of two tables consists of the combination of every record in one table with every record in another table.
CASE tools
Short for computer-aided software-engineering tools, CASE tools are software packages used to design and implement information systems.
catalog
Also called the data dictionary. The catalog is the foundation of the database and is where the RDBMS finds which tables are in a database, which columns are in the tables, which columns are primary or foreign keys, and what type of data to expect in those columns.
Chen model
One of several types of entity-relationship (ER) diagrams. The Chen model is named after the inventor of ER diagramming.
cluster
A group of table columns or rows often accessed together.
clustering
Clustering brings records from two or more tables together on a hard disk to improve joins between the two tables.
column
Part of the structure of a database table; also known as a field or, during early database design, as an attribute.
composite entity
In relational database design, a composite entity is used to provide an indirect link between two entities in aM:N (many-to-many) relationship. Also called a linking table.
composite key attribute
A key attribute belonging to a composite entity, comprising the key attributes from each of the two entities linked by the composite entity. Also called a composite primary key.
composite primary key
Another name for a composite key attribute.
concatenated primary key
A primary key made up of more than one field.
conceptual model
A description of the structure of a database.
constraints
Rules a database designer imposes upon certain elements in a database to preserve data integrity.
Crowsfoot model
One of several types of entity-relationship (ER) diagrams, the Crowsfoot model neatly packages entities with their attributes by placing them in boxes. Also referred to as the Information Engineering model.
data
The "raw facts" stored in a database.
data definition language (DDL)
A collection of programming statements that describe and define data and data relationships in a database.
data dictionary
Also called a catalog. The data dictionary is the foundation of the database, and it is where the RDBMS finds which tables are in a database, which columns are in the tables, which columns are primary or foreign keys, and what type of data to expect in those columns.
data flow diagram
A diagram illustrating the flow of data in an organization, including data sources, data storage, and data transformation processes.
Data integrity
A term used to describe the quality (in terms of accuracy, consistency, and validity) of data in a database, in the sense that values required to enforce data relationships actually exist. Problems with data integrity occur when a value in one table that’s supposed to relate to a value in another can’t, because the second value either has been deleted or was never entered.
data manipulation language (DML)
A collection of programming statements used to manipulate a database.
data mining
An information extraction activity whose goal is to discover hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results. Typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit risk analysis.
data redundancy
Duplication of data in a database.
data type
The category of data stored in a database. Each field in a database must have a data type assigned to it, such as Integer, Text, or Date/Time.
data value
Data entered at the intersection of a row (record) and column (field); the data describes or identifies the subject of the record.
database model
A formal structure for organizing and linking data in a database.
denormalization
To move a table to a lower normal form (e.g., from 3NF to 2NF).
descriptor column
A descriptor column contains a non-key attribute of an entity.
determinant
An attribute upon which another attribute is functionally dependent.
distribution independence
A characteristic of certain Relational Database Managment Systems(RDBMS) in which storing parts of a database on more than one computer does not affect the RDBMS's ability to work with the database.
Document-Oriented
In a traditional relational database, the user begins by specifying a series of column types and names for a table. Information is then added as rows of values, with each of those named columns as a cell of each row. You cannot have additional values that were not specified when you created the table, and every value must be present, even if it is as a NULL value. Document stores let you enter each record as a series of names with associated values, which you can picture being like a JavaScript object, a Python dictionary, or a Ruby hash. You do not specify ahead of time what names will be in each table using a schema. In theory, each record could contain a completely different set of named values, though in practice, the application layer often relies on an informal schema, with the client code expecting certain named values to be present. The key advantage of this document-oriented approach is its flexibility. You can add or remove the equivalent of columns with no penalty, as long as the application layer does not rely on the values that were removed. A good analogy is the difference between languages where you declare the types of variables ahead of time, and those where the type is inferred by the compiler or interpreter. You lose information that can be used to automatically check correctness and optimize for performance, but it becomes a lot easier to prototype and experiment.
domain
Determines the type of data values that are permitted for that attribute.
domain constraints
Rules that require values of attributes to come from specific domains (e.g., text, numbers, date, etc.).
Entity
A single stand-alone unit or a business object about which data are stored in a database; usually synonymous with a database table.
entity identifier
A key attribute that uniquely identifies each entity.
entity relationship
Entities that participate in a relationship are associated; the three types of entity relationships (associations) are one-to-one, one-to-many, and many-to-many.
entity-relationship (ER) diagram
A diagram used during the design phase of database development to illustrate the organization of and relationships between data during database design.
equi-join
Also referred to as a natural join, an equi-join is a type of table join. The common field contains equivalent values.
ER diagram
(Entered in the glossary db under entity-relationship diagram.) A diagram used during the design phase of database development to illustrate the organization of and relationships between data during database design.
field
The smallest structure in a table; contains data describing the subject of a table.
Field
Short for field variable.
first normal form (1NF)
A table is in first normal form if it contains no repeating groups.
foreign key
A field (or combination of fields) used to link tables; a corresponding primary key field occurs in the same database.
Fourth Normal Form (4NF)
A relation is in 4NF if it all of its determinants are candidate keys and it contains no multi-valued dependencies.
Fully functional dependency
In (2NF), all non-key values must be fully functionally dependent on the primary key and no partial dependencies are allowed. A partial dependency exists when a field is fully dependent on a part of a composite primary key.
functional dependency
A one-way relationship between two attributes so that, for any unique value of one attribute, there is only one value of the other attribute.
hierarchical model
A database model that organizes data in a top-down (inverted tree) structure.
horizontal partitioning
Horizontal partitioning divides a table's records into two or more tables
IDEF1X model
One of several types of entity-relationship (ER) diagrams. Information Engineering (Crowsfoot) model - One of several types of entity-relationship (ER) diagrams; usually referred to as the "Crowsfoot" model.
implementation
Using SQL to create a database; the fourth step in the database life cycle.
index
An index is an ordered list of values in a field, exactly like the index in the back of a book.
indexing
Creating a list of values in a column to speed searches, joins, and queries.
information
A collection of "raw facts" (data) organized in such a way that they have value beyond the value of the facts themselves.
information system
Interrelated components (e.g., people, hardware, software, databases, telecommunications, policies, and procedures) that input, process, output, and store data to provide an organization with useful information.
Inner join
A link between tables in a database that displays only the rows with a match (true value of the join condition) for both join tables. An inner join does not guarantee the return of every row of data that you expect will be returned.
instance of entity.
A group of attributes that identify and describe one occurrence of an entity; the equivalent of a record in a table.
integer division
A mathematical operation that returns the integer portion of a division and discards the remainder.
international issues
Database design considerations based on other countries' differing data standards, alphabets, and writing systems.
join
An operation that links table records based on data in common fields.
key attribute
An attribute that identifies an entity; an entity identifier.
key column
A key column contains information that distinguishes that person, place, or thing from every other item represented in the table.
legacy database
Any type of database that has been in use by an organization for several years.
linking table
Another name for a composite entity. Used to provide an indirect link between two entities in a M:N (many-to-many) relationship.
logical design
The second stage in the DBLC: creating a logical schema, followed by data normalization.
logical schema
The overall logical plan of a database; typically a completed ER diagram.
Mail Exchange Record
DNS database record type that indicates which host can transfer messages for a particular domain.
mandatory entity
In the language of relational database design, mandatory entity is another name for the entity that has mandatory participation in an entity relationship.
mandatory participation
In the language of database design, participation describes the nature of the relationship between two tables. Mandatory participation describes a relationship in which least one record in a table must exist before any records can be added toits associated table.
many-to-many relationship
In relational database design, a many-to-many (M:N) relationship exists when, for one instance of entity A, there exists zero, one, or many instances of entity B; and for one instance of entity B, there exists zero, one, or many instances of entity A.
multi-field determinant
A determinant consisting of more than one field.
multi-valued attribute
An attribute with more than one data value; must be resolved into a single-valued attribute.
multi-valued dependency
A condition in which a primary-key value in one table determines multiple values in two or more independent tables.
natural join
Also referred to as a equi-join, an natural join is a type of table join. The common field contains equivalent values.
network model
An extension of the hierarchical model. (See hierarchical model.)
non-key attribute
Non-key attributes are the attributes or fields of a table, other than candidate key fields in a table.
normal form
A set of theoretical rules for the design of database tables.
normal forms
(Entered in the db under normal form.) A set of theoretical rules for the design of database tables.
normalization
The process of applying increasingly stringent rules to a relational database to correct any problems associated with poor design.
normalize
To break up large tables into smaller, more efficient tables without losing any information.
null value
An unknown value in a column or field.
nulls
Unknown values in a column or field
Object-oriented design:
A design that allows you to not only define what data you can store in your database, you also define a set of actions (methods) associated with the data.
object-oriented model
A database model that stores data as objects.
one-to-many relationship
In relational database design, a one-to-many (1:N) relationship exists when, for one instance of entity A, there exists zero, one, or many instances of entity B; but for one instance of entity B, there exists zero or one instance of entity A.
one-to-one relationship
In relational database design, a one-to-one (1:1) relationship exists when zero or one instance of entity A can be associated with zero or one instance of entity B, and zero or one instance of entity B can be associated with zero or one instance of entity A.
optional participation
In the language of database design, participation refers to the nature of the relationship between tables. Optional participation describes a relationship in which no records need to exist in a table before records are added to its associated table.
outer join
A link between tables in a database. When two tables are combined by an outer join, any records from the base table with no matching records from the other table are included in the results and any columns where no values are available are filled with nulls .
paper-based database
Typically a filing system where data is stored on a variety of paper forms.
partitioning
The process of breaking tables into parts to improve retrieval performance.
permissions
Permissions are database tasks that a user or group of users is allowed to carry out. There are several types of permissions the administrator can grant:..· Select: The user can retrieve data from a table or view..· Insert: The user can create new records in a table or view. The user may also be limited to inserting values into particular fields..· Update: The user can modify existing values in a table or view, again for an entire table or specific columns..· Delete: The user can remove existing records from a table or view..· References: The user can use a field in an existing table or view a field as a foreign key in a table that the user creates. This permission may be limited to particular fields..· All: The user has all of the permissions listed above.

physical design
Partitioning a table, in database lexicon, divides a single table into two or more tables to limit the amount of data the RDBMS has to retrieve at once. There are two types of partitioning operations: .· Horizontal partitioning, which splits a table's records into two or more tables .· Vertical partitioning, which splits a table's columns into two or more tables.
physical schema
The underlying physical storage of data in a database, managed by the RDBMS.
predictive analytics
Predictive analytics is the use of data, statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data.
primary key
A field (or combination of fields) that uniquely identifies a record in a table.
query
(v.) To extract data from a database; (n.) a set of SQL statements for extracting particular data from a database.
R Language
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (now Lucent Technologies) by John Chambers and colleagues.
R-Tree
R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both theoretical and applied contexts. A common real-world usage for an R-tree might be to store spatial objects such as restaurant locations or the polygons that typical maps are made of: streets, buildings, outlines of lakes, coastlines, etc. and then find answers quickly to queries such as "Find all museums within 2 km of my current location", "retrieve all road segments within 2 km of my location" (to display them in a navigation system) or "find the nearest gas station" (although not taking roads into account). The R-tree can also accelerate nearest neighbor search[3] for various distance metrics, including great-circle distance.
record
A particular instance of the subject of a table.
redundant data
Duplicate data in a database; although redundant data can never be totally removed, a major goal of database design is to eliminate as much redundant data as possible.
Referential integrity
The means of maintaining the integrity of data between one or more tables that relate to each other, in other words, that a column of data in a table has a null or matching value in a corresponding table. Referential integrity is usually enforced with foreign keys.
relation
Another word for table. Relations represent entities in database design.
relational database
A database built on the relational model, which organizes data into tables comprising columns and rows.
Relational Database Management System (RDBMS)
A software package that manages and provides access to a database. These packages follow Codd’s 12 rules of relational databases and normally use SQL to access data.
relational model
A formal structure that organizes data into relations (i.e., tables).
relational notation
Relations are expressed in a shorthand called relational notation, a textual interpretation of the ER diagram.
relationship
If the same attribute occurs in more than one table, a relationship exists between those two tables.
repeating groups
An attribute that has more than one value in a table row.
requirements analysis
The stage in the database design cycle when designers find out everything they can about the data the client needs to store in the database and the conditions under which that data needs to be accessed.
same-table join
A type of inner join, so named because it creates two copies of a table and then joins records from the tables where the tables have equivalent values in designated fields.
schema
The overall logical plan of a database.
second normal form (2NF)
A table is in second normal form when every attribute is functionally dependent on the entire primary key.
set theory
A branch of mathematics that studies how groups of objects interact.
single-valued attribute
An attribute with more than one data value; must be resolved into a single-valued attribute.
SQL
SQL is an acronym for Structured Query Language. It provides a set of commands that can be used to add data to a database, retrieve that data, and update it. SQL, often pronounced “sequel”, is universally supported by relationaldatabase vendors.
Structured Query Language (SQL)
The standard language used by all relational databases, including Oracle 13c, DB2 Version 11, Microsoft SQL Server 2016.
system administrator
The person responsible for administering a multi-user computer system; duties range from setting up and configuring system components (e.g., an RDBMS) to performing maintenance procedures (e.g., database backups) on the system.
Table
A collection of data arranged in rows and columns. A table is the largest structure in a relational database.
third normal form (3NF)
A table is in third normal form if it contains no transitive dependencies.
three-valued logic
In logic, a three-valued logic (also trinary logic, trivalent, ternary, or trilean, sometimes abbreviated 3VL) is any of several many-valued logic systems in which there are three truth values indicating true, false and some indeterminate third value. This is contrasted with the more commonly known bivalent logics (such as classical sentential or Boolean logic) which provide only for true and false. Conceptual form and basic ideas were initially created by Jan Lukasiewicz and C. I. Lewis. These were then re-formulated by Grigore Moisil in an axiomatic algebraic form, and also extended to n-valued logics in 1945.
transitive dependency
A term derived from the mathematical equation for a transitive property: If a>b and b>c, then a>c. In terms of a database, a transitive dependency refers to a relationship between two entities that is dependent upon a second relationship between one of the first two entities and a third entity.
transitivity
Amathematical property which states that if A=B and B=C, then A=C.
tuple
In the lexicon of relational database design another word for rows or records.
user views
Saved queries created with SQL. User views specify which users are permitted access to what data in a database.
Vertical partitioning
To divide a table into multiple tables, placing different columns in each new table.
View
A database object that enables you to define the columns and rows that a specific user can see. A view can also serve as a tool for enforcing security within your database.
virtual table
A table stored in the computer’s memory. Virtual tables themselves are not stored in the database; rather, the definition of the view is stored and given a name. Users call up that name, and the view is created (from base tables) on the fly. When a user closes the view, it "disappears" from memory, only to be recreated the next time its name is invoked.
weak entity
In the language of relational database design, weak entity is another name for the entity that has optionalparticipation in an entity relationship.
Yourdon/DeMarco
A popular style of data-flow diagram. It includes symbols that illustrate who handles the data in a database, where the data moves, where the data are stored, and what is done to the data.


Ad Database Analysis for Design