Database Analysis

Lesson 1

Relational Database Analysis

Welcome to Relational Database Design Fundamentals: Data Analysis. This course introduces you to techniques of data analysis involved in designing databases and how to apply them effectively.
This is the second in a two-part series devoted to the fundamentals of database design. This course will enable you to design an efficient database by creating relational constructs, normalizing tables, creating joins and views, optimizing the database's physical design, and avoiding common database design mistakes.
Taken in conjunction with Relational Database Design in this series, this course will prepare you to make solid contributions as a member of a database design team. Along the way, you will work on a course project that will give you a chance to put your new skills to use in the context of actual business scenarios.

Course goals

After completing the course, you will be able to:
  1. Create relational constructs
  2. Normalize tables to first, second, and third normal forms
  3. Create joins and views
  4. Optimize a database's physical design
  5. Identify and avoid common design mistakes

In this course, you will learn and practice database design skills with the aid of image galleries and diagrams.
You can measure your progress by completing the quizzes and exercises included throughout the course.
You will also create a course project that involves designing and optimizing a database to track orders that Stories on CD, a fictional company that sells books on CD, places with its distributors. You will complete the project incrementally via a series of exercises that provide opportunities to apply what you have learned to a real-world situation.


numpy.corrcoef

numpy.corrcoef(x, y=None, rowvar=True, bias=<no value>, ddof=<no value>)
Return Pearson product-moment correlation coefficients.
The correlation coefficient matrix, R, is obtained from the covariance matrix, C, by dividing each covariance by the standard deviations of the two variables involved.
Question: What is the relationship between the correlation coefficient matrix R, and the covariance matrix C?
The Pearson product-moment correlation coefficient measures the strength and direction of the linear relationship between two variables. It ranges from -1 to +1: -1 indicates a perfect negative linear correlation, +1 a perfect positive linear correlation, and 0 no linear correlation. The covariance matrix C is a square matrix whose element C_ij is the covariance between the i-th and j-th random variables in a dataset; its diagonal element C_ii is the variance of the i-th variable. For n random variables, the covariance matrix is n × n.
The correlation coefficient matrix R is a square matrix with the same dimensions as the covariance matrix C.
Each element R_ij of R is the Pearson correlation coefficient between the i-th and j-th variables.
The relationship between C and R is as follows:

\[ R_{ij} = \frac{C_{ij}}{\sqrt{C_{ii} \times C_{jj}}} \]

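The relationship between R and C can be checked numerically. The following is a minimal sketch, assuming synthetic random data and illustrative variable names (`data`, `R_manual`, `R_numpy`) that are not part of the original text. It builds R from the covariance matrix by dividing each C_ij by the square root of C_ii × C_jj and compares the result with `numpy.corrcoef`.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 3 variables in rows,
# 200 observations in columns, which matches the default rowvar=True.
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 200))
data[1] += 0.8 * data[0]  # make variables 0 and 1 positively correlated

# Covariance matrix C (3 x 3).
C = np.cov(data)

# Standard deviations are the square roots of the diagonal entries C_ii.
std = np.sqrt(np.diag(C))

# R_ij = C_ij / sqrt(C_ii * C_jj), computed for all i, j at once.
R_manual = C / np.outer(std, std)

# NumPy's own result for comparison.
R_numpy = np.corrcoef(data)

print(np.allclose(R_manual, R_numpy))
```

The outer product of the standard deviations produces the matrix of denominators in one step, so no explicit loop over i and j is needed.
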
Here is a summary:
  1. C tells you about both the direction (positive or negative) and the scale of the relationship between variables.
  2. R tells you about the direction (positive or negative) and the strength (0 to 1 in absolute value) of the linear relationship between variables, but not the scale.
  3. R is dimensionless, whereas each element of C has units that are the product of the units of the two variables.
By using R and C together, you can glean different aspects of the relationships between variables in your dataset.
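
The difference in scale between the two matrices can be illustrated with a change of units. This is a small sketch under stated assumptions: the height and weight values are synthetic, and the names `height_m`, `height_cm`, and `weight_kg` are hypothetical. Converting one variable from metres to centimetres multiplies its covariance with the other variable by 100, while the correlation coefficient stays the same.

```python
import numpy as np

# Synthetic, made-up measurements used only for illustration.
rng = np.random.default_rng(1)
height_m = rng.normal(1.7, 0.1, size=500)
weight_kg = 60 + 40 * (height_m - 1.7) + rng.normal(0, 5, size=500)

# The same heights expressed in centimetres.
height_cm = 100 * height_m

# Off-diagonal entry [0, 1] is the covariance (or correlation)
# between the two variables passed in.
cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
r_m = np.corrcoef(height_m, weight_kg)[0, 1]
r_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(np.isclose(cov_cm, 100 * cov_m))  # covariance scales with the units
print(np.isclose(r_cm, r_m))            # correlation does not
```
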

The series

Relational Data Analysis is the second of two courses in the Relational Database Design Fundamentals series. The next lesson discusses the prerequisites for the course.