Network Topology   «Prev  Next»

Lesson 9 Joining Tables from Remote Sites
Objective How do I join tables from remote sites in a SQL query?

Inter-Database Communication — Joining Tables from Remote Sites

Inter-database communication is the infrastructure that makes distributed Oracle queries possible — the combination of database links established in Lesson 8 and Oracle Net Services established in Lessons 1 through 7. Joining tables from remote sites in a single SQL query is the most powerful and commonly used practical outcome of that infrastructure. A single SELECT statement can combine local tables with tables on databases in London, Paris, Tokyo, or any OCI region, and Oracle Net handles the distributed execution transparently — the application sees a single result set as if all tables were local.

This lesson covers the mechanics of distributed joins in Oracle, the three-site Denver/London/Paris example that demonstrates the pattern, synonym creation for location transparency, the DRIVING_SITE optimizer hint for performance control, distributed query best practices, and how multitenant architecture in Oracle 19c and 23ai affects distributed join behavior.

The Relationship Between Inter-Database Communication and Distributed Joins

The distinction between inter-database communication and joining remote tables is important:

The feature has existed since Oracle 7 with SQL*Net and remains fundamentally unchanged in Oracle 23ai — the same table@database_link syntax, the same optimizer mechanics, and the same transparency to the application layer. What has changed is performance, security defaults, and multitenant scoping.

Creating a Database Link for Distributed Joins

Before joining tables from a remote site, a database link must exist. The general syntax for creating a database link is:

CREATE [SHARED] [PUBLIC] DATABASE LINK link_name
  CONNECT TO username IDENTIFIED BY password
  [USING 'connect_string'];

In Oracle 23ai, the recommended approach uses CONNECT TO CURRENT_USER to avoid storing passwords in the data dictionary, combined with an Easy Connect Plus string in the USING clause:

-- Oracle 23ai recommended: Current User link with Easy Connect Plus
CREATE DATABASE LINK london
  CONNECT TO CURRENT_USER
  USING 'tcps://london-scan:2484/sales_pdb?ssl_server_dn_match=yes';

CREATE DATABASE LINK paris
  CONNECT TO CURRENT_USER
  USING 'tcps://paris-scan:2484/orders_pdb?ssl_server_dn_match=yes';

Once the database links exist, any table on the remote database is accessible using the table_name@link_name syntax. A simple two-table distributed join between a local table and a remote table:

-- Join local employees with remote departments via database link
SELECT e.employee_id, e.first_name, d.department_name
FROM   employees e
JOIN   departments@london d ON e.department_id = d.department_id
WHERE  e.hire_date > SYSDATE - 365;

Three-Site Distributed Join — Denver, London, Paris

The classic Oracle distributed join example connects three tables across three geographically separate databases: a local ORDER table in Denver, a CUSTOMER table in London, and an ORDERLINE table in Paris. A single SQL query retrieves joined data from all three sites:

-- Three-site distributed join: local ORDER, remote CUSTOMER and ORDERLINE
SELECT customer.customer_name,
       order_tab.order_date,
       orderline.quantity_ordered
FROM   customer@london,
       order_tab,
       orderline@paris
WHERE  customer.cust_number  = order_tab.customer_number
AND    order_tab.order_number = orderline.order_number;

Note that ORDER is a reserved word in Oracle SQL and cannot be used as a table name without quoting — the example uses order_tab as the local table name. Oracle Net resolves the @london and @paris references through the database links defined in Lesson 8, transparently establishing connections to both remote databases and assembling the distributed result set.

By default, Oracle's query optimizer makes the local Denver database the driving site — it fetches rows from London and Paris to the local site and performs the join locally. When the remote tables are significantly larger than the local table, this default behavior results in large data volumes crossing the network unnecessarily. The DRIVING_SITE hint addresses this.

Location Transparency — Synonyms for Remote Tables

The @link_name notation exposes the fact that tables are remote to anyone reading the SQL. Oracle synonyms eliminate this exposure — creating a local name that maps to the remote table and database link transparently:

-- Create public synonyms to hide the database link names
CREATE PUBLIC SYNONYM customer  FOR customer@london;
CREATE PUBLIC SYNONYM orderline FOR orderline@paris;

Once the synonyms exist, the three-site join can be written as if all tables were local to the Denver database — Oracle resolves the synonym to the remote database link transparently:

-- Distributed join with synonyms — appears identical to a local join
SELECT customer.customer_name,
       order_tab.order_date,
       orderline.quantity_ordered
FROM   customer,
       order_tab,
       orderline
WHERE  customer.cust_number  = order_tab.customer_number
AND    order_tab.order_number = orderline.order_number;

This is location transparency in practice — one of C.J. Date's 12 specifications for an ideal distributed database covered in Lesson 6. The application code is identical whether the tables are local or remote. Migrating a table from a remote database to the local database requires only dropping and recreating the synonym, with no application code changes.

DRIVING_SITE Hint — Controlling Distributed Query Execution

The DRIVING_SITE hint is the most important optimizer hint for distributed query performance. It specifies which database should act as the driving site — the location where the join, grouping, and sorting actually execute. The driving site fetches rows from all other sites and assembles the final result locally before returning it to the originating database.

-- Force execution at the london site — efficient when remote table is largest
SELECT /*+ DRIVING_SITE(d) */
       e.employee_id, e.first_name, d.department_name
FROM   employees e
JOIN   departments@london d ON e.department_id = d.department_id
WHERE  e.hire_date > SYSDATE - 365;

DRIVING_SITE(d) forces the join to execute at the London database — Oracle ships the local employees predicate results to London, joins them against the large departments table there, and returns only the final result set to Denver. For multi-site joins, combine DRIVING_SITE with LEADING to control both the driving site and the join order:

-- Multi-site join: drive at london, start with local customers for early filtering
SELECT /*+ DRIVING_SITE(s) LEADING(c) */
       c.customer_name, o.order_date, s.sales_amount
FROM   customers c
JOIN   orders@paris  o ON c.cust_id  = o.cust_id
JOIN   sales@london  s ON o.order_id = s.order_id
WHERE  c.region = 'EUROPE';

Always use table aliases in the DRIVING_SITE hint — not the full table name or the database link name. The hint applies to the alias defined in the query's FROM clause.

Distributed Query Performance Best Practices

Distributed joins introduce network latency as a performance variable that does not exist in local queries. The following practices minimize that impact:

Distributed Joins in Oracle 19c and 23ai — Multitenant Considerations

Oracle's multitenant architecture — mandatory from Oracle 21c onward and strongly recommended in 19c and 23ai — introduces container scoping that affects where database links are created and how distributed joins execute.

In the pre-12c non-CDB world, a database link connected one complete database to another. In a multitenant CDB/PDB environment, database links are scoped to the container in which they are created:

The DRIVING_SITE hint and all distributed query syntax work identically in multitenant environments — no syntax changes are required. The performance characteristics of distributed joins in 19c and 23ai benefit from improved predicate pushdown in the optimizer, better parallel query execution across distributed sites, and Local Undo mode which improves distributed transaction performance in multitenant deployments.

For very large-scale distributed environments where the number of database links becomes unmanageable — many sites, complex routing, high transaction volumes — Oracle Sharding with Raft replication in Oracle 23ai provides a managed alternative to manual database link topologies. Oracle Sharding routes distributed queries through Oracle Global Data Services rather than explicit database links, reducing the operational overhead of managing individual link definitions across many sites.

Summary

Inter-database communication — the combination of database links and Oracle Net Services — is the infrastructure that enables distributed SQL in Oracle. Joining tables from remote sites is the most powerful practical application of that infrastructure: a single SQL query combining local and remote tables with Oracle Net handling the distributed execution transparently. The @link_name syntax identifies remote tables; synonyms provide location transparency by hiding the link name from application SQL. The DRIVING_SITE hint controls which database executes the join — critical for performance when remote tables are significantly larger than local tables. In Oracle 19c and 23ai multitenant environments, database links are scoped to the PDB in which they are created; the distributed join syntax and optimizer hints are unchanged. For very large distributed topologies, Oracle Sharding in 23ai provides a managed alternative to manual database link architectures. The next lesson examines the Transparent Network Substrate (TNS) — the communication layer that Oracle Net uses to establish all distributed connections.

Overview Topology — Exercise

Click the Exercise link below to practice defining database links and writing distributed queries.
Overview Topology — Exercise

SEMrush Software 9 SEMrush Banner 9