| Lesson 9 | Joining Tables from Remote Sites |
| Objective | How do I join tables from remote sites in a SQL query? |
Inter-database communication is the infrastructure that makes distributed Oracle queries possible —
the combination of database links established in Lesson 8 and Oracle Net Services established in
Lessons 1 through 7. Joining tables from remote sites in a single SQL query is the most powerful
and commonly used practical outcome of that infrastructure. A single SELECT statement
can combine local tables with tables on databases in London, Paris, Tokyo, or any OCI region, and
Oracle Net handles the distributed execution transparently — the application sees a single result
set as if all tables were local.
This lesson covers the mechanics of distributed joins in Oracle, the three-site Denver/London/Paris
example that demonstrates the pattern, synonym creation for location transparency, the
DRIVING_SITE optimizer hint for performance control, distributed query best practices,
and how multitenant architecture in Oracle 19c and 23ai affects distributed join behavior.
The distinction between inter-database communication and joining remote tables is important:
CREATE DATABASE LINK combined with Oracle Net Services providing the
network connectivity layer. You can use inter-database communication for simple remote queries
(SELECT * FROM table@remote_link), remote procedure calls, distributed DML, and
distributed transactions without ever writing a multi-site join.
The feature has existed since Oracle 7 with SQL*Net and remains fundamentally unchanged in Oracle
23ai — the same table@database_link syntax, the same optimizer mechanics, and the
same transparency to the application layer. What has changed is performance, security defaults,
and multitenant scoping.
Before joining tables from a remote site, a database link must exist. The general syntax for creating a database link is:
CREATE [SHARED] [PUBLIC] DATABASE LINK link_name
CONNECT TO username IDENTIFIED BY password
[USING 'connect_string'];
In Oracle 23ai, the recommended approach uses CONNECT TO CURRENT_USER to avoid
storing passwords in the data dictionary, combined with an Easy Connect Plus string in the
USING clause:
-- Oracle 23ai recommended: Current User link with Easy Connect Plus
CREATE DATABASE LINK london
CONNECT TO CURRENT_USER
USING 'tcps://london-scan:2484/sales_pdb?ssl_server_dn_match=yes';
CREATE DATABASE LINK paris
CONNECT TO CURRENT_USER
USING 'tcps://paris-scan:2484/orders_pdb?ssl_server_dn_match=yes';
Once the database links exist, any table on the remote database is accessible using the
table_name@link_name syntax. A simple two-table distributed join between a local
table and a remote table:
-- Join local employees with remote departments via database link
SELECT e.employee_id, e.first_name, d.department_name
FROM employees e
JOIN departments@london d ON e.department_id = d.department_id
WHERE e.hire_date > SYSDATE - 365;
The classic Oracle distributed join example connects three tables across three geographically separate databases: a local ORDER table in Denver, a CUSTOMER table in London, and an ORDERLINE table in Paris. A single SQL query retrieves joined data from all three sites:
-- Three-site distributed join: local ORDER, remote CUSTOMER and ORDERLINE
SELECT customer.customer_name,
order_tab.order_date,
orderline.quantity_ordered
FROM customer@london,
order_tab,
orderline@paris
WHERE customer.cust_number = order_tab.customer_number
AND order_tab.order_number = orderline.order_number;
Note that ORDER is a reserved word in Oracle SQL and cannot be used as a table name
without quoting — the example uses order_tab as the local table name. Oracle Net
resolves the @london and @paris references through the database links
defined in Lesson 8, transparently establishing connections to both remote databases and
assembling the distributed result set.
By default, Oracle's query optimizer makes the local Denver database the driving site — it fetches
rows from London and Paris to the local site and performs the join locally. When the remote tables
are significantly larger than the local table, this default behavior results in large data volumes
crossing the network unnecessarily. The DRIVING_SITE hint addresses this.
The @link_name notation exposes the fact that tables are remote to anyone reading
the SQL. Oracle synonyms eliminate this exposure — creating a local name that maps to the remote
table and database link transparently:
-- Create public synonyms to hide the database link names
CREATE PUBLIC SYNONYM customer FOR customer@london;
CREATE PUBLIC SYNONYM orderline FOR orderline@paris;
Once the synonyms exist, the three-site join can be written as if all tables were local to the Denver database — Oracle resolves the synonym to the remote database link transparently:
-- Distributed join with synonyms — appears identical to a local join
SELECT customer.customer_name,
order_tab.order_date,
orderline.quantity_ordered
FROM customer,
order_tab,
orderline
WHERE customer.cust_number = order_tab.customer_number
AND order_tab.order_number = orderline.order_number;
This is location transparency in practice — one of C.J. Date's 12 specifications for an ideal distributed database covered in Lesson 6. The application code is identical whether the tables are local or remote. Migrating a table from a remote database to the local database requires only dropping and recreating the synonym, with no application code changes.
The DRIVING_SITE hint is the most important optimizer hint for distributed query
performance. It specifies which database should act as the driving site — the location where the
join, grouping, and sorting actually execute. The driving site fetches rows from all other sites
and assembles the final result locally before returning it to the originating database.
-- Force execution at the london site — efficient when remote table is largest
SELECT /*+ DRIVING_SITE(d) */
e.employee_id, e.first_name, d.department_name
FROM employees e
JOIN departments@london d ON e.department_id = d.department_id
WHERE e.hire_date > SYSDATE - 365;
DRIVING_SITE(d) forces the join to execute at the London database — Oracle ships the
local employees predicate results to London, joins them against the large
departments table there, and returns only the final result set to Denver. For multi-site
joins, combine DRIVING_SITE with LEADING to control both the driving
site and the join order:
-- Multi-site join: drive at london, start with local customers for early filtering
SELECT /*+ DRIVING_SITE(s) LEADING(c) */
c.customer_name, o.order_date, s.sales_amount
FROM customers c
JOIN orders@paris o ON c.cust_id = o.cust_id
JOIN sales@london s ON o.order_id = s.order_id
WHERE c.region = 'EUROPE';
Always use table aliases in the DRIVING_SITE hint — not the full table name or the
database link name. The hint applies to the alias defined in the query's FROM clause.
Distributed joins introduce network latency as a performance variable that does not exist in local queries. The following practices minimize that impact:
DRIVING_SITE on the site with
the largest table or the best filtering capability. Test execution with driving at both the local
and remote site and compare elapsed time and bytes transferred in V$SQL.SELECT * in distributed
queries. Every column in the select list that comes from a remote table crosses the network.
Specify only the columns required by the application.SET ARRAYSIZE 5000.
In JDBC, set setFetchSize() on the statement. In OCI, configure
prefetch_rows.SQLNET.COMPRESSION = ON in sqlnet.ora reduces bytes
transferred at the cost of CPU on both sides.REMOTE operations identifying which table accesses execute at remote sites. High
values in the bytes sent/received columns of V$SQL indicate excessive
data movement across the network — the primary signal that DRIVING_SITE tuning
is needed.Oracle's multitenant architecture — mandatory from Oracle 21c onward and strongly recommended in 19c and 23ai — introduces container scoping that affects where database links are created and how distributed joins execute.
In the pre-12c non-CDB world, a database link connected one complete database to another. In a multitenant CDB/PDB environment, database links are scoped to the container in which they are created:
The DRIVING_SITE hint and all distributed query syntax work identically in multitenant
environments — no syntax changes are required. The performance characteristics of distributed joins
in 19c and 23ai benefit from improved predicate pushdown in the optimizer, better parallel query
execution across distributed sites, and Local Undo mode which improves distributed transaction
performance in multitenant deployments.
For very large-scale distributed environments where the number of database links becomes unmanageable — many sites, complex routing, high transaction volumes — Oracle Sharding with Raft replication in Oracle 23ai provides a managed alternative to manual database link topologies. Oracle Sharding routes distributed queries through Oracle Global Data Services rather than explicit database links, reducing the operational overhead of managing individual link definitions across many sites.
Inter-database communication — the combination of database links and Oracle Net Services — is the
infrastructure that enables distributed SQL in Oracle. Joining tables from remote sites is the
most powerful practical application of that infrastructure: a single SQL query combining local and
remote tables with Oracle Net handling the distributed execution transparently. The
@link_name syntax identifies remote tables; synonyms provide location transparency
by hiding the link name from application SQL. The DRIVING_SITE hint controls which
database executes the join — critical for performance when remote tables are significantly larger
than local tables. In Oracle 19c and 23ai multitenant environments, database links are scoped to
the PDB in which they are created; the distributed join syntax and optimizer hints are unchanged.
For very large distributed topologies, Oracle Sharding in 23ai provides a managed alternative to
manual database link architectures. The next lesson examines the Transparent Network Substrate
(TNS) — the communication layer that Oracle Net uses to establish all distributed connections.