Joins and aliases
Last updated on 2023-05-02 | Edit this page
Estimated time: 45 minutes
Overview
Questions
- How do I join two tables if they share a common point of information?
- How can I use aliases to improve my queries?
Objectives
- Understand how to link tables together via joins.
- Understand when it is valuable to use aliases or shorthand.
Joins
The SQL JOIN
clause allows us to combine columns from
one or more tables in a database by using values common to each. It
follows the FROM
clause in a SQL statement. We also need to
tell the computer which columns provide the link between the two tables
using the word ON
.
Let’s start by joining data from the articles
table with
the journals
table. The ISSNs
columns in both
these tables links them.
ON
is similar to WHERE
, it filters things
out according to a test condition. We use the table.colname
format to tell the SQL manager what column in which table we are
referring to.
We can represent a LEFT
join using the following
diagram.
Alternatively, we can use the word USING
, as a
short-hand. In this case we are telling DB Browser that we want to
combine articles
with journals
and that the
common column is ISSNs
.
This figure shows the relations between the tables and helps to
visualise joining or linking the tables in the database: We will cover
relational database design in the
next episode. In addition to visual above, SQL
Join Types Explained Visually provides visual/animated examples
to help convey to learners what is happening in SQL
JOIN
s.
When joining tables, you can specify the columns you want by using
table.colname
instead of selecting all the columns using
*
. For example:
SQL
SELECT articles.ISSNs, journals.Journal_Title, articles.Title, articles.First_Author, articles.Month, articles.Year
FROM articles
JOIN journals
ON articles.ISSNs = journals.ISSNs;
Joins can be combined with sorting, filtering, and aggregation. So, if we wanted the average number of authors for articles on each journal, we can use the following query:
SQL
SELECT articles.ISSNs, journals.Journal_Title, ROUND(AVG(articles.Author_Count), 2)
FROM articles
JOIN journals
ON articles.ISSNs = journals.ISSNs
GROUP BY articles.ISSNs;
The ROUND
function allows us to round the
Author_Count
number returned by the AVG
function by 2 decimal places.
Challenge
Write a query that JOINS
the articles
and
journals
tables and that returns the
Journal_Title
, total number of articles published and
average number of citations for every journal ISSN.
You can also join multiple tables. For example:
SQL
SELECT articles.Title, articles.First_Author, journals.Journal_Title, publishers.Publisher
FROM articles
JOIN journals
ON articles.ISSNs = journals.ISSNs
JOIN publishers
ON publishers.id = journals.PublisherId;
Challenge:
Write a query that returns the Journal_Title
,
Publisher
name, and number of articles published, ordered
by number of articles in descending order.
There are different types of joins which you can learn more about at SQL Joins Explained.
Aliases
As queries get more complex, names can get long and unwieldy. To help make things clearer we can use aliases to assign new names to items in the query.
We can alias both table names:
SQL
SELECT ar.Title, ar.First_Author, jo.Journal_Title
FROM articles AS ar
JOIN journals AS jo
ON ar.ISSNs = jo.ISSNs;
And column names:
SQL
SELECT ar.title AS title, ar.first_author AS author, jo.journal_title AS journal
FROM articles AS ar
JOIN journals AS jo
ON ar.issns = jo.issns;
The AS
isn’t technically required, so you could do:
But using AS
is much clearer so it is good style to
include it.
Key Points
- Joining two tables in SQL is an good way to analyse datasets, especially when both datasets provide partial answers to questions you want to ask.
- Creating aliases allows us to spend less time typing, and more time querying!