Stream JOINs
In the last article, we pointed out the great resemblance between Streams and SQL constructs. Although, the SQL operation JOIN lacks a natural mapping in the general case. Therefore, Speedment leverages its ownJoinComponent
to join up to 10 tables (using INNER JOIN, RIGHT JOIN, LEFT JOIN or CROSS JOIN) in a type-safe way. Before we introduce the JoinComponent
in more depth, we will elaborate on the similarities between individual tables and joins.
We previously used a Speedment Manager as a handle to a database table. This process is visualized below:
A Manager acts as a handle to a database table and can act as a stream source. In this case, every row corresponds to an instance of Film.
Now that we wish to retrieve data from multiple tables, the Manager
on its own is not sufficient. An SQL JOIN-query outputs a virtual table that combines data from multiple tables in different ways (e.g. depending on the join-type and WHERE-clauses). In Speedment, that virtual table is represented as a Join<T>
object holding tuples of type T
.
Join Component
To retrieve a Join-object we need the previously mentionedJoinComponent
which uses a builder pattern. The resulting Join-objects are reusable and acts as handles to "virtual join tables", as described by this image:
The JoinComponent creates a Join-object which acts as a handle to a virtual table (the result of the join) and can act as a stream source. In this case, every row corresponds to an instance of Tuple2Many-to-one
JoinCompontent
: Join<Tuple2<Film, Language>> join = joinComponent .from(FilmManager.IDENTIFIER) .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID) .build(Tuples::of);Basically, we start with the Film table and perform an INNER JOIN with the Language table on rows that have matching language_id:s. We can then use the Join-object to stream over the resulting Tuples and print them all out for display. As always with Streams, no specific order of the elements is guaranteed even if the same join-element is reused.
join.stream() .forEach(System.out::println); Tuple2Impl {FilmImpl { filmId = 1, title = ACADEMY DINOSAUR, ... }, LanguageImpl { languageId = 1, name = English, ... }} Tuple2Impl {FilmImpl { filmId = 2, title = ACE GOLDFINGER, ... }, LanguageImpl { languageId = 1, name = English, ... }} Tuple2Impl {FilmImpl { filmId = 3, title = ADAPTATION HOLES, ... }, LanguageImpl { languageId = 1, name = English, ... }} …
Many-to-Many
A Many-to-Many relationship is defined as a relationship between two tables where many multiple rows from a first table can match multiple rows in a second table. Often a third table is used to form these relations. For example, an actor may participate in several films and a film usually have several actors. The relation between films and actors in Sakila is described by theFilmActor
table which references films and actors using foreign keys. Hence, if we would like to relate each Film
entry to the actors who starred in that movie we need to join all three tables:
Join<Tuple3<FilmActor, Film, Actor>> join = joinComponent .from(FilmActorManager.IDENTIFIER) .innerJoinOn(Film.FILM_ID).equal(FilmActor.FILM_ID) .innerJoinOn(Actor.ACTOR_ID).equal(FilmActor.ACTOR_ID) .build(Tuples::of);We start with the table describing the relation between the Film and Actor and perform and INNER JOIN with both Film and Actor on matching FILM_ID:s and ACTOR_ID:s respectively.
Collect Join Stream to Map
Our Join-object can now be used to create a Map that correlates aFilm
with a List
of the starring Actor
:s. Since the elements of our stream are Tuples we need to point to the desired entries. This is done using zero-indexed getters (get0()
referencing FilmActor
and so on).
Map<Film, List<Actor>> actorsInFilms = join.stream() .collect( groupingBy(Tuple3::get1, mapping(Tuple3::get2, toList()) ) );Lastly we print the entries to display the name of the films and actors.
actorsInFilms.forEach((f, al) -> System.out.format("%s : %s%n", f.getTitle(), al.stream() .sorted(Actor.LAST_NAME) .map(a -> a.getFirstName() + " " + a.getLastName()) .collect(joining(", ") ) ) );
WONDERLAND CHRISTMAS : HARRISON BALE, CHRIS BRIDGES, HUMPHREY GARLAND, WOODY JOLIE, CUBA OLIVIER BUBBLE GROSSE : VIVIEN BASINGER, ROCK DUKAKIS, MENA HOPPER OPUS ICE : DARYL CRAWFORD, JULIA FAWCETT, HUMPHREY GARLAND, SEAN WILLIAMS …
Filtering Tables
If we know initially that we are only interested in a subset of theFilm
entries, it is more efficient to get rid of these instances as we define the Join
-object. This is done using the .where()-operator which is the equivalent to a filter()
on a stream (and maps to the SQL keyword WHERE). As a filter it takes a Predicate
that evaluates to true or false and should be expressed using Speedment Fields
for optimization. Here we want to find the language of the films with titles beginning with an “A”:
Join<Tuple2<Film, Language>> join = joinComponent .from(FilmManager.IDENTIFIER) .where(Film.TITLE.startsWith(“A”)) .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID) .build(Tuples::of);If further filtering is needed, it is possible to stack any number of .where()-operations as they are combined with the SQL keyword AND under the hood.
Specialized Constructors
Sofar we have had to deal with the fairly abstract getters of the tuples (get0, get1 and so on). Although, upon building our Join-object we can provide any constructor to specialized objects. In the examples shown above, we have been interested in the title of the films and the name of the actors. That allows us to define our own objectTitleActorName
as such:
final class TitleActorName { private final String title; private final String actorName; TitleActorName(Film film, Actor actor) { this.title = film.getTitle(); this.actorName = actor.getFirstName() + actor.getLastName(); } public String title() { return title; } public String actorName() { return actorName; } @Override public String toString() { return "TitleLanguageName{" + "title=" + title + ", actorName=" + actorName + '}'; } }We then provide the constructor of our own object to the Join builder and discard the linking
FilmActor
instance since it’s not used:
Join<TitleActorName> join = joinComponent .from(FilmActorManager.IDENTIFIER) .innerJoinOn(Film.FILM_ID).equal(FilmActor.FILM_ID) .innerJoinOn(Actor.ACTOR_ID).equal(FilmActor.ACTOR_ID) .build((fa, f, a) -> new TitleActorName(f, a));This greatly improves the readability of any operations involving the resulting Join-object.
Map<String, List<String>> actorsInFilms = join.stream() .collect( groupingBy(TitleActorName::title, mapping(TitleActorName::actorName, toList()) ) ); actorsInFilms.forEach((f, al) -> System.out.format("%s : %s%n", f, al) );
Simplifying Types
When a large number of tables are joined, the Java type can be tedious to write (e.g.Tuple5<...>
). If you use a more recent version of Java, you can simply omit the type for the local variable like this:
var join = joinComponent .from(FilmManager.IDENTIFIER) .where(Film.TITLE.startsWith(“A”)) .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID) .build(Tuples::of);In this case, Java will automatically infer the type to
Join<Tuple2<Film, Language>>
If you are using an older Java version, you can inline the join-declaration and the stream operator like this:
joinComponent .from(FilmManager.IDENTIFIER) .where(Film.TITLE.startsWith(“A”)) .innerJoinOn(Language.LANGUAGE_ID).equal(Film.LANGUAGE_ID) .build(Tuples::of) .stream() .forEach(System.out::println);
Exercises
This week’s exercises will require combined knowledge from all the previous units and therefore acts as a great follow-up on the previous modules. There is still a connection to an instance of the Sakila database in the cloud so no setup of Speedment is needed. As usual, the exercises can be located in this GitHub repo. The content of this article is sufficient to solve the fifth unit which is calledMyUnit5Extra
. The corresponding Unit5Extra
interface contains JavaDocs which describe the intended implementation of the methods in MyUnit5Extra
.
public interface Unit5Extra { /** * Creates and returns a new Map with Actors as keys and * a List of Films in which they appear as values. * <p> * The result might look like this: * * ActorImpl { actorId = 126, firstName = FRANCES, lastName = TOMEI, ... }=[FilmImpl { filmId = 21, title = AMERICAN CIRCUS, ...}, ...] * … * * @param joinComponent for data input * @return a new Map with Actors as keys and * a List of Films in which they appear as values */ Map<Actor, List<Film>> filmographies(JoinComponent joinComponent);The provided tests (e.g.
Unit5ExtraTest
) will act as an automatic grading tool, letting you know if your solution was correct or not.
Next Article
By now we hopefully managed to demonstrate how neat the Stream API is for database queries. The next article will move beyond the realm of movie rentals and allow you to write standalone database applications in pure Java for any data source. Happy coding!Authors
Per Minborg Julia GustafssonResources
GitHub Opensource Project SpeedmentSpeedment Stream ORM Initializer
GitHub Repository "hol-streams"
Article Part 1: Creating Streams
Article Part 2: Intermediate Operations
Article Part 3: Terminal Operations