In this tutorial, we will set up a Spring Boot application to use Hibernate Search with a Lucene indexing backend. This tutorial assumes that you already have a working Spring Boot application with JPA/Hibernate configured. Please check out our tutorial “Getting started with Spring boot 2 and JPA” if you are new to Spring Boot and Hibernate.

If you would like a solid foundation in JPA and Hibernate, then I strongly recommend the book “High-Performance Java Persistence” by Vlad Mihalcea. The book focuses on JPA and on working with relational databases in order to cleanly develop applications that use a persistence backend. The author is one of the developers of the Hibernate framework, which gives the book high authority on the topic.

I would also recommend that you read our article “Configuring and mapping database tables to JPA entities” if you are not familiar with mapping database tables to JPA entities.

Introduction

Hibernate Search is an open-source library that integrates easily with existing Hibernate ORM/JPA systems. When Hibernate Search is added to an application, it performs two functions.

First, it provides an indexing API that you use to describe what should be indexed. For example, you may decide to index the bank account numbers in your banking application, because they are searched for often. To do this, you use annotations from Hibernate Search to mark that field for indexing.

The second function that Hibernate Search performs is to integrate your application with data indexing, search and analysis systems, namely Apache Lucene and Elasticsearch. Hibernate Search pushes the indexed data to these systems. In other words, only the data that you marked for inclusion in the index will be available on them.

Apache Lucene is an open-source indexing and text search library. Lucene performs these tasks very efficiently, which has made it not just popular, but also the basic building block of numerous other systems, such as Elasticsearch, Apache Solr and many more.

In this tutorial, we will focus on configuring Hibernate Search with Lucene as the search technology in a Spring Boot application.

Adding the required dependencies

In order to get started with the configuration, we need to add the Hibernate Search dependency to our application. We will add the following dependency to our pom.xml file.
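As a minimal sketch, the dependency entry might look as follows. The artifact name and version are the ones named in this article; the groupId `org.hibernate` is not stated in the text and is added here as the group under which the 5.x releases are published.

```xml
<!-- Hibernate Search ORM integration; the version is the one mentioned in this article -->
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search-orm</artifactId>
    <version>5.11.1.Final</version>
</dependency>
```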

At the time of writing, the most recent version of the hibernate-search-orm dependency is 5.11.1.Final. Please check the Maven repository to find out the latest version to add to your application.

Configuring the index location

Lucene provides indexing capabilities to other systems that use its API, in this case, Hibernate Search. When indexing data, the resulting indices can be stored locally on the filesystem, on a remote system such as Elasticsearch, or in the cloud.

In this tutorial, we will store the indices on the local filesystem for the sake of simplicity. Let us edit our application.properties file by adding the following properties.
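A hedged sketch of the two entries is shown here. The property names and the “filesystem” value come from this article; the index path is a placeholder chosen for illustration and should be replaced with a directory that suits your environment.

```properties
# Store the Lucene index on the local filesystem
spring.jpa.properties.hibernate.search.default.directory_provider=filesystem
# Base directory for the index files (hypothetical example path, pick your own)
spring.jpa.properties.hibernate.search.default.indexBase=./data/lucene-index
```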

The “spring.jpa.properties.hibernate.search.default.directory_provider” property selects how the index is stored. The value “filesystem” means that the index files are written to the local filesystem.

The “spring.jpa.properties.hibernate.search.default.indexBase” property indicates the base directory where the Lucene index files will be written.

Please note that the spring.jpa.properties. prefix is added so that Spring Boot’s autoconfiguration picks up the properties and configures Hibernate Search for us. If you are not using Spring Boot, then you can omit this prefix, but you will need to add these properties to a “hibernate.properties” file or configure them programmatically.
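For readers who are not using Spring Boot, the programmatic route could be sketched as below. The class and method names (`HibernateSearchSettings`, `localLuceneIndex`) and the example path are hypothetical; only the two property keys and the “filesystem” value come from this article. The sketch only builds the settings map and does not bootstrap Hibernate itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the Hibernate Search settings from this
// tutorial without the "spring.jpa.properties." prefix.
public class HibernateSearchSettings {

    /** Returns the two settings needed for a Lucene index on the local filesystem. */
    public static Map<String, String> localLuceneIndex(String indexBase) {
        Map<String, String> settings = new LinkedHashMap<>();
        // Store the index on the local filesystem
        settings.put("hibernate.search.default.directory_provider", "filesystem");
        // Base directory under which the index folders are created
        settings.put("hibernate.search.default.indexBase", indexBase);
        return settings;
    }

    public static void main(String[] args) {
        // "./data/lucene-index" is an example path, not a required value
        localLuceneIndex("./data/lucene-index")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

A map like this can, for example, be passed as the second argument of the standard JPA call `Persistence.createEntityManagerFactory(persistenceUnitName, settings)`.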

If you want to improve your Spring Boot skills, then I would suggest that you check out the book “Learning Spring Boot 2.0 – Second Edition: Simplify the development of lightning fast applications based on microservices and reactive programming” by Greg Turnquist. The book covers Spring Boot basics and provides an overview of integrating important technologies such as AMQP messaging and REST into your application.

Configuring indices on JPA entities 

Now that we have all the required configurations in place, it is time to configure the JPA entities for indexing. Let us take the following simple entity as an example.


The @Indexed annotation

In order to mark an entity for indexing, we will need to add the @Indexed annotation to the entity. This tells Hibernate Search that the entity contains indexed fields. This in turn causes Lucene to create an index file for the entity.


Please note that if you add the @Indexed annotation to a JPA entity class, the entity’s Id field will automatically be added to the index. This means that Id fields do not need any extra configuration to be included in the index. However, you can still add extra annotations to the Id field if you would like to change the default indexing behavior for that field.


The @Field annotation

In order to include an entity field into the index file, we will need to add the @Field annotation to the field. For example, to mark the “name” field for indexing, we will configure the field’s getter method as follows.
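As a hedged sketch, a complete entity with both annotations might look like this. The package and class name follow the index folder mentioned later in this article; the `id` type and the `@GeneratedValue` strategy are assumptions made for illustration. Because `@Field` sits on the getter, the `@Id` annotation is placed on a getter as well so that JPA uses property access consistently.

```java
package com.nullbeans.persistence.models;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;

@Entity
@Indexed // marks the entity for indexing; the id is added to the index automatically
public class Person {

    private Long id;
    private String name;

    @Id
    @GeneratedValue // assumption: ids are generated by the persistence provider
    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    @Field // includes the "name" property in the index
    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```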


Configuring Spring-boot to create the index files

There is one extra step that is required in order to complete our Lucene configuration, and that is to create the index files. By default, Hibernate Search will not create the files unless explicitly instructed to do so.

This can be done by using the FullTextEntityManager. If you have an efficient database design, then recreating the full index should take only a few seconds. Therefore, we recommend that you trigger the reindexing on application startup. There are a few reasons for this:

  • Your database may have been modified while your Spring Boot application was down, for example, due to maintenance or upgrades.
  • If your application did not shut down properly, there is no guarantee that your database and your index are in sync. Therefore, it is better to trigger the indexing so that the database and the index are consistent.
  • Reindexing cleans up any junk modifications that may have occurred in the index files while the application was down.


In order to achieve this behavior, we will create a new Spring bean called “LuceneIndexSupport”. On application startup, this bean will be created and through this bean, we will trigger the indexing. Let us start by first creating the bean.


Having the FullTextEntityManager contained in a bean has the advantage that it lives within your application context. Hence, it can be reused later on in the application to manually (or automatically) clean up and trigger the indexing.

The call fullTextEntityManager.createIndexer().startAndWait() is a synchronous call. Therefore, we recommend using it only if you have an efficient Lucene index design. If the operation takes a long time, then we recommend that you explore other options, such as the asynchronous .start() call.
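A minimal sketch of such a support class is shown below, assuming Hibernate Search 5.x. The class name “LuceneIndexSupport” comes from this article, while the constructor argument and the method name `rebuildIndex` are hypothetical choices made for illustration.

```java
import javax.persistence.EntityManagerFactory;

import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.Search;

public class LuceneIndexSupport {

    private final FullTextEntityManager fullTextEntityManager;

    // Assumption: the EntityManagerFactory is supplied by the Spring context
    public LuceneIndexSupport(EntityManagerFactory entityManagerFactory) {
        this.fullTextEntityManager =
                Search.getFullTextEntityManager(entityManagerFactory.createEntityManager());
    }

    /** Rebuilds the index for all indexed entities and blocks until it is done. */
    public void rebuildIndex() {
        try {
            // Synchronous: returns only after the mass indexer has finished
            fullTextEntityManager.createIndexer().startAndWait();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the failure to the caller
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while rebuilding the Lucene index", e);
        }
    }
}
```

A configuration class could then expose this class as a bean and call `rebuildIndex()` once during startup. If the rebuild takes too long, replacing `startAndWait()` with `start()` returns immediately and lets the indexing continue in the background.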

Now, let us start up this bean in a Java configuration file.


Finally, let us make sure that this configuration is picked up by the Spring Boot application using the @Import annotation.


If you are not sure how to configure Spring beans using Java configurations, then we recommend that you check out our tutorial “How to define and declare Spring beans using Java configuration and constructor injection”.

Notice that our “Person” table is empty. Therefore, let us create a person repository and add some data to our example using a quick “CommandLineRunner” bean.


Let us add some data by adding the CommandLineRunner bean to our application.


Let us start up our Spring Boot application. If everything was configured properly, then you should see the following statements in the logs.

Notice that the logs say “0 entities”. This is because on startup, our only indexed entity has no data. If we re-run the application, we should see the following in the logs.


You should be able to find the index in the location we defined in the application.properties file. The folder name should be the same as the fully qualified entity name. In our example, we will find the index in a folder named com.nullbeans.persistence.models.Person. The contents of the folder will look as follows.


Please keep in mind that while your application is running, Hibernate Search keeps your index up to date automatically. This means that you do not need to trigger the indexing every time you create a new entity or modify an existing one. Hibernate Search will automatically rewrite the affected index file.

Querying data from the index


What is the use of an index if we cannot query it, right? Let us create a query to search in our “Person” entity index. In order to search for items in a Lucene index, we need to create a Lucene Query. Luckily, you do not need to write complicated query syntax. We can simply use the QueryBuilder from the Hibernate Search query DSL.

The way it works is as follows:

  • Obtain a Hibernate Search QueryBuilder for the entity. It generates queries that are translated to Lucene queries.
  • Use the QueryBuilder to define the search parameters and create the Lucene query.
  • Wrap the Lucene query in a full-text query using the FullTextEntityManager.
  • Run the query.


Let us translate these steps into a test method.
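A minimal sketch of these steps, written as a reusable helper that a test could call, might look as follows. It assumes Hibernate Search 5.x; the class name `IndexSearch` and the method name `searchByKeyword` are hypothetical, and the entity type and field name are passed in by the caller.

```java
import java.util.List;

import javax.persistence.EntityManager;

import org.apache.lucene.search.Query;
import org.hibernate.search.jpa.FullTextEntityManager;
import org.hibernate.search.jpa.FullTextQuery;
import org.hibernate.search.jpa.Search;
import org.hibernate.search.query.dsl.QueryBuilder;

public class IndexSearch {

    @SuppressWarnings("unchecked")
    public static <T> List<T> searchByKeyword(EntityManager entityManager,
                                              Class<T> entityType,
                                              String field,
                                              String text) {
        FullTextEntityManager fullTextEntityManager =
                Search.getFullTextEntityManager(entityManager);

        // Step 1: obtain a query builder for the indexed entity
        QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory()
                .buildQueryBuilder()
                .forEntity(entityType)
                .get();

        // Step 2: define the search parameters and create the Lucene query
        Query luceneQuery = queryBuilder.keyword()
                .onField(field)
                .matching(text)
                .createQuery();

        // Step 3: wrap the Lucene query in a full-text query for this entity type
        FullTextQuery fullTextQuery =
                fullTextEntityManager.createFullTextQuery(luceneQuery, entityType);

        // Step 4: run the query; the results are managed entities
        return (List<T>) fullTextQuery.getResultList();
    }
}
```

For the entity in this tutorial, a test could call `IndexSearch.searchByKeyword(entityManager, Person.class, "name", searchText)` and print the returned list.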


If we run this test, we will get the following result (make sure to add a toString method to the Person class):


Notice that we were able to find our entity, even though the query was not an exact match for the person’s name. This is because Lucene tokenizes the indexed data. So if your query matches one or more tokens, then you will find the corresponding entity in the search results.
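To illustrate the idea of token matching, here is a deliberately simplified sketch in plain Java. It is not Lucene’s actual analyzer or index format; the class `TokenIndexSketch`, its methods and the sample names are hypothetical and exist only to show why a partially matching query still finds an entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class TokenIndexSketch {

    // token -> ids of the entries that contain it
    private final Map<String, Set<Long>> postings = new HashMap<>();

    // Lower-cases the text and splits it on anything that is not a letter or digit
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    public void add(long id, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, key -> new TreeSet<>()).add(id);
        }
    }

    // Returns the ids of all entries that share at least one token with the query
    public Set<Long> search(String query) {
        Set<Long> hits = new TreeSet<>();
        for (String token : tokenize(query)) {
            hits.addAll(postings.getOrDefault(token, Collections.emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        TokenIndexSketch index = new TokenIndexSketch();
        index.add(1L, "John Doe");
        index.add(2L, "Jane Roe");
        // Only the token "john" matches, which is enough to find entry 1
        System.out.println(index.search("john smith")); // prints [1]
    }
}
```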

Summary

In this tutorial, we indexed JPA entity data using Lucene and Hibernate Search. We explored how to configure a Spring Boot application to use these search frameworks in order to index and query data.

If you have any questions, then please let us know in the comments below!