
A while back I was working on a Django project and wanted to implement fast free text search. Instead of using a regular database for this search function, such as MySQL or PostgreSQL, I decided to use a NoSQL database. That is when I discovered ElasticSearch.
ElasticSearch indexes documents for your data instead of using data tables like a regular relational database does. This speeds up search, and offers a lot of other benefits that you don’t get with a regular database. I kept a regular relational database as well for storing user details, logins, and other data that ElasticSearch didn’t need to index.
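To give a rough feel for why indexing documents speeds up search, here is a toy sketch in plain Python. It is not how ElasticSearch is implemented internally (the real thing adds analysis, scoring, sharding and much more), and the two sample posts and the function names are made up for illustration. It only shows the basic idea of an inverted index: map each word to the documents that contain it, so a search becomes a lookup instead of a scan.

```python
from collections import defaultdict

# Two hypothetical blog posts, stored as JSON-like documents rather than table rows.
documents = {
    1: {"title": "Getting started with Django", "text": "Django is a web framework"},
    2: {"title": "Search basics", "text": "Fast free text search with an index"},
}


def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in ("title", "text"):
            for word in doc[field].lower().split():
                index[word].add(doc_id)
    return index


def search(index, word):
    """Look the word up directly instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


index = build_inverted_index(documents)
print(search(index, "django"))  # [1]
print(search(index, "with"))    # [1, 2]
```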
After searching for a long time on how to properly implement ElasticSearch with Django, I didn’t really find any satisfying answers. Some guides or tutorials were convoluted and seemed to be taking unnecessary steps in order to index the data into ElasticSearch. There was quite a bit of information on how to perform searching, but not as much about how the indexing should be done. I felt like there must be a simpler solution out there, so I decided to give it a try myself.
I wanted to keep it as simple as possible, because simple solutions tend to be the best ones in my opinion. KISS (Keep It Simple, Stupid), Less is More and all of that stuff is something that resonates with me a lot, especially when every other solution out there is complex. I decided to use Honza Král’s example in this video to have something to base my code on. I recommend watching it, although it is a bit outdated at this point.
Since I was using Django, which is written in Python, it was easy to interact with ElasticSearch. There are two client libraries for interacting with ElasticSearch from Python. There’s elasticsearch-py, which is the official low-level client. And there’s elasticsearch-dsl, which is built upon the former but gives a higher-level abstraction with a bit less functionality.
We will get into some examples soon, but first I need to clarify what we want to accomplish:
- Setting up ElasticSearch on our local machine and ensuring it works properly
- Setting up a new Django project
- Bulk indexing of data that is already in the database
- Indexing of every new instance that a user saves to the database
- A basic search example
All right, that seems simple enough. Let’s get started by installing ElasticSearch on our machine. Also, all the code will be available on my GitHub so that you can easily follow the examples.

Installing ElasticSearch
Since ElasticSearch runs on Java, you must ensure you have an up-to-date JVM. Check which version you have with java -version in the terminal. Then run the following commands to create a new directory, download, extract and start ElasticSearch:
mkdir elasticsearch-example
cd elasticsearch-example
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.1.1.tar.gz
tar -xzf elasticsearch-5.1.1.tar.gz
./elasticsearch-5.1.1/bin/elasticsearch
When ElasticSearch starts up, there should be a lot of output printed to the terminal window. To check that it’s up and running correctly, open a new terminal window and run this curl command:
curl -XGET http://localhost:9200
The response should be something like this:
{
  "name" : "6xIrzqq",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "eUH9REKyQOy4RKPzkuRI1g",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}
Great, you now have ElasticSearch running on your local machine! It’s time to set up your Django project.
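If you would rather run that check from Python than from curl, something like the following sketch should do, using only the standard library. It assumes the unsecured local 5.1.1 node from this guide listening on http://localhost:9200; the function names are my own and not part of any ElasticSearch client.

```python
import json
from urllib.request import urlopen


def parse_cluster_info(body):
    """Pull the node name and version number out of the root endpoint's JSON."""
    info = json.loads(body)
    return info["name"], info["version"]["number"]


def check_elasticsearch(url="http://localhost:9200"):
    """Fetch the root endpoint; raises URLError if nothing is listening."""
    with urlopen(url, timeout=5) as response:
        return parse_cluster_info(response.read().decode("utf-8"))


# Parsing a trimmed copy of the sample response shown above:
sample = '{"name": "6xIrzqq", "version": {"number": "5.1.1"}}'
print(parse_cluster_info(sample))  # ('6xIrzqq', '5.1.1')
```

With ElasticSearch running, calling check_elasticsearch() should return the node name and version of your own instance.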

Setting up a Django project
First you create a virtual environment with virtualenv venv and enter it with source venv/bin/activate in order to keep everything contained. Then you install some packages:
pip install django
pip install elasticsearch-dsl
To start a new Django project you run:
django-admin startproject elasticsearchproject
cd elasticsearchproject
python manage.py startapp elasticsearchapp
After you have created your new Django project, you need to create a model to work with. For this guide I chose to go with a good old-fashioned blog post example. In models.py you place the following code:
from django.db import models
from django.utils import timezone
from django.contrib.auth.models import User
# Create your models here.
# Blogpost to be indexed into ElasticSearch
class BlogPost(models.Model):
    author = models.ForeignKey(User, on_delete=models.CASCADE, related_name='blogpost')
    posted_date = models.DateField(default=timezone.now)
    title = models.CharField(max_length=200)
    text = models.TextField(max_length=1000)
Pretty straightforward so far. Don’t forget to add elasticsearchapp to INSTALLED_APPS in settings.py and register your new BlogPost model in admin.py like this:
from django.contrib import admin
from .models import BlogPost
# Register your models here.
# Need to register my BlogPost so it shows up in the admin
admin.site.register(BlogPost)
You must also run python manage.py makemigrations, python manage.py migrate and python manage.py createsuperuser to create the database and an admin account. Now run python manage.py runserver, go to http://localhost:8000/admin/ and log in. You should now be able to see your Blog posts model there. Go ahead and create your first blog post in the admin.
Congratulations, you now have a functioning Django project! It’s finally time to get into the fun stuff: connecting ElasticSearch.

Connecting ElasticSearch with Django
You begin by creating a new file called search.py in your elasticsearchapp directory. This is where the ElasticSearch code will live. The first thing you need to do here is to create a connection from your Django application to ElasticSearch. You do this in your search.py file:
from elasticsearch_dsl.connections import connections
connections.create_connection()
Now that you have a global connection to your ElasticSearch set-up you need to define what you want to index