Lists or arrays are commonly used data types in programming. However, Django does not have a built-in field for storing lists in models. In a recent project, I needed to store a list of strings in a model, and I had to find an elegant and efficient way to store the list in the database.
In this article, I’ll show you how to store a list in a Django model using the ArrayField from the django.contrib.postgres.fields module, which fits this use case well.
Don’t Reinvent the Wheel
Storing a list in a Django model is a common use case, and there are many ways to achieve this. You can store the list as a comma-separated string, a JSON string, or use a separate model to store the list items. However, these methods can be inefficient, inelegant, and may not fit all use cases.
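To make the trade-off concrete, here is a minimal, framework-free sketch of the first two alternatives. The helper names (to_csv, from_csv, to_json, from_json) are made up for illustration and are not part of Django. It shows the serialization code you would have to write and maintain yourself, and one way the comma-separated variant can go wrong.

```python
import json

DAYS = ["monday", "wednesday", "friday"]


def to_csv(items):
    """Join the items into one comma-separated string for a text column."""
    return ",".join(items)


def from_csv(text):
    """Split the stored string back into a list (empty string -> empty list)."""
    return text.split(",") if text else []


def to_json(items):
    """Serialize the list as a JSON string."""
    return json.dumps(items)


def from_json(text):
    """Parse the JSON string back into a list."""
    return json.loads(text)


# Both approaches round-trip simple values.
csv_roundtrip = from_csv(to_csv(DAYS))
json_roundtrip = from_json(to_json(DAYS))

# A value that itself contains a comma is silently split in two by the
# comma-separated approach, while JSON preserves it.
tricky = ["lunch, then code"]
csv_tricky = from_csv(to_csv(tricky))
json_tricky = from_json(to_json(tricky))
```

Either way, the database only sees an opaque string, so filtering on individual items has to happen in Python or through string matching.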
While Django aims to be database-agnostic, it does not include a built-in field for storing lists. Fortunately, the django.contrib.postgres.fields module provides an ArrayField that allows you to store lists in a model.
This field saves you from writing custom code to serialize and deserialize the list, and it also allows you to query the list items using Django’s ORM.
Use Case
In my use case, I needed to store a list of days on which a student could choose to receive Python tips and lessons. The list of days could be any of the following: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'].
The requirements were as follows:
- The list of days should be stored in the database.
- The list should have a limit of 5 items, raising an error if exceeded.
- The list should be easily queryable using Django’s ORM.
Implementation
According to the documentation, ArrayField is:
A field for storing lists of data. Most field types can be used, and you pass another field instance as the base_field. You may also specify a size. ArrayField can be nested to store multi-dimensional arrays.
Let’s see how to use it in our model.
from django.contrib.auth.models import AbstractUser
from django.contrib.postgres.fields import ArrayField
from django.db import models
from django.utils.translation import gettext_lazy as _

# other imports....


class User(AbstractUser):
    """
    Default custom user model for DailyPy.
    """

    # other fields ...

    # Days on which the user wants to receive lessons (at most 5).
    lesson_days = ArrayField(
        models.CharField(
            max_length=9,
            choices=[
                ("monday", _("Monday")),
                ("tuesday", _("Tuesday")),
                ("wednesday", _("Wednesday")),
                ("thursday", _("Thursday")),
                ("friday", _("Friday")),
                ("saturday", _("Saturday")),
                ("sunday", _("Sunday")),
            ],
        ),
        size=5,
        blank=True,
        default=list,
        help_text=_("Select up to 5 days for lessons"),
    )
Breakdown of the Code
- We import ArrayField from django.contrib.postgres.fields and models (which provides CharField) from django.db.
- We define a User model that inherits from Django’s AbstractUser model.
- We define a lesson_days field, which is an ArrayField of CharField with a max_length of 9 and choices for the days of the week.
- We set the size to 5, limiting the number of items in the list to 5. However, this is not enforced at the database level but at the application level.
- The blank=True option allows the field to be empty, and default=list initializes the field with an empty list.
This approach is feature-rich and more elegant than a custom solution. However, storing a list in the database is only part of the challenge. We also need to query the list items efficiently, and Django’s ORM provides powerful tools for this.
Querying the List
Django provides several lookups specifically for ArrayField:
# Contains lookup returns all users who have lessons on Monday
User.objects.filter(lesson_days__contains=["monday"])
# Contained_by lookup returns all users whose lesson days are a subset of the given list (only Monday and/or Tuesday)
User.objects.filter(lesson_days__contained_by=["monday", "tuesday"])
# Overlap lookup returns all users who have lessons on Monday or Tuesday
User.objects.filter(lesson_days__overlap=["monday", "tuesday"])
More lookups can be found in the Django documentation.
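These three lookups map to PostgreSQL’s array operators @>, <@ and &&, which treat arrays like sets when comparing them. The sketch below mirrors that behavior in plain Python so the differences are easy to see without a database. The helper functions and the sample users dictionary are made up for illustration; the real filtering happens in PostgreSQL.

```python
def contains(stored, values):
    """Like lesson_days__contains: every given value is in the stored list."""
    return set(values) <= set(stored)


def contained_by(stored, values):
    """Like lesson_days__contained_by: the stored list is a subset of the given values."""
    return set(stored) <= set(values)


def overlap(stored, values):
    """Like lesson_days__overlap: the two lists share at least one value."""
    return bool(set(stored) & set(values))


# Hypothetical sample data: user name -> stored lesson_days.
users = {
    "ada": ["monday", "tuesday", "friday"],
    "bob": ["monday"],
    "cy": ["sunday"],
    "dee": [],
}


def matching(lookup, values):
    """Return the sorted names of users whose days satisfy the lookup."""
    return sorted(name for name, days in users.items() if lookup(days, values))


contains_monday = matching(contains, ["monday"])
contained_in_mon_tue = matching(contained_by, ["monday", "tuesday"])
overlaps_mon_tue = matching(overlap, ["monday", "tuesday"])
```

Note that contained_by is the easiest to misread: it excludes "ada", who has an extra day, and it includes "dee", because an empty list is a subset of any list.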
Advantages of ArrayField
Efficient Storage:
- ArrayField allows you to store multiple values in a single database column, which can be more efficient in terms of storage and retrieval compared to creating multiple related records.
Simplified Schema:
- Instead of creating a separate related table to store multiple values (e.g., using a ForeignKey or ManyToManyField), you can store the list directly in the main table. This simplifies your database schema and can make it easier to manage certain types of data.
Performance:
- Retrieving the values can be faster since the array is stored in the same table as the rest of the row. There’s no need to join with other tables, which can reduce query complexity and improve performance for read operations.
Querying Capabilities:
- PostgreSQL offers powerful querying capabilities for arrays, such as searching for elements, checking if arrays overlap, and more. These capabilities are accessible directly from Django’s ORM.
Limitations of ArrayField
While ArrayField is powerful, it’s important to note that it is specific to PostgreSQL. If you plan to migrate to another database in the future, this could complicate the process. Additionally, ArrayField might not be suitable for very large datasets or for cases where you need highly normalized data.
Custom Validation in the Model
The size option already limits the list to 5 items during validation, but you can still add custom validation to the model to make that limit explicit and to ensure that the items are unique.
from django.core.exceptions import ValidationError


class User(AbstractUser):
    # other fields ...

    lesson_days = ArrayField(
        models.CharField(
            max_length=9,
            choices=[
                ("monday", _("Monday")),
                ("tuesday", _("Tuesday")),
                ("wednesday", _("Wednesday")),
                ("thursday", _("Thursday")),
                ("friday", _("Friday")),
                ("saturday", _("Saturday")),
                ("sunday", _("Sunday")),
            ],
        ),
        size=5,
        blank=True,
        default=list,
        help_text=_("Select up to 5 days for lessons"),
    )

    def clean(self):
        super().clean()
        # Reject lists longer than the allowed maximum.
        if len(self.lesson_days) > 5:
            raise ValidationError("You can only select up to 5 days for lessons.")
        # Reject duplicates: a set drops repeated items, so the lengths differ.
        if len(set(self.lesson_days)) != len(self.lesson_days):
            raise ValidationError("Days must be unique.")
In the clean method, we check if the length of the lesson_days list is greater than 5 and raise a ValidationError if it is. We also check if the list contains unique items by converting it to a set and comparing the lengths.
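The two checks can also be tried outside Django. The sketch below pulls them into a standalone function, validate_lesson_days, which is a hypothetical helper and not part of the model above. Unlike clean, which raises on the first failure, it collects every message so that both rules can be observed at once.

```python
MAX_DAYS = 5  # same limit as size=5 on the field


def validate_lesson_days(days):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    # Too many items selected.
    if len(days) > MAX_DAYS:
        errors.append("You can only select up to 5 days for lessons.")
    # Duplicates: converting to a set drops repeated items.
    if len(set(days)) != len(days):
        errors.append("Days must be unique.")
    return errors


ok = validate_lesson_days(["monday", "tuesday"])
duplicated = validate_lesson_days(["monday", "monday"])
too_many = validate_lesson_days(
    ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday"]
)
```

Keep in mind that Django does not call clean when you call save. Model forms run it for you, but in other code paths you need to call full_clean() on the instance before saving for these checks to apply.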
Conclusion
Storing a list in a Django model when using PostgreSQL is straightforward with the ArrayField from the django.contrib.postgres.fields module. This approach saves you from writing custom serialization and deserialization code and allows you to efficiently query list items using Django’s ORM.
By considering both the advantages and limitations of ArrayField, you can make an informed decision about whether it’s the right choice for your specific use case.