Lists or arrays are commonly used data types in programming. However, Django’s database-agnostic core fields do not include one dedicated to storing lists in models. In a recent project, I needed to store a list of strings in a model, and I had to find an elegant and efficient way to do so.

In this article, I’ll show you how to store a list in a Django model using the ArrayField from the django.contrib.postgres.fields module, which perfectly fits this use case.

Don’t Reinvent the Wheel

Storing a list in a Django model is a common use case, and there are many ways to achieve this. You can store the list as a comma-separated string, a JSON string, or use a separate model to store the list items. However, these methods can be inefficient, inelegant, and may not fit all use cases.

Because Django’s core fields aim to be database-agnostic, none of them is dedicated to storing lists. Fortunately, if you use PostgreSQL, the django.contrib.postgres.fields module provides an ArrayField that allows you to store lists in a model.

This field saves you from writing custom code to serialize and deserialize the list, and it also allows you to query the list items using Django’s ORM.
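To make the "custom code" point concrete, here is a minimal sketch of what a hand-rolled comma-separated approach tends to require. The helper names serialize_days and deserialize_days are hypothetical and not part of Django; they only illustrate the kind of code ArrayField makes unnecessary.

```python
# Hypothetical helpers for a hand-rolled comma-separated text column.
# They are illustrative only and not part of Django.


def serialize_days(days):
    """Join a list of strings into one comma-separated string."""
    for day in days:
        if "," in day:
            # A comma inside an item would corrupt the round trip.
            raise ValueError(f"item may not contain a comma: {day!r}")
    return ",".join(days)


def deserialize_days(raw):
    """Split the stored string back into a list; an empty string means no items."""
    return raw.split(",") if raw else []


stored = serialize_days(["monday", "friday"])
restored = deserialize_days(stored)
```

Even this small version needs a rule for separators inside items and for the empty case, and the database still sees only an opaque string, so it cannot answer questions about individual items without pattern matching.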

Use Case

In my use case, I needed to store a list of days on which a student could choose to receive Python tips and lessons. Each item in the list could be any of the following days: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'].

The requirements were as follows:

  • The list of days should be stored in the database.
  • The list should have a limit of 5 items, raising an error if exceeded.
  • The list should be easily queryable using Django’s ORM.

Implementation

According to the documentation, ArrayField is:

A field for storing lists of data. Most field types can be used, and you pass another field instance as the base_field. You may also specify a size. ArrayField can be nested to store multi-dimensional arrays.

Let’s see how to use it in our model.

from django.contrib.auth.models import AbstractUser
from django.contrib.postgres.fields import ArrayField
from django.db import models
from django.utils.translation import gettext_lazy as _
# other imports....

class User(AbstractUser):
    """
    Default custom user model for DailyPy.
    """

    # other fields ...
    lesson_days = ArrayField(
        models.CharField(
            max_length=9,
            choices=[
                ("monday", _("Monday")),
                ("tuesday", _("Tuesday")),
                ("wednesday", _("Wednesday")),
                ("thursday", _("Thursday")),
                ("friday", _("Friday")),
                ("saturday", _("Saturday")),
                ("sunday", _("Sunday")),
            ],
        ),
        size=5,
        blank=True,
        default=list,
        help_text=_("Select up to 5 days for lessons"),
    )

Breakdown of the Code

  • We import ArrayField from django.contrib.postgres.fields and the models module, which provides CharField, from django.db.
  • We define a User model that inherits from Django’s AbstractUser model.
  • We define a lesson_days field, which is an ArrayField of CharField with a max_length of 9 and choices for the days of the week.
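  • We set the size to 5, limiting the number of items in the list to 5. PostgreSQL does not currently enforce this limit; Django checks it at the application level during model validation (for example in full_clean() or a ModelForm), so a bare save() will not catch an oversized list.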
  • The blank=True option allows the field to be empty, and default=list initializes the field with an empty list.
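The reason for passing the callable list rather than a literal [] can be shown in plain Python. This is a simplified sketch: FakeField is a hypothetical stand-in that mimics how a field resolves its default, not a Django class.

```python
# Plain-Python illustration; FakeField is a hypothetical stand-in, not a Django class.


class FakeField:
    def __init__(self, default):
        self.default = default

    def get_default(self):
        # Call the default if it is callable, otherwise return it as-is.
        return self.default() if callable(self.default) else self.default


shared = FakeField(default=[])   # one list object reused for every instance
fresh = FakeField(default=list)  # list() is called each time a default is needed

a = shared.get_default()
a.append("monday")
b = shared.get_default()         # b is the same object as a, so it already holds "monday"

c = fresh.get_default()
c.append("monday")
d = fresh.get_default()          # d is a new, empty list
```

With the literal, a change made through one instance leaks into the next; with the callable, every instance starts from its own empty list.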

This approach is feature-rich and more elegant than a custom solution. However, storing a list in the database is only part of the challenge. We also need to query the list items efficiently, and Django’s ORM provides powerful tools for this.

Querying the List

Django provides several lookups specifically for ArrayField:

# Contains lookup returns all users who have lessons on Monday
User.objects.filter(lesson_days__contains=["monday"])

# Contained_by lookup returns all users whose lesson days are a subset of
# Monday and Tuesday (including users with no days selected)
User.objects.filter(lesson_days__contained_by=["monday", "tuesday"])

# Overlap lookup returns all users who have lessons on Monday or Tuesday
User.objects.filter(lesson_days__overlap=["monday", "tuesday"])

More lookups can be found in the Django documentation.
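The difference between the three lookups is easiest to see as set operations. The sketch below models them in plain Python, without a database; the function names and the sample users are made up for illustration, and the comments name the PostgreSQL array operator each lookup corresponds to.

```python
# Plain-Python model of the three lookups; no database involved.
# The sample data and function names are hypothetical.


def contains(stored, values):
    """Every passed value appears in the stored list (PostgreSQL @>)."""
    return set(values) <= set(stored)


def contained_by(stored, values):
    """Every stored item appears in the passed values (PostgreSQL <@)."""
    return set(stored) <= set(values)


def overlap(stored, values):
    """Stored list and passed values share at least one item (PostgreSQL &&)."""
    return bool(set(stored) & set(values))


users = {
    "ana": ["monday"],
    "ben": ["monday", "tuesday", "friday"],
    "cy": [],
}
query = ["monday", "tuesday"]

matches = {
    fn.__name__: [name for name, days in users.items() if fn(days, query)]
    for fn in (contains, contained_by, overlap)
}
# matches == {
#     "contains": ["ben"],
#     "contained_by": ["ana", "cy"],
#     "overlap": ["ana", "ben"],
# }
```

Note that contained_by matches the user with an empty list, because an empty set is a subset of any set. If that is not what you want, combine it with an extra filter that excludes empty arrays.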

Advantages of ArrayField

  1. Efficient Storage:

    • ArrayField allows you to store multiple values in a single database column, which can be more efficient in terms of storage and retrieval compared to creating multiple related records.
  2. Simplified Schema:

    • Instead of creating a separate related table to store multiple values (e.g., using a ForeignKey or ManyToManyField), you can store the list directly in the main table. This simplifies your database schema and can make it easier to manage certain types of data.
  3. Performance:

    • Reading the list requires no join with other tables, because the array is stored in the same row as the rest of the record. This can reduce query complexity and improve performance for read operations.
  4. Querying Capabilities:

    • PostgreSQL offers powerful querying capabilities for arrays, such as searching for elements, checking if arrays overlap, and more. These capabilities are accessible directly from Django’s ORM.

Limitations of ArrayField

While ArrayField is powerful, it’s important to note that it is specific to PostgreSQL. If you plan to migrate to another database in the future, this could complicate the process. Additionally, ArrayField might not be suitable for very large datasets or for cases where you need highly normalized data.

Custom Validation in the Model

Although ArrayField already validates the size, you can add custom validation to the model to enforce rules it does not cover. Here we make the five-item limit explicit and also require the items to be unique.

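from django.contrib.auth.models import AbstractUser
from django.contrib.postgres.fields import ArrayField
from django.core.exceptions import ValidationError
from django.db import models
from django.utils.translation import gettext_lazy as _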

class User(AbstractUser):
    # other fields ...
    lesson_days = ArrayField(
        models.CharField(
            max_length=9,
            choices=[
                ("monday", _("Monday")),
                ("tuesday", _("Tuesday")),
                ("wednesday", _("Wednesday")),
                ("thursday", _("Thursday")),
                ("friday", _("Friday")),
                ("saturday", _("Saturday")),
                ("sunday", _("Sunday")),
            ],
        ),
        size=5,
        blank=True,
        default=list,
        help_text=_("Select up to 5 days for lessons"),
    )

    def clean(self):
        super().clean()
        if len(self.lesson_days) > 5:
            raise ValidationError("You can only select up to 5 days for lessons.")
        if len(set(self.lesson_days)) != len(self.lesson_days):
            raise ValidationError("Days must be unique.")

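In the clean method, we check if the length of the lesson_days list is greater than 5 and raise a ValidationError if it is. We also check that the list contains only unique items by converting it to a set and comparing the lengths. Keep in mind that Django calls clean() as part of full_clean(), which ModelForms run for you, and not when you call save() directly.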
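If you want to unit test these rules without a database, the same checks can be factored into a plain function. This is a sketch under stated assumptions: validate_lesson_days and DayListError are hypothetical names, and DayListError stands in for Django’s ValidationError so the snippet runs without Django installed.

```python
# Standalone version of the checks in clean(); names here are hypothetical.

MAX_DAYS = 5  # mirrors size=5 on the field


class DayListError(ValueError):
    """Stand-in for django.core.exceptions.ValidationError."""


def validate_lesson_days(days, max_days=MAX_DAYS):
    """Raise DayListError if the list is too long or contains repeated days."""
    if len(days) > max_days:
        raise DayListError(f"You can only select up to {max_days} days for lessons.")
    # Collect each repeated day once, sorted for a stable error message.
    duplicates = sorted({day for day in days if days.count(day) > 1})
    if duplicates:
        raise DayListError(f"Days must be unique; repeated: {', '.join(duplicates)}")


validate_lesson_days(["monday", "wednesday", "friday"])  # passes silently
```

Inside the model, clean() could call such a helper and re-raise its message as a ValidationError, which keeps the rule itself in one easily tested place.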

Conclusion

Storing a list in a Django model when using PostgreSQL is straightforward with the ArrayField from the django.contrib.postgres.fields module. This approach saves you from writing custom serialization and deserialization code and allows you to efficiently query list items using Django’s ORM.

By considering both the advantages and limitations of ArrayField, you can make an informed decision about whether it’s the right choice for your specific use case.