{
  "$type": "site.standard.document",
  "canonicalUrl": "https://rednafi.com/python/manipulate-text-with-django-query-expression/",
  "description": "Use Django query expressions like Replace, Upper, Lower, Concat, and Substr for efficient database-level text manipulation without fetching data.",
  "path": "/python/manipulate-text-with-django-query-expression/",
  "publishedAt": "2023-01-07T00:00:00.000Z",
  "site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
  "tags": [
    "Python",
    "Django",
    "Database"
  ],
  "textContent": "I was working with a table that had a similar (simplified) structure like this:\n\nLet's say the above table is represented by the following Django model:\n\nI needed to extract the file names with their extensions from the file_path column and\ncreate new paths by adding the prefix dir/ before each file name. This would involve\nstripping everything before the file name from a file path and adding the prefix, resulting\nin a list of new file paths like this: ['dir/file_1.pdf', ..., 'dir/image_2.jpg'].\n\nUsing Django ORM and some imperative Python code you could do the following:\n\nHere, we use the FileCabinet model to make a query and obtain the file paths. We then use\nPython to split the file paths and extract the file names, and add the prefix dir/ to\ncreate the new paths. While this approach is relatively simple, it can be slow and\nresource-intensive if the size of the working dataset is large. This is because the entire\nworking dataset is loaded into memory and the text manipulation is performed in Python.\n\nTo improve performance and efficiency, Django offers a declarative approach using\nexpressions. These expressions allow you to offload operations like this to the database,\nwhich can be significantly faster and less resource-intensive than the imperative approach,\nespecially for larger querysets. Here's how you can achieve the same result in a declarative\nmanner:\n\nYou can see the new file paths by inspecting the file_cabinet queryset as follows:\n\nThis will give you the following queryset:\n\nNow, let's step through the each of the ORM functionality that was levereged here:\n\nThe annotate function is being used to add additional information to each returned\nFileCabinet object. This function allows you to specify additional fields that should be\ncalculated and included in the returned queryset. Inside the annotation method, we use F\nobjects to reference a model field within the query. They can be used to refer to a field's\nvalue in the context of an update or filter, rather than referring to the actual field\nitself.\n\nThree fields are being added to the FileCabinet objects: last_occur, file_name, and\nfile_path_new.\n\nlast_occur is being calculated by using the StrIndex function. This function takes two\narguments: the string to search and the string to search for. In this case, the string being\nsearched is the file_path field, but it has been passed through the Reverse function to\nreverse the string. This is done so that the StrIndex function starts searching from the\nend of the string, rather than the beginning. The second argument to StrIndex is the\nstring to search for, which in this case is /. The StrIndex function returns the\nposition of the first occurrence of the search string in the main string.\n\nfile_name is being calculated by using the Right function. This function takes two\narguments: the string to extract from and the number of characters to extract. In this case,\nthe string being extracted from is the file_path field, and the number of characters to\nextract is specified by the last_occur field. The last_occur field represents the\nposition of the last occurrence of / in the file_path field, so extracting the characters\nfrom this position onwards gives us the file name with its extension. The - 1 at the end\nis used to remove the / character itself from the extracted string.\n\nFinally, the file_path_new is constructed by using the Concat function. This function\ntakes a variable number of arguments and concatenates them together into a single string. In\nthis case, the dir/ prefix is being concatenated with file_name field.\n\nPerfection!\n\nFurther reading\n\n- [Do database work in the database rather than in Python]\n- [Use StrIndex to get the last instance of a character for an annotation in Django]\n\n\n\n\n[do database work in the database rather than in python]:\n    https://docs.djangoproject.com/en/4.1/topics/db/optimization/#do-database-work-in-the-database-rather-than-in-python\n\n[use strindex to get the last instance of a character for an annotation in django]:\n    https://stackoverflow.com/questions/67030571/django-use-strindex-to-get-the-last-instance-of-a-character-for-an-annotation",
  "title": "Manipulating text with query expressions in Django"
}