Raw Record Source

{
  "$type": "site.standard.document",
  "canonicalUrl": "https://rednafi.com/python/recipes-from-python-sqlite-docs/",
  "description": "Practical SQLite recipes for Python: execute statements, batch operations, transactions, row factories, and context managers with sqlite3.",
  "path": "/python/recipes-from-python-sqlite-docs/",
  "publishedAt": "2022-09-11T00:00:00.000Z",
  "site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
  "tags": [
    "Database",
    "Python",
    "SQL"
  ],
  "textContent": "While going through the documentation of Python's [sqlite3] module, I noticed that it's\nquite API-driven, where different parts of the module are explained in a prescriptive\nmanner. I, however, learn better from examples, recipes, and narratives. Although a few good\nrecipes already exist in the docs, I thought I'd also enlist some of the examples I tried\nout while grokking them.\n\nExecuting individual statements\n\nTo execute individual statements, you'll need to use the cursor_obj.execute(statement)\nprimitive.\n\nExecuting batch statements\n\nYou can bundle up multiple statements and execute them in a single go with the\ncursor_obj.executemany(template_statement, (data, ...)) API.\n\nApplying user-defined callbacks\n\nYou can define and apply arbitrary Python callbacks to different data points in an SQLite\ntable. There are two types of callbacks that you can apply:\n\n- Scalar function: A scalar function returns one value per invocation; in most cases, you\n  can think of this as returning one value per row.\n\n- Aggregate function: In contrast, an aggregate function returns one value per group of\n  rows.\n\nApplying user-defined scalar functions\n\nIn the following example, I've created a table called users with two text type columns -\nusername and password. Here, we define a transformation scalar function named sha256\nthat applies sha256 hashing to all the elements of the password column. The function is\nthen registered via the connection_obj.create_function(func_name, narg, func) API.\n\nApplying user-defined aggregate functions\n\nAggregate functions are defined as classes and then registered with the\nconnection_obj.create_aggregate(func_name, narg, aggregate_class) API. In the example\nbelow, I've created a table called series with a single integer type column val. To\ndefine an aggregate function, we'll need to write a class with two methods - step and\nfinalize where step will return the value of an intermediary progression step and\nfinalize will return the final result. Below, you can see that the aggregate function\nreturns a single value in the output.\n\nPrinting traceback when a user-defined callback raises an error\n\nBy default, sqlite3 will suppress the traceback of any error raised from an user-defined\nfunction. However, you can turn on the traceback option as follows:\n\nTransforming types\n\nConventionally, Python sqlite3 documentation uses the term _adaptation_ to refer to the\ntransformation that changes Python types to SQLite types and _conversion_ to refer to the\nchange in the reverse direction.\n\nAdapting Python types to SQLite types\n\nTo transform Python types to native SQLite types, you'll need to define a transformation\ncallback that'll carry out the task. Then the callback will need to be registered with the\nsqlite3.register_adapter(type, adapter_callback) API.\n\nHere, I've created an in-memory table called colors with a single text type column name\nthat refers to the name of the color. Then I register the lambda color: color.value\nanonymous function that serializes an enum value to a text value. This allows me to pass an\nenum member directly into the cursor_obj.execute method.\n\nConverting SQLite types to Python types\n\nConverting SQLite types to Python types works similarly to the previous section. Here, as\nwell, I've created the same colors table with a single name column as before. But this\ntime, I want to insert string values into the name column and get back native enum objects\nfrom that field while performing a get query.\n\nTo do so, I've registered a converter function with the\nsqlite3.register_converter(\"sqlite_type_as_a_string\", converter_callback) API. Another\npoint to keep in mind is that you'll have to set detect_type=sqlite3.PARSE_DECLTYPES in\nthe sqlite3.connection method for the adaptation to work. Notice the output of the last\nselect ... statement and you'll see that we're getting enum objects in the returned list.\n\nUsing the default adapters and converters\n\nThe sqlite3 module also employs some default adapters and converters that you can take\nadvantage of without defining and registering custom transformers. For example, SQLite\ndoesn't have any special types to represent a date or timestamp. However, Python sqlite3\nallows you to annotate a column with a special type and it'll automatically convert the\nvalues of the column to a compatible type of Python object while returning the result of a\nget query.\n\nHere, I've created a table called timekeeper with two columns - d and dt where d\nexpects a date and dt expects a timestamp. So, in the table creation DDL statement, we\nannotate the columns with date and timestamp types respectively. We've also turned on\ncolumn type parsing by setting\ndetect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES in the sqlite3.connect\nmethod.\n\nFinally, notice how we're inserting datetime.date and datetime.datetime objects directly\ninto the table. Also, this time, the final select ... statement looks a bit different.\nWe're specifying the expected type in the select ... statement and it's returning native\nPython objects in the returned list.\n\nImplementing authorization control\n\nSometimes you need control over what operations are allowed to be run on an SQLite database\nand what aren't. The connection_obj.set_authorizer(auth_callback) allows you to implement\nauthorization control. Here, auth_callback takes in 5 arguments. From the docs:\n\n> The 1st argument to the callback signifies what kind of operation is to be authorized. The\n> 2nd and 3rd arguments will be arguments or None depending on the 1st argument. The 4th\n> argument is the name of the database (\"main\", \"temp\", etc.) if applicable. The 5th\n> argument is the name of the inner-most trigger or view that is responsible for the access\n> attempt or None if this access attempt is directly from input SQL code. Please consult the\n> SQLite documentation about the possible values for the first argument and the meaning of\n> the second and third arguments depending on the first one.\n\nYou can find the list of all the [supported SQLite actions]. In the following example, I'm\ndisallowing create table, create index, drop table, and drop index actions. To deny an\naction, the auth_callback will have to return sqlite3.SQLITE_DENY and that'll raise an\nsqlite3.DatabaseError exception whenever a user tries to execute any of the restricted\nactions. Returning sqlite3.SQLITE_OK from the callback ensures that unfiltered actions can\nstill pass through the guardrail without incurring any errors.\n\nChanging the representation of a row\n\nThe sqlite3 module allows you to change the representation of a database row to your\nliking. By default, the result of a query comes out as a list of tuples where each tuple\nrepresents a single row. However, you can change the representation of database rows in such\na way that the result might come out as a list of dictionaries or a list of custom objects.\n\nVia an arbitrary container object as the row factory\n\nYou can attach a callback to the connection_obj.row_factory attribute to change how you\nwant to display the rows in a result list. The factory callback takes in two arguments -\ncursor and row where cursor is a tuple containing some metadata related to a single\ntable record and row is the default representation of a single database row as a tuple.\n\nIn the following snippet, just like before, I'm creating the same colors table with two\ncolumns - name and hex. Here, the row_factory function is the factory callback that\nconverts the default row representation from a tuple to a dictionary. We're then registering\nthe row_factory function with the connection_obj.row_factory = row_factory assignment\nstatement. Afterward, the sqlite3 module calls this statement on each record and\ntransforms the representation of the rows. When you run the snippet, you'll see that the\nresult comes out as a list of dictionaries instead of a list of tuples.\n\nVia a specialized Row object as the row factory\n\nInstead of rolling with your own custom row factory, you can also take advantage of the\nhighly optimized sqlite3.Row object. From the docs:\n\n> A Row instance serves as a highly optimized row_factory for Connection objects. It\n> supports iteration, equality testing, len(), and mapping access by column name and index.\n> Two row objects compare equal if have equal columns and equal members.\n\nIn the following example, I've reused the script from the previous section and just replaced\nthe custom row factory callback with sqlite3.Row. In the output, you'll see that the Row\nobject not only allows us to access the value of a column by row[column_name] syntax but\nalso let us convert the representation of the final result.\n\nVia text factory\n\nIf you need to apply a common transformation callback to multiple text columns, the\nsqlite3 module has a shortcut to do so. You can certainly write an ordinary row factory\nthat'll only transform the text columns but the connection_obj.text_factory attribute\nenables you to do that in a more elegant fashion. You can set\nconnection_obj.text_factory = row_factory and that'll selectively apply the row_factory\ncallback only to the text columns. In the following example, I'm applying an anonymous\nfunction to the text columns to translate the color names to English.\n\nCreating custom collation\n\nCollation defines how the string values in a text column are compared. It also dictates how\nthe data in the column will be ordered when you perform any kind of sort operation. A\ncollation callback can be registered with the\nconnection_obj.create_collation(name, collation_callback) syntax where the name denotes\nthe name of the collation rule and the collation_callback determines how the string\ncomparison should be done. The callback accepts two string values as arguments and returns:\n\n- 1 if the first is ordered higher than the second\n- -1 if the first is ordered lower than the second\n- 0 if they are ordered equal\n\nThen you can use the collation rules with an order by clause as follows:\n\nHere's a full example of a collation callback in action:\n\nRegistering trace callbacks to introspect running SQL statements\n\nDuring debugging, I often find it helpful to be able to trace all the SQL statements running\nunder a certain connection. This becomes even more useful in a multiprocessing environment\nwhere each process opens a new connection to the DB and runs its own sets of SQL queries. We\ncan leverage the connection_obj.set_trace_callback method to trace the statements. The\nset_trace_callback method accepts a callable that takes a single argument and sqlite3\nmodule passes the currently running statement to the callback every time it invokes it.\nNotice how the output prints all the statements executed by SQLite behind the scene. This\nalso reveals that cursor_obj.executemany wraps up multiple statements in a transaction and\nruns them in an atomic manner.\n\nBacking up a database\n\nThere are a few ways you can back up your database file via Python sqlite3.\n\nDumping the database iteratively\n\nThe following snippet creates a table, inserts some data into it, and then, iteratively\nfetches the database content via the connection_obj.iterdump() API. Afterward, the\nreturned content is written to another database file using the file.write primitive.\n\nFor demonstration purposes, I'm using an in-memory DB and backing that up in another\nNamedTemporaryFile. This will work the same way with an on-disk DB and on-disk backup file\nas well. One advantage of this approach is that your data is not loaded into memory at once,\nrather it's streamed iteratively from the main DB to the backup DB.\n\nCopying an on-disk database to another\n\nThis example shows another approach that you can adopt to create a second copy of your\non-disk DB. First, it connects to the source DB and then creates another connection to an\nempty backup DB. Afterward, the source data is backed up to the destination DB with the\nconnection_obj_source.backup(connection_obj_destination) API.\n\nThe .backup method takes in three values - a connection object that points to the\ndestination DB, the number of pages to copy in a single pass, and a callback to introspect\nthe progress. You can set the value of the progress parameter to -1 if you want to load\nthe entire source database into memory and copy everything to the destination in a single\npass. Also, in this example, the progress hook just prints the progress of the copied\npages.\n\nLoading an on-disk database into the memory\n\nThe connection_obj.backup API also lets you load your existing database into memory. This\nis helpful when the DB you're working with is small and you want to leverage the extra\nperformance benefits that come with an in-memory DB. The workflow is almost exactly the same\nas before and the only difference is that the destination connection object points to an\nin-memory DB instead of an on-disk one.\n\nCopying an in-memory database to an on-disk file\n\nYou can also dump your in-memory DB into the disk. Just point the source connection object\nto the in-memory DB and the destination connection to the on-disk DB file.\n\nImplementing a full text search engine\n\nThis is not exactly a feature that's specific to the sqlite3 API. However, I wanted to\nshowcase how effortless it is to leverage SQLite's native features via the Python API. The\nfollowing example creates a virtual table and implements a full-text search engine that\nallows us to fuzzy search the colors in the colors table by their names or hex values.\n\n\n\n\n[sqlite3]:\n    https://docs.python.org/3/library/sqlite3.html\n\n[supported SQLite actions]:\n    https://www.sqlite.org/c3ref/constlist.html",
  "title": "Recipes from Python SQLite docs"
}