{
"$type": "site.standard.document",
"canonicalUrl": "https://rednafi.com/python/tqdm-progressbar-with-concurrent-futures/",
"description": "Display progress bars for concurrent Python tasks using tqdm with ThreadPoolExecutor and as_completed for real-time execution monitoring.",
"path": "/python/tqdm-progressbar-with-concurrent-futures/",
"publishedAt": "2023-01-06T00:00:00.000Z",
"site": "at://did:plc:fgtm2c26vfcj74rfmeggbyqj/site.standard.publication/3mnl6f7ob462z",
"tags": [
"Python",
"Concurrency"
],
"textContent": "At my workplace, I was writing a script to download multiple files from different S3\nbuckets. The script relied on the Django ORM, so I couldn't use Python's async paradigm to\nspeed up the process. Instead, I opted for boto3 to download the files and\nconcurrent.futures.ThreadPoolExecutor to spin up multiple threads and make the requests\nconcurrently.\n\nHowever, since the script was expected to be long-running, I needed to display progress bars\nto show the state of execution. It's quite easy to do with tqdm when you're just looping\nover a list of file paths and downloading the contents synchronously:\n\nBut you can't do this when multiple threads or processes are doing the work. Here's what\nI've found works quite well:\n\nRunning this will print:\n\nThis script makes 5 concurrent requests by leveraging ThreadPoolExecutor from the\nconcurrent.futures module. The make_request function sends one request to a URL and\nsleeps for a second to simulate a long-running task. The make_requests function then spins\nup 5 threads and calls make_request in each one with a different URL.\n\nHere, we're instantiating tqdm as a context manager and passing the number of URLs as the\ntotal. This lets tqdm work out how far along the progress bar should be. Then, in a nested\ncontext manager, we spin up the threads and pass make_request to executor.submit. We\ncollect the future objects returned by the executor.submit calls in a list and update\nthe progress bar with pbar.update(1) while iterating through the futures with as_completed.\nAnd that's it, mission successful.\n\nI usually use contextlib.ExitStack to avoid nested context managers like this:\n\nRunning this script will yield the same result as before.\n\nFurther reading\n\n- [How to use tqdm with multithreading?]\n\n[how to use tqdm with multithreading?]:\n https://stackoverflow.com/questions/63826035/how-to-use-tqdm-with-multithreading",
"title": "Using tqdm with concurrent.futures in Python"
}