{ "cells": [ { "cell_type": "markdown", "id": "8f399a66-d82e-4210-a08e-4501903b6e2b", "metadata": { "tags": [] }, "source": [ "# Memory Profiling Python Scripts in the MAAP ADE\n", "\n", "Authors: Rajat Shinde (UAH), Alex Mandel (Development Seed), Jamison French (Development Seed), Sheyenne Kirkland (UAH), Brian Freitag (NASA MSFC), Chuck Daniels (Development Seed)\n", "\n", "Date: February 7, 2024\n", "\n", "Description: Profiling the memory usage of your Python script is good practice for understanding its resource requirements. This is useful when you have working code and want to estimate the size of the DPS worker it needs. It also helps you optimize the code to reduce those requirements.\n", "\n", "In this tutorial, we will use [memory-profiler](https://pypi.org/project/memory-profiler/) to profile a sample Python script, [demo_memory_profiling.py](./demo_memory_profiling.py). We will also see how to log the output to a `.log` file.\n", "\n", "### Run This Notebook\n", "To access and run this tutorial within MAAP's Algorithm Development Environment (ADE), please refer to the [\"Getting started with the MAAP\"](https://docs.maap-project.org/en/latest/getting_started/getting_started.html) section of our documentation.\n", "\n", "Disclaimer: It is highly recommended to run a tutorial within MAAP's ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.\n", "\n", "### Additional Resources\n", "\n", "1. [https://github.com/pythonprofilers/memory_profiler](https://github.com/pythonprofilers/memory_profiler)" ] }, { "cell_type": "markdown", "id": "24efa20c-66e9-45d2-bc6d-695b38329158", "metadata": {}, "source": [ "### Installation \n", "\n", "We will begin by installing `memory-profiler` in the current working environment." 
] }, { "cell_type": "code", "execution_count": 1, "id": "c54089e1-d6ec-4386-95da-06fdd3c357cb", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Collecting memory-profiler\n", " Using cached memory_profiler-0.61.0-py3-none-any.whl (31 kB)\n", "Requirement already satisfied: psutil in /opt/conda/envs/vanilla/lib/python3.10/site-packages (from memory-profiler) (5.9.7)\n", "Installing collected packages: memory-profiler\n", "Successfully installed memory-profiler-0.61.0\n", "\u001b[33mWARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv\u001b[0m\u001b[33m\n", "\u001b[0m" ] } ], "source": [ "# !pip install -U memory-profiler" ] }, { "cell_type": "markdown", "id": "5a3baaf1-8d1b-4e61-9798-f205fc2648fd", "metadata": {}, "source": [ "### Add Decorator \n", "\n", "Analyzing code typically requires line-by-line memory usage. For this example, we create two dummy functions to profile, named `my_function` and `my_other_function`. \n", "\n", "You may add the `@profile` decorator to individual functions that you want to profile. This allows you to limit which parts of your program are profiled, thus limiting the volume of profiling output.\n", "\n", "However, this requires you to modify your code by importing `profile` and decorating each function of interest. If you wish to avoid the import, you can instead run the script with `python -m memory_profiler`, which makes the `@profile` decorator available without it. Note that the functions you want reported must still carry the `@profile` decorator." 
] }, { "cell_type": "code", "execution_count": 13, "id": "f4c9941c-e8b8-4c3c-b545-a173284248d8", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ERROR: Could not find file /tmp/ipykernel_314/3429381877.py\n", "ERROR: Could not find file /tmp/ipykernel_314/3429381877.py\n" ] } ], "source": [ "from memory_profiler import profile\n", "\n", "@profile\n", "def my_function():\n", "    # Perform some potentially memory-intensive computation\n", "\n", "    return 0\n", "\n", "@profile\n", "def my_other_function():\n", "    # Perform some potentially memory-intensive computation\n", "\n", "    return 0\n", "\n", "def main():\n", "    my_function()\n", "    my_other_function()\n", "\n", "    #...\n", "\n", "if __name__ == \"__main__\":\n", "    main()" ] }, { "cell_type": "markdown", "id": "de6a276c-78a9-4769-9fa8-f387deaac9de", "metadata": {}, "source": [ "### Running Memory Profiler \n", "\n", "To show how to run the memory profiler on an existing Python script from a Jupyter notebook, we copied the code snippet above to a file named `demo_memory_profiling.py` in the working directory. After the script runs, the output reports, for each profiled line, the memory usage after that line executed (`Mem usage`) and the change in memory relative to the previous line (`Increment`)." 
] }, { "cell_type": "code", "execution_count": 10, "id": "dababd0c-74e8-4b26-ae15-f25ccbac3c0e", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Filename: /projects/maap-documentation/docs/source/technical_tutorials/user_data/demo_memory_profiling.py\n", "\n", "Line # Mem usage Increment Occurrences Line Contents\n", "=============================================================\n", " 6 43.6 MiB 43.6 MiB 1 @profile\n", " 7 def my_function():\n", " 8 # Include each line of the script which needs to be profiled\n", " 9 # under this function\n", " 10 \n", " 11 43.6 MiB 0.0 MiB 1 return 0\n", "\n", "\n", "Filename: /projects/maap-documentation/docs/source/technical_tutorials/user_data/demo_memory_profiling.py\n", "\n", "Line # Mem usage Increment Occurrences Line Contents\n", "=============================================================\n", " 13 43.6 MiB 43.6 MiB 1 @profile\n", " 14 def my_other_function():\n", " 15 # Include each line of the script which needs to be profiled\n", " 16 # under this function\n", " 17 \n", " 18 43.6 MiB 0.0 MiB 1 return 0\n", "\n", "\n" ] } ], "source": [ "# With @profile decorator in the script\n", "\n", "!python demo_memory_profiling.py" ] }, { "cell_type": "markdown", "id": "8122d39c-d7d8-4db4-9b08-b89280ee24ce", "metadata": {}, "source": [ "### Logging the Output\n", "\n", "By default, the profiling output is written to standard output, so it appears in the cell output or on the command line. To store it in a log file instead, pass an open file object to the decorator's `stream` argument. For more details, see the [documentation](https://github.com/pythonprofilers/memory_profiler?tab=readme-ov-file#reporting)." 
] }, { "cell_type": "code", "execution_count": 14, "id": "24518fe3-973e-4c23-ba46-afb04da2815e", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ERROR: Could not find file /tmp/ipykernel_314/1328302985.py\n", "ERROR: Could not find file /tmp/ipykernel_314/1328302985.py\n" ] } ], "source": [ "from memory_profiler import profile\n", "\n", "# Open the log file that will receive the profiling output\n", "fp = open('memory_profiler.log', 'w+')\n", "\n", "@profile(stream=fp)\n", "def my_function():\n", "    # Perform some potentially memory-intensive computation\n", "\n", "    return 0\n", "\n", "@profile(stream=fp)\n", "def my_other_function():\n", "    # Perform some potentially memory-intensive computation\n", "\n", "    return 0\n", "\n", "def main():\n", "    my_function()\n", "    my_other_function()\n", "\n", "    #...\n", "\n", "if __name__ == \"__main__\":\n", "    main()" ] }, { "cell_type": "markdown", "id": "56291280-e0ad-409b-bce2-0c96c07664e7", "metadata": {}, "source": [ "To test the logging, we will run memory profiling on the `demo_memory_profiling_logging.py` script saved in the working directory." ] }, { "cell_type": "code", "execution_count": 12, "id": "98bbc844-b02a-4269-ad13-f8361157be52", "metadata": { "tags": [] }, "outputs": [], "source": [ "!python demo_memory_profiling_logging.py" ] }, { "cell_type": "markdown", "id": "c8a2b85d-9da8-4246-927b-00cf4e20f296", "metadata": {}, "source": [ "After executing the above script, we can see that the memory profiling output is saved in the `memory_profiler.log` file. You can also log profiling output to different log files for different functions by passing a different file object to the `stream` argument of each `@profile` decorator." ] } ], "metadata": { "kernelspec": { "display_name": "vanilla", "language": "python", "name": "vanilla" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 5 }