Autonomous Workflows: Chapter 1
Basics: Function Calling with OpenAI Models
I need to analyze 200 PDFs, extract specific data, and generate reports.
Can I orchestrate 10 AIs to work together step by step to solve a task?
Could an AI help manage my email inbox, draft responses, analyze stocks, and even buy them with my confirmation?
What if AI could handle my repetitive computer tasks automatically?
These aren't hypothetical scenarios anymore. While most people use LLMs like ChatGPT for text generation, a smaller group of developers has discovered something more powerful: how to make AI systems that actually do things. This series shows you how to build LLM-powered applications that can automate simple and complex workflows.
No PhD required—just bring your curiosity and basic Python knowledge. This series isn’t about training or fine-tuning generative AI models; it’s about building applications and autonomous workflows on top of them.
Prerequisites
You don't need to be an expert to follow this series. Here's a quick overview of the key concepts we'll use. If you are already familiar with these, please feel free to skip to Intro.
IDEs (Integrated Development Environments)
A platform where you can write, edit, and debug your code, all within a single interface.
Virtual environment
Isolated workspaces for Python projects. These prevent conflicts between different projects' dependencies. Think of them as separate containers for each project, where you can install your Python libraries.
Git
A version control system that tracks your code changes. It helps manage project history and makes collaboration easy.
API
An API (Application Programming Interface) is a way for two systems to communicate. You send data to it (a request), and it sends data back (a response).
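For example, here's a minimal request/response round trip in Python using the `requests` library. The URL is a placeholder for illustration, not a real service:

```python
import requests

# Send a GET request to a (hypothetical) API endpoint...
response = requests.get("https://api.example.com/stocks/MSFT")

# ...and read the response back, here assumed to be JSON.
data = response.json()
print(data)
```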
Environment variables
A way to store sensitive information, like API keys, outside your code. They keep your credentials separate from your source files (and out of version control).
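In Python, reading one looks like this:

```python
import os

# Returns the value of the variable, or None if it isn't set.
api_key = os.getenv("OPENAI_API_KEY")
```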
JSON
JSON (JavaScript Object Notation) is a lightweight, human-readable data format commonly used for exchanging data between systems, especially in APIs. It looks like a Python dictionary with key-value pairs. For example:
{"name": "Sina", "age": 33}Machine learning
Machine learning
Machine learning is a way to program computers to learn patterns from data. Instead of manually coding rules (e.g., if-else statements), you provide examples of input and output data, and the computer figures out how to map inputs to outputs.
Deep learning
Deep learning is a subset of machine learning that uses neural networks—algorithms inspired by how the human brain works. These networks are great for tasks like image recognition, natural language processing, and generative AI.
Generative AI
Generative AI is a subset of AI focused on creating new content—like text, images, or music—rather than just analyzing existing data. Examples include ChatGPT for text, Midjourney for images, and Sora for videos.
Large language models (LLMs)
LLMs are massive deep learning models trained on huge amounts of text data. They can understand and generate human-like text, making them incredibly versatile for tasks like answering questions, summarizing text, or writing code.
Autonomous agents and workflows
This is what we’re here to explore! Autonomous agents and workflows use LLMs to perform tasks on their own. They can call functions, generate structured outputs, and connect the dots between tasks—essentially automating processes without needing constant human input.
Intro
You’ve probably used ChatGPT or other large language model (LLM) applications that let you ask a question and get an answer instantly through a graphical user interface. But have you ever wondered how these LLMs handle tasks that require more than just their training—like fetching real-time stock prices or performing accurate calculations?
For example, when you ask a question like “What is the price of MSFT stock today?” the LLM does not have the answer directly in its training data. Instead, it needs access to a tool—a function that lets it either search the web or call a financial API. Without such access, the LLM either tells you it doesn’t know or, worse, it might make up an answer.
You may also recall that in the early days of ChatGPT, it couldn’t do simple math reliably. But now, if you ask ChatGPT to multiply 1,248,124 by 21,421,124, it can write and execute Python code to calculate the correct result. This ability comes from giving the model access to external tools, like a Python sandbox, allowing it to overcome its built-in limitations.
In this lesson, we’re going to take a similar approach. You’ll learn how to use OpenAI’s API to ask the model to multiply two integers. First, we’ll see how it fails without any external help. Then, we’ll give it access to a Python function that can handle multiplication. The LLM will be able to decide, based on your question, whether it needs to call that function to give you the right answer.
Here’s what you’ll need to get started:
Choose an IDE of your choice. I'll be using VS Code, but feel free to use PyCharm or another IDE. You can also use interactive environments such as Jupyter Notebook or Google Colab if you find it easier to understand the code by breaking it down. However, I recommend practicing scripting in a `.py` file, as this is essential for building real applications. Jupyter notebooks are great for experimenting but aren't ideal for developing production-ready apps.

Go to OpenAI's developer platform and generate an API key if you don't have one. This is like your password for communicating with OpenAI's servers. Do not share it with anyone.
If you don’t want to spend any money on this, stay tuned for Chapter 2, where I’ll show you how to implement the same concept for free using Meta’s Llama models, which can run locally on your laptop.
Next, you’ll need to clone this project’s repository from GitHub. You can find the code here: Repository Link. If you find it helpful, feel free to give it a star—your support means a lot! You can always download the code directly instead of cloning it if the next steps feel like too much of a headache.
Make sure you have git installed on your system. If you have never used git before, chances are you do not have it. Open a terminal and test it with:
```bash
git --version
```
If you do not have it, please search for how to install it on your system, depending on whether you are using Windows, Mac, or Linux.
After installing it, create a folder for your project. Then, open a terminal (either in VS Code or directly on your system) and enter the following command. This will copy my code, along with its history, into your local folder.
```bash
git clone https://github.com/sinamhd/autonomous.git
```
Finally, after cloning the repository, there are a few steps you'll need to follow to set up your environment and get started. These are outlined in detail in the README file of Chapter 1, but here's a quick overview:
After cloning, move into the project directory:
```bash
cd autonomous
```
Create a new file and name it `.env`. In the terminal, you can do:
```bash
nano .env
```
Write this into the file:
```
OPENAI_API_KEY = YOUR_API_KEY
```
Replace `YOUR_API_KEY` with the API key you got from OpenAI.

Create and activate a virtual environment to keep dependencies isolated. The commands will differ slightly depending on your operating system:
macOS/Linux:
```bash
python -m venv myenv
source myenv/bin/activate
```
Windows:
```bash
python -m venv myenv
myenv\Scripts\activate
```
In the terminal, enter the following command. This will install the required Python libraries needed to run the code.
```bash
pip install -r requirements.txt
```
Now you have the code in your system. Let’s walk through the code, line by line, before running it.
Hands-On
openai_basic_call.py
You can run the code with this command in your terminal:
```bash
python3 openai_basic_call.py
```
Before you do, let's dive into the code and see how an LLM handles a task without external tools. Our example script, `openai_basic_call.py`, demonstrates a basic use of OpenAI's API to ask the model to perform a multiplication. Let's break it down step by step.
1. Imports and Environment Setup
```python
import os

from dotenv import find_dotenv, load_dotenv
from openai import OpenAI
```
Here, we're importing the necessary modules:
- `dotenv` to load environment variables (like the API key) from a `.env` file.
- `OpenAI` to interact with OpenAI's API.
This ensures our API key is stored securely in a `.env` file rather than hardcoded in the script.
2. Loading the API Key
```python
load_dotenv(find_dotenv())
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("OPENAI_API_KEY not found in the environment variables. Make sure to set it in your .env file.")
```
- `load_dotenv()` loads the variables from your `.env` file into the environment.
- `os.getenv()` retrieves the `OPENAI_API_KEY` variable. If the key isn't found, the script raises an error, ensuring you don't proceed without it.
3. Initializing the OpenAI Client
```python
client = OpenAI(api_key=api_key)
```
This initializes the OpenAI client using the provided API key. Without this step, you can't communicate with OpenAI's servers.
4. Setting Up the Model and User Input
model = "gpt-4o-mini"
question = "What is the multiplication of 1248124 * 21421124?"We specify the model to use (
gpt-4o-mini, a lightweight fast and cheap version ofGPT-4o).The question asks the LLM to calculate a hard multiplication, which the LLM can’t do reliably without external help.
5. Conversation Context
```python
messages = [
    {"role": "system", "content": "Respond short with emojis."},
    {"role": "user", "content": question},
]
```
- The `system` role defines how the LLM should respond (in this case, short and emoji-based).
- The `user` role contains the actual question we're asking.
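As a side note, conversations can also carry an `assistant` role, which stores the model's earlier replies. A hypothetical multi-turn history might look like this:

```python
# Hypothetical multi-turn conversation: the model sees the full history,
# including its own earlier reply in the "assistant" message.
messages = [
    {"role": "system", "content": "Respond short with emojis."},
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "👋😊"},
    {"role": "user", "content": question},
]
```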
6. Calling the API
```python
response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=0,
)
```
Here, we send the messages to the OpenAI API and specify:
- `model`: The chosen GPT model.
- `temperature`: Controls randomness (set it to 0 for more predictable responses). You can increase it, up to 1, when you want the responses to be more creative (see the quick experiment below).
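To see the effect yourself, you can rerun the same request with a higher temperature and compare a few outputs. A quick experiment, reusing the `client`, `model`, and `messages` defined above:

```python
# Same request with temperature=1: rerunning this a few times should
# produce more varied wording than the deterministic temperature=0 call.
creative = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=1,
)
print(creative.choices[0].message.content)
```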
7. Displaying the Response
```python
response_message = response.choices[0].message.content
print(f"AI Response: {response_message}")
```
The response from the model is extracted and printed. Looks correct, right? It really isn't. Take a closer look. Try the multiplication in a calculator :)
openai_function_call.py
Let's see how we can give the model access to a Python function so it can handle the hard multiplication.
1. Imports and Environment Setup
```python
import json
import os

from dotenv import find_dotenv, load_dotenv
from openai import OpenAI
```
As before, we're importing the necessary modules:
- `json` is included because the function arguments and responses are structured as JSON objects.
2. Defining the Function
```python
def multiply(parameters):
    number1 = parameters["number1"]
    number2 = parameters["number2"]
    return number1 * number2
```
- This is a simple Python function that takes a dictionary (`parameters`) containing two integers, `number1` and `number2`, and returns their product.
- The function will later be registered as a "tool" for the LLM to call when required.
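You can sanity-check it like any ordinary Python function before handing it to the model:

```python
# The dictionary mirrors the JSON arguments the model will generate later.
print(multiply({"number1": 6, "number2": 7}))  # 42
```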
3. Loading the API Key
```python
load_dotenv(find_dotenv())
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("OPENAI_API_KEY not found in the environment variables. Make sure to set it in your .env file.")
```
This step is the same as in `openai_basic_call.py`. It loads your OpenAI API key from the `.env` file and ensures it's available to initialize the OpenAI client.
4. Initializing the OpenAI Client
```python
client = OpenAI(api_key=api_key)
```
Again, we initialize the OpenAI client, which acts as the bridge between your script and OpenAI's servers.
5. Defining the Tools (Functions) for the Model
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "multiply",
            "description": "Use this function to multiply two integers and return the result.",
            "parameters": {
                "type": "object",
                "properties": {
                    "number1": {"type": "integer", "description": "The first integer to multiply."},
                    "number2": {"type": "integer", "description": "The second integer to multiply."},
                },
                "required": ["number1", "number2"],
            },
        },
    }
]
```
Here, we define the metadata for the `multiply` function, registering it as a tool for the LLM. Don't worry if this looks complex. We will simplify this in the next chapters. The metadata includes:
- Name: `"multiply"`
- Description: Explains what the function does.
- Parameters: Specifies the expected input (`number1` and `number2` as integers).
6. Setting Up the Conversation
question = "What is the multiplication of 1248124 * 21421124?"
messages = [
{"role": "system", "content": "Respond short with emojis."},
{"role": "user", "content": question},
]The
questionremains the same as in the previous example.The
messagesinclude the system instructions and the user’s query.
7. Calling the API with Tools Enabled
```python
model = "gpt-4o-mini"

response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=0,
    tools=tools,
    tool_choice="auto",  # Let the model decide if a tool is required
)
```
Here's the key difference from `openai_basic_call.py`:
- The `tools` parameter provides the list of external functions the model can call.
- `tool_choice="auto"` lets the LLM decide if the `multiply` function is relevant for the user's query (other values are possible; see the note below).
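As a side note, `tool_choice` accepts other values as well; for instance, you can force a particular function instead of letting the model decide. A sketch reusing the objects defined above:

```python
# Force the model to call multiply, even for questions where it might
# otherwise answer directly.
forced = client.chat.completions.create(
    model=model,
    messages=messages,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "multiply"}},
)
```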
8. Extracting the Model's Initial Response
```python
response_message = response.choices[0].message
messages.append(response_message)
```
We receive the response and append it to the existing conversation. Add a `print` statement to your code to inspect it yourself; the response message looks like this:
```
ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_DJazmCjrJMCfoE2b1TqYrnoa', function=Function(arguments='{"number1":1248124,"number2":21421124}', name='multiply'), type='function')])
```
As you can see, the `content` field is `None`. However, there is a field named `tool_calls`, which means the model decided to use a tool instead of generating the final response.
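Concretely, the print statement is just:

```python
# Inspect the raw message object, including any tool calls it contains.
print(response_message)
```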
9. Handling the Function Call
```python
tool_calls = response_message.tool_calls
if tool_calls:
    # Extract details about the tool call
    tool_call_id = tool_calls[0].id
    tool_function_name = tool_calls[0].function.name
    tool_arguments = json.loads(tool_calls[0].function.arguments)

    if tool_function_name == "multiply":
        # Execute the multiply function
        result = multiply(tool_arguments)

        # Append the tool response to the message history
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tool_call_id,
                "name": tool_function_name,
                "content": str(result),
            }
        )

        # Get a new response from the model after the function result is provided
        model_response_with_function_call = client.chat.completions.create(
            model=model,
            messages=messages,
        )
        print(f"AI Response: {model_response_with_function_call.choices[0].message.content}")
    else:
        print(f"Error: function {tool_function_name} does not exist")
else:
    # If no tool was identified, print the initial response
    print(f"AI Response: {response_message.content}")
```
- The script checks if the model requested a tool (`tool_calls`).
- If the `multiply` function is requested, it extracts the arguments, calls the function, and appends the result to the conversation. This means we are running the actual Python function that we defined, based on the arguments generated by the model.
- After the function call, the script sends the updated conversation back to the model to generate a final response based on the tool's output. Please note that this involves calling the model again.
- We print the response content of `model_response_with_function_call`. Please note that if there were no tool calls, the code would go directly to the `else` block and print the content of the response.

Do you see something different in the result this time?
Key Takeaways
Function Calling Solves Real Gaps: By giving the model access to a Python function, we enable it to handle tasks that go beyond its inherent limitations, like precise mathematical operations.

Extending the Framework: The same concept can be applied to more complex tasks, such as calling APIs or interacting with databases, by registering additional tools for the model to use. What tools can you think of? (One hypothetical sketch follows below.)

How Did This Work? We simply told the LLM that it has access to a tool that can perform specific tasks and to use it when needed. When you ask a question related to that tool, the LLM generates the arguments to call it: the name of the function and its required inputs. We then take those arguments, run the actual Python function, append the result back to the messages, and call the LLM one more time to generate the final response.
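For instance, here is a hypothetical sketch of a second tool (not part of this chapter's repository): a `get_stock_price` function whose schema follows the same pattern as `multiply`, with the actual data source left up to you:

```python
# Hypothetical second tool: fetch a stock price. The body is a stub;
# in practice it would call a financial API of your choice.
def get_stock_price(parameters):
    symbol = parameters["symbol"]
    raise NotImplementedError(f"Call a financial API for {symbol} here.")

tools.append(
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get the latest price for a stock ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {
                        "type": "string",
                        "description": "Ticker symbol, e.g. MSFT.",
                    }
                },
                "required": ["symbol"],
            },
        },
    }
)
```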
Would you please share how long it took you to try this tutorial? It helps me adjust my future chapters.

