Build Your Agents from Scratch
Design your own agents without any framework
Photo by Arseny Togulev on Unsplash
In the recent months, we’ve have all heard about Agents and Multi-Agent frameworks. These AI agents have become the unsung heroes of automation and decision-making.
While pre-built frameworks like AutoGen and CrewAI offer tempting shortcuts, (and rightly so!) there’s an unparalleled thrill and depth of understanding that comes from building your own agent from the ground up.
It’s like choosing between instant ramen and crafting a gourmet meal — sure, the former is quick, but the latter? That’s where the real magic happens.
Today, we’re going to roll up our sleeves and dive into the nitty-gritty of creating AgentPro, our very own AI assistant. By the end of this article, you’ll have a foundational understanding of how AI agents tick, and you’ll be well on your way to creating a digital companion that can generate and execute code on demand.
It’s like teaching a robot to fish, except instead of fish, it’s pulling Python scripts out of the ether!
Caution: this code might not work in all cases but it should help you get started + indentation errors migh occur in code
Here’s the Colab Notebook
The Building Blocks: A Roadmap to AgentPro
Before we dive into the code, let’s outline the key components we’ll be constructing:
The 5 Stages of developing an Agent from Scratch (image by author)Initialization: Setting up our agent’s “brain”Code Generation: Teaching our agent to write Python scriptsLibrary Management: Enabling our agent to install necessary toolsCode Execution: Empowering our agent to run the code it generatesCommand Center: Creating a central hub to manage all these functions
Now, let’s break down each of these steps and see how they come together to form our AI assistant.
Step 1: Initialization — Giving Our Agent Its First Spark of Life
Every great journey begins with a single step, and in the world of AI agents, that step is initialization. This is where we set up the basic structure of our agent and connect it to its primary source of intelligence — in this case, the OpenAI API.
from openai import OpenAI
import os
from google.colab import userdata
import base64
import requests
from PIL import Image
from io import BytesIO
import subprocess
import tempfile
import re
import importlib
import sys
os.environ[“OPENAI_API_KEY”] = userdata.get(‘OPENAI_API_KEY’)
class AgentPro:
def __init__(self):
# Future initialization code can go here
pass
This snippet is the digital equivalent of giving life to our AI assistant. We’re importing necessary libraries, setting up our OpenAI API key, and creating the skeleton of our AgentPro class. It’s like providing a body for our AI — not very useful on its own, but essential for everything that follows.
Step 2: Code Generation — Teaching Our Agent to Write Python
Now that our agent has a “body,” let’s give it the ability to think — or in this case, to generate code. This is where things start to get exciting!
def generate_code(self, prompt):
client = OpenAI()
response = client.chat.completions.create(
model=”gpt-4o”,
messages=[
{“role”: “system”, “content”: “You are a Python code generator. Respond only with executable Python code, no explanations or comments except for required pip installations at the top.”},
{“role”: “user”, “content”: f”Generate Python code to {prompt}. If you need to use any external libraries, include a comment at the top of the code listing the required pip installations.”}
],
max_tokens=4000,
temperature=0.7,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
code = re.sub(r’^“`pythonn|^“`n|“`$’, ”, response.choices[0].message.content, flags=re.MULTILINE)
code_lines = code.split(‘n’)
while code_lines and not (code_lines[0].startswith(‘import’) or code_lines[0].startswith(‘from’) or code_lines[0].startswith(‘#’)):
code_lines.pop(0)
return ‘n’.join(code_lines)
This method is the crown jewel of our agent’s capabilities. It’s using the OpenAI API to generate Python code based on a given prompt.
Think of it as giving our agent the ability to brainstorm and write code on the fly. We’re also doing some cleanup to ensure we get clean, executable Python code without any markdown formatting or unnecessary comments.
The parameters we’re using (like temperature and top_p) allow us to control the creativity and randomness of the generated code. It’s like adjusting the “inspiration” knob on our AI’s imagination!
Step 3: Library Management — Equipping Our Agent with the Right Tools
Every good coder knows the importance of having the right libraries at their disposal. Our AI assistant is no different. This next method allows AgentPro to identify and install any necessary Python libraries
def install_libraries(self, code):
libraries = re.findall(r’#s*pip installs+([w-]+)’, code)
if libraries:
print(“Installing required libraries…”)
for lib in libraries:
try:
importlib.import_module(lib.replace(‘-‘, ‘_’))
print(f”{lib} is already installed.”)
except ImportError:
print(f”Installing {lib}…”)
subprocess.check_call([sys.executable, “-m”, “pip”, “install”, lib])
print(“Libraries installed successfully.”)
This method is like sending our agent on a shopping spree in the Python Package Index. It scans the generated code for any pip install comments, checks if the libraries are already installed, and if not, installs them. It’s ensuring our agent always has the right tools for the job, no matter what task we throw at it.
Step 4: Code Execution — Bringing the Code to Life
Generating code is great, but executing it is where the rubber meets the road. This next method allows our agent to run the code it has generated:
def execute_code(self, code):
with tempfile.NamedTemporaryFile(mode=’w’, suffix=’.py’, delete=False) as temp_file:
temp_file.write(code)
temp_file_path = temp_file.name
try:
result = subprocess.run([‘python’, temp_file_path], capture_output=True, text=True, timeout=30)
output = result.stdout
error = result.stderr
except subprocess.TimeoutExpired:
output = “”
error = “Execution timed out after 30 seconds.”
finally:
os.unlink(temp_file_path)
return output, error
This method is where the magic really happens. It takes the generated code, writes it to a temporary file, executes it, captures the output (or any errors), and then cleans up after itself. It’s like giving our agent hands to type out the code and run it, all in the blink of an eye.
Step 5: Command Center — Putting It All Together
Finally, we need a way to orchestrate all these amazing capabilities. Enter the run method:
def run(self, prompt):
print(f”Generating code for: {prompt}”)
code = self.generate_code(prompt)
print(“Generated code:”)
print(code)
print(“nExecuting code…”)
output, error = self.execute_code(code)
if output:
print(“Output:”)
print(output)
if error:
print(“Error:”)
print(error)
This is the command center of our AI assistant. It takes a prompt, generates the code, executes it, and reports back with the results or any errors. It’s like having a personal assistant who not only understands your requests but carries them out and gives you a full report.
Putting It All Together:
Now that we have all our components, let’s see how we can use our newly minted AI assistant:
if __name__ == “__main__”:
agent = AgentPro()
agent.run(“””make a detailed deck on the best forms of leadership with at
least 10 slides and save it to a pptx called leadership.pptx”””)
With this simple command, we’re asking our agent to create a full presentation on leadership styles, complete with at least 10 slides, and save it as a PowerPoint file.
Our agent will generate the necessary Python code (likely using a library like python-pptx), install any required libraries, execute the code to create the presentation, and then report back with the results or any errors encountered.
We’ve just built the foundation of a powerful AI agent capable of generating and executing Python code on demand. From setting up its “brain” with the OpenAI API, to giving it the power to write and run code, to equipping it with the ability to install necessary tools, we’ve created a versatile digital assistant.
This is just the beginning of what’s possible with custom AI agents. In future installments, we’ll explore how to enhance AgentPro with web searching capabilities, image generation, and even more complex decision-making processes.
Remember, with great power comes great responsibility. Your new AI assistant is a powerful tool, but it’s up to you to guide it wisely. Use it to automate tedious tasks, explore new ideas, and push the boundaries of what’s possible with AI.
Just maybe don’t ask it to write your wedding vows or decide on your next career move — some things are still best left to human intuition!
Stay tuned for Part B, where we’ll teach our agent some new tricks and start to unlock its true potential. Until then, happy coding, and may your AI adventures be bug-free and endlessly exciting!
Follow for Part B!
If you are interested in learning more about this content, please subscribe. You can also connect with me on LinkedIn
About me
Hi! I am Hamza, and I’m thrilled to be your guide on this exciting journey into the world of AI agents. With a background as a Senior Research Scientist at Google and teaching experience at prestigious institutions like Stanford and UCLA, I’ve been at the forefront of AI development and education for years. My passion lies in demystifying complex AI concepts and empowering the next generation of AI practitioners.
Speaking of which, if you’ve enjoyed this deep dive into building AI agents from scratch, you might be interested in taking your LLM knowledge to the next level. I’ve recently developed a comprehensive course titled Enterprise RAG and Multi-Agent Applications on the MAVEN platform. This course is tailored for practitioners who want to push the boundaries of what’s possible with Large Language Models, especially in enterprise settings.
In Enterprise RAG and Multi-Agent Applications we explore cutting-edge techniques that go beyond the basics. From advanced Retrieval-Augmented Generation (RAG) solutions to the latest methods in model optimization and responsible AI practices, this course is designed to equip you with the skills needed to tackle real-world AI challenges.
Whether you’re looking to implement state-of-the-art LLM applications or dive deep into the intricacies of model fine-tuning and ethical AI deployment, this course has got you covered.
Build Your Agents from Scratch was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.