#AI #Prompt #LLM #textgrad
In this post, I present my experiments with TextGrad, a framework developed at Stanford University, and show how it handles two well-known reasoning problems that often trip up LLMs.
## What is TextGrad
TextGrad is an autograd engine for textual gradients. It implements backpropagation through feedback provided by Large Language Models (LLMs) and follows the gradient metaphor closely. From my perspective, TextGrad closely mirrors self-reflection, because both paradigms depend on feedback from LLMs: each uses iterative responses from a language model to refine its output and so improve the quality and accuracy of the generated content.
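To make the gradient metaphor concrete, here is a minimal sketch of the feedback loop in plain Python. It does not use TextGrad or call any LLM: `refine`, `toy_critic`, and `toy_editor` are hypothetical names, and the critic and editor are hard-coded stand-ins for what would normally be model calls.

```python
from typing import Callable, List


def refine(
    text: str,
    critic: Callable[[str], str],
    editor: Callable[[str, str], str],
    steps: int = 3,
) -> List[str]:
    """Repeatedly ask for feedback on the latest text and apply it."""
    history = [text]
    for _ in range(steps):
        feedback = critic(history[-1])                  # plays the role of the "gradient"
        history.append(editor(history[-1], feedback))   # plays the role of the update step
    return history


# Hard-coded stand-ins for LLM calls (assumptions for illustration only).
def toy_critic(text: str) -> str:
    return "The sum is wrong; it should be 4." if text.endswith("= 5") else "No changes needed."


def toy_editor(text: str, feedback: str) -> str:
    return text.replace("= 5", "= 4") if feedback.startswith("The sum is wrong") else text


history = refine("2 + 2 = 5", toy_critic, toy_editor)
print(history[-1])
```

The structure is the point here: the feedback is free-form text rather than a number, and the update is an edit to the text rather than a numeric step.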
## Alice in Wonderland Problem
Since [Nezhurina et al., 2024](https://arxiv.org/abs/2406.02061) demonstrated that a straightforward task can break down the reasoning of Large Language Models (LLMs), I have repeatedly used this problem to evaluate how robustly LLMs reason. The problem statement is as follows:
>[!info] Problem
>Alice has 3 sisters and she also has 4 brothers. How many sisters does Alice’s brother have?
Let's first ask `gpt-4o` to answer this question (this requires an OpenAI API key in the `OPENAI_API_KEY` environment variable).
```Python
import textgrad as tg

# Step 1: Get an initial answer from the model.
# The backward engine is the LLM that writes the textual gradients (feedback).
tg.set_backward_engine("gpt-4o", override=True)
model = tg.BlackboxLLM("gpt-4o")
question_string = "Alice has 3 sisters and she also has 4 brothers. How many sisters does Alice’s brother have? "
question = tg.Variable(question_string, role_description="question to the LLM", requires_grad=False)
answer = model(question)
print(answer.value)
# Alice has 3 sisters and 4 brothers. Since Alice is one of the sisters, her brothers also have the same number of sisters. Therefore, each of Alice's brothers has 3 sisters.
```
`gpt-4o` provides the answer ==Alice has 3 sisters and 4 brothers. Since Alice is one of the sisters, her brothers also have the same number of sisters. Therefore, each of Alice's brothers has 3 sisters.== However, this response is incorrect: although it mentions that Alice is one of the sisters, it never adds her to the count, so it reports 3 sisters instead of 4.
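The expected answer can be checked with a few lines of arithmetic. This is only a sketch of the reasoning; the function name `sisters_of_a_brother` is made up for illustration.

```python
def sisters_of_a_brother(alice_sisters: int, alice_brothers: int) -> int:
    """Count the sisters of any one of Alice's brothers."""
    if alice_brothers < 1:
        raise ValueError("Alice needs at least one brother for the question to make sense")
    # Every girl in the family is a sister of each brother:
    # Alice's sisters plus Alice herself.
    girls_in_family = alice_sisters + 1
    # The number of brothers does not affect the count.
    return girls_in_family


print(sisters_of_a_brother(alice_sisters=3, alice_brothers=4))  # 4
```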
So it's time to bring in TextGrad. If you are familiar with PyTorch, the workflow will look very natural!
```Python
# Step 2: Define the loss function and the optimizer, just like in PyTorch!
# Here, we don't have SGD, but we have TGD (Textual Gradient Descent)
# that works with "textual gradients".
answer.set_role_description("concise and accurate answer to the question")
optimizer = tg.TGD(engine="gpt-4o", parameters=[answer])
evaluation_instruction = (
    f"Here's a question: {question_string}. "
    "Evaluate any given answer to this question, "
    "be smart, logical, and very critical, "
    "Just provide concise feedback."
)
# TextLoss is a natural-language specified loss function that describes
# how we want to evaluate the reasoning.
loss_fn = tg.TextLoss(evaluation_instruction)
```
```Python
# Step 3: Do the loss computation, backward pass, and update the answer.
# Exact same syntax as PyTorch!
# Run 10 iterations and print the answer after each one to see
# at which iteration GPT-4o starts giving the correct answer.
for _ in range(10):
    loss = loss_fn(answer)
    loss.backward()
    optimizer.step()
    print(answer.value)
```
Now, let's see how the answer changes.
```Python
"""
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Alice has 3 sisters and 4 brothers. Including Alice, there are 4 sisters in total. Therefore, each of Alice's brothers has 4 sisters.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Since Alice is one of the 4 sisters, each of Alice's brothers has 4 sisters.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Since Alice has 3 sisters, including Alice, each of her brothers has 4 sisters.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Since Alice has 3 sisters, each of her brothers has 3 sisters.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Since Alice has 3 sisters, each of her brothers has 4 sisters, including Alice.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Since Alice has 3 sisters, including Alice herself, each of her brothers has 4 sisters.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Alice has 3 sisters, so each of her brothers has 4 sisters, including Alice.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Alice has 3 sisters, so each of her brothers has 4 sisters, including Alice herself.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Alice has 3 sisters, so each of her brothers has 4 sisters, including Alice.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Alice has 3 sisters. Including Alice herself, each of her brothers has 4 sisters.
"""
```
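The `INFO:textgrad:` lines interleaved with the answers are log records rather than model output. If they get in the way, something like the following may quiet them. This is a sketch that assumes the records come from a standard-library logger named `textgrad`, as the prefix suggests, and that it is run after `import textgrad`.

```python
import logging

# Assumption: the "INFO:textgrad:" prefix means the records are emitted by
# the standard-library logger named "textgrad".
textgrad_logger = logging.getLogger("textgrad")

# Only let warnings and errors through; INFO records are dropped.
textgrad_logger.setLevel(logging.WARNING)
```

Note that this may also reduce what the library writes to its own log files, so it is best used only when reading answers from the console.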
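In fact, the model already answers correctly after the first iteration: it counts Alice herself as one of her brothers' sisters. The answer is not perfectly stable, though. The fourth iteration briefly regresses to 3 sisters before later iterations return to 4. To confirm the final result, you can print the answer again.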
```Python
print(answer.value)
# Alice has 3 sisters. Including Alice herself, each of her brothers has 4 sisters.
```
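Reading through the log by eye works for 10 iterations, but the check can also be scripted. Below is a minimal sketch that assumes each printed answer was collected into a list. The helper names and the regular expression are my own, and the pattern is tailored to the phrasing seen in this particular run.

```python
import re
from typing import List, Optional

# Matches phrases such as "each of her brothers has 4 sisters".
PATTERN = re.compile(r"brothers has (\d+) sisters")


def stated_count(answer: str) -> Optional[int]:
    """Extract the number of sisters the answer attributes to the brothers."""
    match = PATTERN.search(answer)
    return int(match.group(1)) if match else None


def first_correct_iteration(answers: List[str], expected: int) -> Optional[int]:
    """Return the 1-based iteration whose answer first states the expected count."""
    for i, answer in enumerate(answers, start=1):
        if stated_count(answer) == expected:
            return i
    return None


# The first four answers from the log above.
answers = [
    "Alice has 3 sisters and 4 brothers. Including Alice, there are 4 sisters in total. Therefore, each of Alice's brothers has 4 sisters.",
    "Since Alice is one of the 4 sisters, each of Alice's brothers has 4 sisters.",
    "Since Alice has 3 sisters, including Alice, each of her brothers has 4 sisters.",
    "Since Alice has 3 sisters, each of her brothers has 3 sisters.",
]

print(first_correct_iteration(answers, expected=4))
wrong = [i for i, a in enumerate(answers, start=1) if stated_count(a) != 4]
print(wrong)
```

A check like this only inspects the final number, not the reasoning behind it, so it complements reading the answers rather than replacing it.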
## Shirts Dry Time Calculation
This challenge is particularly tricky and originates from a post on [Reddit](https://www.reddit.com/r/OpenAI/comments/18q479x/comment/kf444es/).
>[!info] Problem
>If it takes 1 hour to dry 25 shirts under the sun, how long will it take to dry 30 shirts under the sun? Reason step by step
The official TextGrad GitHub repository uses `gpt-4o` to solve this question as a demo, but here I will use `gpt-3.5-turbo` to see whether it can reach the right answer.
The steps are very similar to the previous question. First, use `gpt-3.5-turbo` to answer this question.
```Python
import textgrad as tg

# Step 1: Get an initial answer from the model.
# Assumption: the same model also writes the textual gradients (feedback).
tg.set_backward_engine("gpt-3.5-turbo", override=True)
model = tg.BlackboxLLM("gpt-3.5-turbo")
question_string = "If it takes 1 hour to dry 25 shirts under the sun, how long will it take to dry 30 shirts under the sun? Reason step by step "
question = tg.Variable(question_string, role_description="question to the LLM", requires_grad=False)
answer = model(question)
print(answer.value)
#To solve this problem, we can set up a proportion based on the given information: 1 hour is to 25 shirts as x hours is to 30 shirts. 1 hour / 25 shirts = x hours / 30 shirts Now, we can cross multiply to solve for x: 25 * x = 1 * 30 25x = 30 Now, divide both sides by 25 to solve for x: x = 30 / 25 x = 1.2 hours Therefore, it will take 1.2 hours to dry 30 shirts under the sun.
```
Obviously, this is not the correct solution, because the shirts can all dry under the sun at the same time! So it is time for TextGrad. Since `gpt-3.5-turbo` is generally considered weaker than both `gpt-4o` and `gpt-4-turbo`, I ran 20 iterations instead of 10 and printed the answer after each one to observe how the response evolves.
```Python
# Step 2: Define the loss function and the optimizer, just like in PyTorch!
# Here, we don't have SGD, but we have TGD (Textual Gradient Descent)
# that works with "textual gradients".
answer.set_role_description("concise and accurate answer to the question")
optimizer = tg.TGD(engine="gpt-3.5-turbo", parameters=[answer])
evaluation_instruction = (
    f"Here's a question: {question_string}. "
    "Evaluate any given answer to this question, "
    "be smart, logical, and very critical, "
    "Just provide concise feedback."
)
# TextLoss is a natural-language specified loss function that describes
# how we want to evaluate the reasoning.
loss_fn = tg.TextLoss(evaluation_instruction)
# Step 3: Do the loss computation, backward pass, and update the answer.
# Exact same syntax as PyTorch!
for _ in range(20):
    loss = loss_fn(answer)
    loss.backward()
    optimizer.step()
    print(answer.value)
```
Now, let's see how the answer changes.
```Python
"""
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
To solve this problem, we can set up a proportion assuming that drying time scales linearly with the number of shirts. We use a proportion because we assume that the drying time per shirt remains constant.
1 hour is to 25 shirts as x hours is to 30 shirts.
1 hour / 25 shirts = x hours / 30 shirts
Cross-multiplying and solving for x:
25x = 30
x = 30 / 25
x = 1.2 hours
Therefore, under these assumptions, it will take 1.2 hours to dry 30 shirts under the sun.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
To find out how long it will take to dry 30 shirts under the sun, we can set up a proportion: 1 hour is to 25 shirts as x hours is to 30 shirts. By solving the proportion, we get x = 1.2 hours. Therefore, under the assumption that drying time scales linearly with the number of shirts, it will take 1.2 hours to dry 30 shirts under the sun.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
To find out how long it will take to dry 30 shirts under the sun, we can set up a proportion: 1 hour is to 25 shirts as x hours is to 30 shirts. By solving the proportion, we get x = 1.2 hours. Therefore, under the assumption that drying time scales linearly with the number of shirts, it will take 1.2 hours to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
To find out how long it will take to dry 30 shirts under the sun, we can set up a proportion: 1 hour is to 25 shirts as x hours is to 30 shirts. By solving the proportion, we get x = 1.2 hours. Therefore, under the assumption that drying time scales linearly with the number of shirts, it will take 1.2 hours to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
To find out how long it will take to dry 30 shirts under the sun, we can set up a proportion: 1 hour is to 25 shirts as x hours is to 30 shirts. By solving the proportion step by step, we get x = 1.2 hours. Therefore, under the assumption that drying time scales linearly with the number of shirts, it will take 1.2 hours to dry 30 shirts under the sun.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
To find out how long it will take to dry 30 shirts under the sun, we can assume that drying time scales linearly with the number of shirts. Given this assumption, it would take approximately 1 hour to dry 30 shirts under the sun, provided the shirts are spread out evenly and receive equal sunlight and airflow.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun, provided they are spread out evenly and receive equal sunlight and airflow.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun, provided they are spread out evenly and exposed to equal sunlight and airflow.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun, provided the shirts are spread out evenly. This assumption simplifies the scenario for calculation purposes, but in reality, factors like sunlight exposure and airflow play a more significant role in drying time.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun, provided they are spread out evenly. However, in reality, factors such as sunlight exposure, airflow, and humidity play a more significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. Factors such as sunlight exposure, airflow, and humidity play a significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. Factors such as sunlight exposure, airflow, and humidity play a significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. Factors such as sunlight exposure, airflow, and humidity play a significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. However, in reality, factors such as sunlight exposure, airflow, and humidity play a more significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. However, in reality, factors such as sunlight exposure, airflow, and humidity play a more significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. However, in reality, factors such as sunlight exposure, airflow, and humidity play a more significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. Factors such as sunlight exposure, airflow, and humidity play a significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
Assuming that drying time scales linearly with the number of shirts, it would take approximately 1 hour to dry 30 shirts under the sun. However, in reality, factors such as sunlight exposure, airflow, and humidity play a more significant role in drying time. If 25 shirts take 1 hour to dry, adding 5 more shirts to make it 30 should not significantly change the drying time under the same conditions. Therefore, it would still take approximately 1 hour to dry 30 shirts under the sun.
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
It would still take approximately 1 hour to dry 30 shirts under the sun, assuming the same drying conditions as for 25 shirts. Drying time is more dependent on factors like sunlight exposure, airflow, and shirt distribution rather than the number of shirts.
INFO:textgrad:LLMCall function forward
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:TextualGradientDescent optimizer response
INFO:textgrad:TextualGradientDescent updated text
It would still take approximately 1 hour to dry 30 shirts under the sun, as drying time is more dependent on factors like sunlight exposure, airflow, and shirt distribution rather than the number of shirts.
"""
```
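In fact, from the sixth iteration onward the model gives the correct answer of about 1 hour, although its wording keeps mentioning a linear-scaling assumption for a while. By the final iterations it states that drying time depends on conditions such as sunlight exposure and airflow rather than on the number of shirts, so the time does not change when the number of shirts increases. You can now print the answer again to check.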
```Python
print(answer.value)
# It would still take approximately 1 hour to dry 30 shirts under the sun, as drying time is more dependent on factors like sunlight exposure, airflow, and shirt distribution rather than the number of shirts.
```
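The gap between the first answer and the final one comes down to which model of drying is used. The sketch below contrasts the two, assuming there is enough room to hang all the shirts in the sun at once. Both function names are hypothetical.

```python
def proportional_hours(shirts: int, ref_shirts: int = 25, ref_hours: float = 1.0) -> float:
    """The mistaken model: treat drying like a sequential task that scales with count."""
    return ref_hours * shirts / ref_shirts


def parallel_hours(shirts: int, ref_hours: float = 1.0) -> float:
    """The intended model: all shirts dry at the same time, so the count is irrelevant.

    Assumption: every shirt gets the same sun and airflow as in the reference case.
    """
    return ref_hours


print(proportional_hours(30))  # the initial answer from gpt-3.5-turbo: 1.2 hours
print(parallel_hours(30))      # the answer after optimization: 1.0 hour
```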
## Conclusion
In this blog post, I demonstrated how to use TextGrad to address challenging questions, even with a less advanced model like `gpt-3.5-turbo`. If you have a solid understanding of PyTorch, I believe you will find TextGrad quite easy to use.