diff --git a/book/1-crystalstructure-agent-anthropic.ipynb b/book/1-crystalstructure-agent-anthropic.ipynb new file mode 100644 index 0000000..9673838 --- /dev/null +++ b/book/1-crystalstructure-agent-anthropic.ipynb @@ -0,0 +1,440 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "abebe249-91bd-44ad-bcec-79bf4de2e1a6", + "metadata": {}, + "source": [ + "# Crystalstructure Agent\n", + "As a first demonstration the langchain team decided to develop a LLM agent, which can predict the crystal structure of a chemical element, by accessing the reference database of the [Atomic Simulation Environment](https://wiki.fysik.dtu.dk/ase/). While the corresponding python function is simple and only requires a few lines of code, this example already limits the hallucinations of the LLM, by using the [langchain](https://www.langchain.com) framework to interface the LLM with the python function. \n", + "\n", + "In particular, we follow the [Custom Agent](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/) tutorial from the Langchain documentation. " + ] + }, + { + "cell_type": "markdown", + "id": "42b74827-5fe4-4200-8623-a650525cf5e7", + "metadata": {}, + "source": [ + "## Python Function\n", + "For this first example, we use [Anthropic](https://www.anthropic.com) as LLM provider but the example can also be adjusted to work with other LLM providers, for more details check the [langchain documentation](https://python.langchain.com/v0.2/docs/integrations/platforms/). 
We store the Anthropic API key in the `ANTHROPIC_API_KEY` variable: " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "6906a90b-4a3d-4aa5-b649-33e8caf829f3", + "metadata": {}, + "outputs": [], + "source": [ + "from getpass import getpass" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "34ee60c4-924e-4af5-b736-d474c7a1086c", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Enter your Anthropic Token: ········\n" + ] + } + ], + "source": [ + "ANTHROPIC_API_KEY = getpass(prompt='Enter your Anthropic Token:')" + ] + }, + { + "cell_type": "markdown", + "id": "136e7571-ac3b-4587-82b5-3351d6a6d3f2", + "metadata": {}, + "source": [ + "As a next step, we import the corresponding functionality from `ASE` and the `tool` decorator from `langchain`:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "bf956016-ef67-4ffa-93ae-06cb4e242145", + "metadata": {}, + "outputs": [], + "source": [ + "from ase.data import reference_states, atomic_numbers\n", + "from langchain.agents import tool" + ] + }, + { + "cell_type": "markdown", + "id": "3aae31dd-fb84-45f9-81b9-3f1f18786f4f", + "metadata": {}, + "source": [ + "For the python function, it is important to include `type` hints and documentation based on a Docstring for the LLM to understand the functionality of the function. Finally, all data types used as input or output of the function need to have a `JSON` representation so they can be communicated to the LLM. For example, numpy arrays have to be converted to standard python lists. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "a985d5ac-7b52-4f7c-bb16-24400ebaba1c", + "metadata": {}, + "outputs": [], + "source": [ + "@tool\n", + "def get_crystal_structure(chemical_symbol: str) -> str:\n", + " \"\"\"Returns the atomic crystal structure of a chemcial symbol\"\"\"\n", + " ref_state = reference_states[atomic_numbers[chemical_symbol]]\n", + " if ref_state is None:\n", + " return \"No crystal structure known.\"\n", + " else:\n", + " return ref_state[\"symmetry\"]" + ] + }, + { + "cell_type": "markdown", + "id": "93b51c33-91c8-42d0-9e89-e0012879f122", + "metadata": {}, + "source": [ + "After applying the `decorator`, the functions can be called using `invoke()`. We validate the functionality for two elements iron (Fe) and gold (Au). " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "6d939068-506e-4fbe-aac9-97ca482d2b6e", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'bcc'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_crystal_structure.invoke(\"Fe\")" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "07aa9ba5-6575-4022-b707-4c5433f86166", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'fcc'" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_crystal_structure.invoke(\"Au\")" + ] + }, + { + "cell_type": "markdown", + "id": "4c888942-335e-43eb-b088-47c155eef0b3", + "metadata": {}, + "source": [ + "## Define Agent\n", + "After the definition of the Python function, the next step is the definition of the agent which the LLM uses to interact with the Python function. In this example the `ChatAnthropic` interface of the `langchain_anthropic` package is used. 
Depending on your configuration, it might be necessary to install the `langchain_anthropic` package using the following command: \n", + "```\n", + "conda install -c conda-forge langchain-anthropic\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "6529e329-0947-443a-8325-42ef3510803d", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_anthropic import ChatAnthropic\n", + "\n", + "llm = ChatAnthropic(\n", + " model=\"claude-3-5-sonnet-20240620\",\n", + " temperature=0,\n", + " anthropic_api_key=ANTHROPIC_API_KEY,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "752d18e2-4645-4af9-a55c-1adafd08a0af", + "metadata": {}, + "source": [ + "Following the definition of the LLM, the next step is the definition of the prompt. Here we start with a very basic prompt. The optimization of the prompt is discussed in more detail in the section on prompt engineering." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "c2edad58-849b-424f-8460-92507e9545ac", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n", + "\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are very powerful assistant, but don't know current events.\",\n", + " ),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "367384b0-1fe6-43cf-a2b5-b3689e65b9fa", + "metadata": {}, + "source": [ + "Finally, the different parts are plugged together by creating an agent, which combines the prompt with the Python function (referenced here as one tool in a list of potentially many tools), and an executor to communicate with the agent. The technical details are discussed in the corresponding [langchain Custom Agent tutorial](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/)."
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "c1e01a3b-8b21-4538-b67d-6d5c40c66f94", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.agents import AgentExecutor\n", + "from langchain.agents.format_scratchpad.openai_tools import (\n", + " format_to_openai_tool_messages,\n", + ")\n", + "from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser\n", + "\n", + "tools = [get_crystal_structure]\n", + "agent = (\n", + " {\n", + " \"input\": lambda x: x[\"input\"],\n", + " \"agent_scratchpad\": lambda x: format_to_openai_tool_messages(\n", + " x[\"intermediate_steps\"]\n", + " ),\n", + " }\n", + " | prompt\n", + " | llm.bind_tools(tools)\n", + " | OpenAIToolsAgentOutputParser()\n", + ")\n", + "agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)" + ] + }, + { + "cell_type": "markdown", + "id": "b0d455bd-28bf-44c9-97e9-91f9cb3552cc", + "metadata": {}, + "source": [ + "## Conversation\n", + "Once the `AgentExecutor` is defined we can communicate with the LLM using the `stream()` interface. We repeat the test from above and ask for the crystal structure of gold." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "ac9edf53-8963-4b58-bcb6-742106424757", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_crystal_structure` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': 'To answer your question about the crystal structure of gold, I can use the available tool to retrieve this information. 
Let me do that for you.', 'type': 'text', 'index': 0}, {'id': 'toolu_01CjhjuywsNK48xWcJ7YK2ND', 'input': {}, 'name': 'get_crystal_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3mfcc\u001b[0m\u001b[32;1m\u001b[1;3m[{'text': \"\\n\\nBased on the result from the tool, I can tell you that the crystal structure of gold (Au) is face-centered cubic (fcc).\\n\\nFace-centered cubic (fcc) is one of the most common crystal structures found in metals. In this structure, atoms are arranged at the corners of a cubic lattice, with additional atoms at the center of each face of the cube. This arrangement gives gold its characteristic properties, such as its malleability and ductility.\\n\\nSome key features of the fcc structure in gold include:\\n\\n1. High atomic packing factor, which contributes to gold's high density.\\n2. Good ductility and malleability, allowing gold to be easily shaped into thin sheets or wires.\\n3. Excellent thermal and electrical conductivity.\\n4. Twelve nearest neighbors for each atom, contributing to its stability.\\n\\nThis crystal structure is one of the reasons why gold is so valued in jewelry making and various industrial applications, as it allows for easy manipulation while maintaining its desirable properties.\", 'type': 'text', 'index': 0}]\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = list(agent_executor.stream({\"input\": \"What is the crystal structure of gold\"})) # Yeah this worked !!" + ] + }, + { + "cell_type": "markdown", + "id": "43bc4d75-bae7-4fe2-8005-52e2629eec8d", + "metadata": {}, + "source": [ + "With the `verbose=True` parameter the internal steps of the LLM agent are printed in green. As a first step the agent calls the `get_crystal_structure()` already with the converted input parameter, rather than using gold as input it uses the chemical symbol `Au`. 
The function returns `fcc` and the LLM converts this answer into a sentence a human can understand: \n", + "```\n", + "The crystal structure of gold is face-centered cubic (fcc).\n", + "```\n", + "This example highlights how easy it is these days to make a Python function accessible via an LLM, so that all kinds of users can interact with it." + ] + }, + { + "cell_type": "markdown", + "id": "bc28370b-6002-4685-9884-44432099b5ea", + "metadata": {}, + "source": [ + "## Prompt Engineering\n", + "Some LLMs, like the ChatGPT 3.5 model, start to hallucinate rather quickly. For example, when asked for the crystal structure of a car, such a model does not ask which elements a car typically consists of, but rather associates car with carbon and replies that the crystal structure of a car is diamond, which is obviously wrong. Luckily, the Anthropic Claude model does not make this mistake. " + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "7ac99ce9-f93c-406f-8348-f16793b1059e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m[{'text': 'I apologize, but I need to clarify something before I can answer your question. The term \"car\" is not a chemical symbol or element. To determine a crystal structure, we need to specify a chemical element or compound.\\n\\nThe tool available to us can provide information about the crystal structure of chemical elements based on their chemical symbols. For example, we could look up the crystal structure of iron (Fe), carbon (C), or aluminum (Al), which are common elements used in car manufacturing.\\n\\nCould you please specify which chemical element you\\'re interested in learning about? 
Once you provide a valid chemical symbol, I can use the tool to retrieve its crystal structure information for you.', 'type': 'text', 'index': 0}]\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = list(agent_executor.stream({\"input\": \"What is the crystal structure of car\"})) # I did not know cars were made of carbon" + ] + }, + { + "cell_type": "markdown", + "id": "fe544511-f499-4bdf-b24a-67fed6dd7ce4", + "metadata": {}, + "source": [ + "To restrict the hallucination of the agent we extend the system prompt with the following statement:\n", + "```\n", + "For each query vailidate that it contains a chemical element and otherwise cancel. \n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "4ffbd77b-7fc4-47a4-8c7b-4f179d453c9b", + "metadata": {}, + "outputs": [], + "source": [ + "prompt_improved = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " # \"You are very powerful assistant, but don't know current events.\",\n", + " \"You are very powerful assistant, but don't know current events. 
For each query vailidate that it contains a chemical element and otherwise cancel.\",\n", + " ),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "d3b4b76d-a455-41cf-89f7-318061c95ac5", + "metadata": {}, + "outputs": [], + "source": [ + "agent_improved = (\n", + " {\n", + " \"input\": lambda x: x[\"input\"],\n", + " \"agent_scratchpad\": lambda x: format_to_openai_tool_messages(\n", + " x[\"intermediate_steps\"]\n", + " ),\n", + " }\n", + " | prompt_improved\n", + " | llm.bind_tools(tools)\n", + " | OpenAIToolsAgentOutputParser()\n", + ")\n", + "agent_improved_executor = AgentExecutor(agent=agent_improved, tools=tools, verbose=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "3033a731-5605-4103-9da8-63c56655a53f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m[{'text': 'I apologize, but \"car\" is not a chemical element. To provide information about crystal structures, I need a valid chemical symbol for an element. Could you please provide the chemical symbol for an element you\\'d like to know about? For example, you could ask about elements like carbon (C), iron (Fe), or gold (Au).', 'type': 'text', 'index': 0}]\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = list(agent_improved_executor.stream({\"input\": \"What is the crystal structure of car\"}))" + ] + }, + { + "cell_type": "markdown", + "id": "f23ddd66-9032-41e0-90fb-a574845ce426", + "metadata": {}, + "source": [ + "With the modified system prompt, the Claude model again correctly replies that it cannot determine the crystal structure of a car, because a car is not a chemical element. 
" + ] + }, + { + "cell_type": "markdown", + "id": "4a7f9b5f-cfe3-4b9c-9190-33d9505beedd", + "metadata": {}, + "source": [ + "## Summary\n", + "By following the [Custom Agent](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/) tutorial from the Langchain documentation, we were able to create a first simulation agent, which calls specialized python frameworks like [ASE](https://wiki.fysik.dtu.dk/ase/) to address material-science specific questions. Still, it is important to carefully engineer the prompt of the agent, otherwise even these LLMs with access to specialized agents tend to hallucinate." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f7124521-c58d-4700-bb51-7eb97be53588", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/book/1-crystalstructure-agent-groq.ipynb b/book/1-crystalstructure-agent-groq.ipynb new file mode 100644 index 0000000..b3138b7 --- /dev/null +++ b/book/1-crystalstructure-agent-groq.ipynb @@ -0,0 +1,448 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "abebe249-91bd-44ad-bcec-79bf4de2e1a6", + "metadata": {}, + "source": [ + "# Crystalstructure Agent\n", + "As a first demonstration the langchain team decided to develop a LLM agent, which can predict the crystal structure of a chemical element, by accessing the reference database of the [Atomic Simulation Environment](https://wiki.fysik.dtu.dk/ase/). 
While the corresponding python function is simple and only requires a few lines of code, this example already limits the hallucinations of the LLM, by using the [langchain](https://www.langchain.com) framework to interface the LLM with the python function. \n", + "\n", + "In particular, we follow the [Custom Agent](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/) tutorial from the Langchain documentation. " + ] + }, + { + "cell_type": "markdown", + "id": "42b74827-5fe4-4200-8623-a650525cf5e7", + "metadata": {}, + "source": [ + "## Python Function\n", + "For this first example, we use [Groq](https://groq.com) as LLM provider but the example can also be adjusted to work with other LLM providers, for more details check the [langchain documentation](https://python.langchain.com/v0.2/docs/integrations/platforms/). We store the Groq API key in the `GROQ_API_KEY` variable: " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "6906a90b-4a3d-4aa5-b649-33e8caf829f3", + "metadata": {}, + "outputs": [], + "source": [ + "from getpass import getpass" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "34ee60c4-924e-4af5-b736-d474c7a1086c", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Enter your Groq Token: ········\n" + ] + } + ], + "source": [ + "GROQ_API_KEY = getpass(prompt='Enter your Groq Token:')" + ] + }, + { + "cell_type": "markdown", + "id": "136e7571-ac3b-4587-82b5-3351d6a6d3f2", + "metadata": {}, + "source": [ + "As a next step, we import the corresponding functionality from `ASE` and the `tool` decorator from `langchain`:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "bf956016-ef67-4ffa-93ae-06cb4e242145", + "metadata": {}, + "outputs": [], + "source": [ + "from ase.data import reference_states, atomic_numbers\n", + "from langchain.agents import tool" + ] + }, + { + "cell_type": "markdown", + "id": "3aae31dd-fb84-45f9-81b9-3f1f18786f4f", 
+ "metadata": {}, + "source": [ + "For the Python function, it is important to include type hints and a docstring, so that the LLM can understand what the function does. In addition, all data types used as input or output of the function need to have a `JSON` representation so they can be communicated to the LLM. For example, numpy arrays have to be converted to standard Python lists. " + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "a985d5ac-7b52-4f7c-bb16-24400ebaba1c", + "metadata": {}, + "outputs": [], + "source": [ + "@tool\n", + "def get_crystal_structure(chemical_symbol: str) -> str:\n", + "    \"\"\"Returns the atomic crystal structure of a chemical symbol\"\"\"\n", + "    ref_state = reference_states[atomic_numbers[chemical_symbol]]\n", + "    if ref_state is None:\n", + "        return \"No crystal structure known.\"\n", + "    else:\n", + "        return ref_state[\"symmetry\"]" + ] + }, + { + "cell_type": "markdown", + "id": "93b51c33-91c8-42d0-9e89-e0012879f122", + "metadata": {}, + "source": [ + "After applying the `@tool` decorator, the function can be called using `invoke()`. We validate the functionality for two elements, iron (Fe) and gold (Au). 
" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "6d939068-506e-4fbe-aac9-97ca482d2b6e", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'bcc'" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_crystal_structure.invoke(\"Fe\")" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "07aa9ba5-6575-4022-b707-4c5433f86166", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'fcc'" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "get_crystal_structure.invoke(\"Au\")" + ] + }, + { + "cell_type": "markdown", + "id": "4c888942-335e-43eb-b088-47c155eef0b3", + "metadata": {}, + "source": [ + "## Define Agent\n", + "After the definition of the Python function, the next step is the definition of the agent which the LLM uses to interact with the Python function. In this example the `ChatGroq` interface of the `langchain_groq` package is used. Depending on your configuration, it might be necessary to install the `langchain_groq` package using the following command: \n", + "```\n", + "conda install -c conda-forge langchain-groq\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "6529e329-0947-443a-8325-42ef3510803d", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_groq import ChatGroq\n", + "\n", + "llm = ChatGroq(\n", + " model=\"llama3-70b-8192\",\n", + " temperature=0,\n", + " groq_api_key=GROQ_API_KEY,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "752d18e2-4645-4af9-a55c-1adafd08a0af", + "metadata": {}, + "source": [ + "Following the definition of the LLM, the next step is the definition of the prompt. Here we start with a very basis prompt. In the following section on prompt engineering the optimization of the prompt is discussed in more detail." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "c2edad58-849b-424f-8460-92507e9545ac", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n", + "\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are very powerful assistant, but don't know current events.\",\n", + " ),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "367384b0-1fe6-43cf-a2b5-b3689e65b9fa", + "metadata": {}, + "source": [ + "Finally, the different parts are plugged together, by creating an agent which combines the prompt with the python function referenced here as one tool in a list of potentially many tools and an executor to communicate with the agent. The technical details are discussed in the corresponding [langchain Custom Agent tutorial](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "c1e01a3b-8b21-4538-b67d-6d5c40c66f94", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.agents import AgentExecutor\n", + "from langchain.agents.format_scratchpad.openai_tools import (\n", + " format_to_openai_tool_messages,\n", + ")\n", + "from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser\n", + "\n", + "tools = [get_crystal_structure]\n", + "agent = (\n", + " {\n", + " \"input\": lambda x: x[\"input\"],\n", + " \"agent_scratchpad\": lambda x: format_to_openai_tool_messages(\n", + " x[\"intermediate_steps\"]\n", + " ),\n", + " }\n", + " | prompt\n", + " | llm.bind_tools(tools)\n", + " | OpenAIToolsAgentOutputParser()\n", + ")\n", + "agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)" + ] + }, + { + "cell_type": "markdown", + "id": "b0d455bd-28bf-44c9-97e9-91f9cb3552cc", + "metadata": {}, + "source": [ + "## Conversation\n", + "Once the `AgentExecutor` is defined we can communicate with the LLM using the `stream()` interface. We repeat the test from above and ask for the crystal structure of gold." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "ac9edf53-8963-4b58-bcb6-742106424757", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_crystal_structure` with `{'chemical_symbol': 'Au'}`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3mfcc\u001b[0m\u001b[32;1m\u001b[1;3mThe crystal structure of gold is Face-Centered Cubic (FCC).\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = list(agent_executor.stream({\"input\": \"What is the crystal structure of gold\"})) # Yeah this worked !!" 
+ ] + }, + { + "cell_type": "markdown", + "id": "43bc4d75-bae7-4fe2-8005-52e2629eec8d", + "metadata": {}, + "source": [ + "With the `verbose=True` parameter the internal steps of the LLM agent are printed in green. As a first step, the agent calls `get_crystal_structure()` with an already converted input parameter: rather than using gold as input, it uses the chemical symbol `Au`. The function returns `fcc` and the LLM converts this answer into a sentence a human can understand: \n", + "```\n", + "The crystal structure of gold is Face-Centered Cubic (FCC).\n", + "```\n", + "This example highlights how easy it is these days to make a Python function accessible via an LLM, so that all kinds of users can interact with it." + ] + }, + { + "cell_type": "markdown", + "id": "bc28370b-6002-4685-9884-44432099b5ea", + "metadata": {}, + "source": [ + "## Prompt Engineering\n", + "Some LLMs, like the ChatGPT 3.5 model, start to hallucinate rather quickly. For example, when asked for the crystal structure of a car, such a model does not ask which elements a car typically consists of, but rather associates car with carbon and replies that the crystal structure of a car is diamond, which is obviously wrong. The Anthropic Claude model does not make this mistake, but the Llama model used in this notebook does. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "7ac99ce9-f93c-406f-8348-f16793b1059e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_crystal_structure` with `{'chemical_symbol': 'C'}`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3mdiamond\u001b[0m\u001b[32;1m\u001b[1;3mThe crystal structure of carbon (car) is diamond.\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = list(agent_executor.stream({\"input\": \"What is the crystal structure of car\"})) # I did not know cars were made of carbon" + ] + }, + { + "cell_type": "markdown", + "id": "fe544511-f499-4bdf-b24a-67fed6dd7ce4", + "metadata": {}, + "source": [ + "To restrict the hallucination of the agent we extend the system prompt with the following statement:\n", + "```\n", + "For each query vailidate that it contains a chemical element and otherwise cancel. \n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "4ffbd77b-7fc4-47a4-8c7b-4f179d453c9b", + "metadata": {}, + "outputs": [], + "source": [ + "prompt_improved = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " # \"You are very powerful assistant, but don't know current events.\",\n", + " \"You are very powerful assistant, but don't know current events. 
For each query vailidate that it contains a chemical element and otherwise cancel.\",\n", + " ),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "d3b4b76d-a455-41cf-89f7-318061c95ac5", + "metadata": {}, + "outputs": [], + "source": [ + "agent_improved = (\n", + " {\n", + " \"input\": lambda x: x[\"input\"],\n", + " \"agent_scratchpad\": lambda x: format_to_openai_tool_messages(\n", + " x[\"intermediate_steps\"]\n", + " ),\n", + " }\n", + " | prompt_improved\n", + " | llm.bind_tools(tools)\n", + " | OpenAIToolsAgentOutputParser()\n", + ")\n", + "agent_improved_executor = AgentExecutor(agent=agent_improved, tools=tools, verbose=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "3033a731-5605-4103-9da8-63c56655a53f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_crystal_structure` with `{'chemical_symbol': 'C'}`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3mdiamond\u001b[0m\u001b[32;1m\u001b[1;3mThe crystal structure of carbon is indeed diamond.\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = list(agent_improved_executor.stream({\"input\": \"What is the crystal structure of car\"}))" + ] + }, + { + "cell_type": "markdown", + "id": "f23ddd66-9032-41e0-90fb-a574845ce426", + "metadata": {}, + "source": [ + "With the modified system prompt, most LLMs correctly reply that they are not able to determine the crystal structure of a car, because they cannot identify the chemical element a car consists of. Unfortunately, this is not the case for the public Llama model used here, so further optimization of the prompt would be required."
+ ] + }, + { + "cell_type": "markdown", + "id": "4a7f9b5f-cfe3-4b9c-9190-33d9505beedd", + "metadata": {}, + "source": [ + "## Summary\n", + "By following the [Custom Agent](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/) tutorial from the Langchain documentation, we were able to create a first simulation agent, which calls specialized python frameworks like [ASE](https://wiki.fysik.dtu.dk/ase/) to address material-science specific questions. Still, it is important to carefully engineer the prompt of the agent, otherwise even these LLMs with access to specialized agents tend to hallucinate." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f7124521-c58d-4700-bb51-7eb97be53588", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/book/2-energy-volume-curve-agent-anthropic.ipynb b/book/2-energy-volume-curve-agent-anthropic.ipynb new file mode 100644 index 0000000..28fe7ba --- /dev/null +++ b/book/2-energy-volume-curve-agent-anthropic.ipynb @@ -0,0 +1,377 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a3fb43db-e147-4245-8c60-cbbb8fd349a6", + "metadata": {}, + "source": [ + "# Energy Volume Curve Agent\n", + "Based on the previous tutorial, the next step is to extend the simple crystal structure agent to address the benchmark challenge of calculating an energy-versus-volume curve using the EMT simulation code. Again, this demonstration is based on the Langchain tutorial for [custom agents](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "abebe249-91bd-44ad-bcec-79bf4de2e1a6", + "metadata": {}, + "source": [ + "As a first step, we import a number of Python modules: [ASE](https://wiki.fysik.dtu.dk/ase/), some standard Python libraries, and a number of tools from [langchain](https://www.langchain.com):" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "cbc2afa1-d50b-4c30-9e0a-5dbd8cac7b3b", + "metadata": {}, + "outputs": [], + "source": [ + "from ase.atoms import Atoms\n", + "from ase.build import bulk\n", + "from ase.calculators.emt import EMT\n", + "from ase.constraints import UnitCellFilter\n", + "from ase.eos import calculate_eos, plot\n", + "from ase.optimize import LBFGS" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "c2e3de94-c93d-490d-90ba-d4ba6e9e3d6d", + "metadata": {}, + "outputs": [], + "source": [ + "from getpass import getpass\n", + "from typing import List" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "b9b96907-ad13-4a4b-862f-bf3f887dd52a", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.agents import AgentExecutor, tool\n", + "from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages\n", + "from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser\n", + "from langchain_core.pydantic_v1 import BaseModel\n", + "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n", + "from langchain_anthropic import ChatAnthropic" + ] + }, + { + "cell_type": "markdown", + "id": "c0e2a114-486a-424c-9371-4b12258f06d5", + "metadata": {}, + "source": [ + "## Multiple Functions\n", + "To connect multiple Python functions, it is essential to convert all inputs and outputs to `JSON`-compatible data types, because the communication with the LLM happens in the background via web requests in the `JSON` format. 
This especially applies to python functions which return python objects. In the case of [ASE](https://wiki.fysik.dtu.dk/ase/) a typical python object is the `ase.atoms.Atoms()` object. To convert the `Atoms()` object to JSON format, we use a [pydantic](https://pydantic.dev) data class as suggested by [langchain](https://www.langchain.com):" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "25f967bd-f431-4780-ab45-21495f40bce3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'title': 'AtomsDict',\n", + " 'type': 'object',\n", + " 'properties': {'numbers': {'title': 'Numbers',\n", + " 'type': 'array',\n", + " 'items': {'type': 'integer'}},\n", + " 'positions': {'title': 'Positions',\n", + " 'type': 'array',\n", + " 'items': {'type': 'array', 'items': {'type': 'number'}}},\n", + " 'cell': {'title': 'Cell',\n", + " 'type': 'array',\n", + " 'items': {'type': 'array', 'items': {'type': 'number'}}},\n", + " 'pbc': {'title': 'Pbc', 'type': 'array', 'items': {'type': 'boolean'}}},\n", + " 'required': ['numbers', 'positions', 'cell', 'pbc']}" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "class AtomsDict(BaseModel):\n", + " numbers: List[int]\n", + " positions: List[List[float]]\n", + " cell: List[List[float]]\n", + " pbc: List[bool]\n", + "\n", + "AtomsDict.schema()" + ] + }, + { + "cell_type": "markdown", + "id": "30085089-4b70-40c9-9a92-62caa4c0aa22", + "metadata": {}, + "source": [ + "In terms of functions, two functions are defined: A `get_equilibirum_lattice()` function which for a given chemical element returns the optimized equilibrium structure as `AtomsDict`. The second function is a `plot_equation_of_state()` function, which takes the already optimized structure as `AtomsDict` as input and then uses the [ASE](https://wiki.fysik.dtu.dk/ase/) internal functionality to plot the energy volume curve. 
\n", + "\n", + "The important point here is that `get_equilibirum_lattice()` would typically just return an `ase.atoms.Atoms()` object, but for compatibility with the LLM it has to be converted to an `AtomsDict` which can be converted to JSON. Finally, in the `plot_equation_of_state()` function the `AtomsDict` is again converted to an `ase.atoms.Atoms()` object to continue the calculation. This is currently a bit tedious. " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d244a7a4-d564-4611-aea4-10ff733315ef", + "metadata": {}, + "outputs": [], + "source": [ + "@tool \n", + "def get_equilibirum_lattice(chemical_symbol: str) -> AtomsDict:\n", + " \"\"\"Returns equilibrium atoms dictionary for a given chemical symbol\"\"\"\n", + " atoms = bulk(name=chemical_symbol)\n", + " atoms.calc = EMT()\n", + " ase_optimizer_obj = LBFGS(UnitCellFilter(atoms))\n", + " ase_optimizer_obj.run(fmax=0.000001)\n", + " return AtomsDict(**{k: v.tolist() for k, v in atoms.todict().items()})" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "33454233-8bff-4578-a0dc-99bf001a3fb3", + "metadata": {}, + "outputs": [], + "source": [ + "@tool\n", + "def plot_equation_of_state(atom_dict: AtomsDict) -> str:\n", + " \"\"\"Returns plot of equation of state of chemical symbol for a given atoms dictionary\"\"\"\n", + " atoms = Atoms(**atom_dict.dict())\n", + " atoms.calc = EMT()\n", + " eos = calculate_eos(atoms)\n", + " plotdata = eos.getplotdata()\n", + " return plot(*plotdata)" + ] + }, + { + "cell_type": "markdown", + "id": "8ae5f04f-b9b9-40ad-9306-de17b66c11b4", + "metadata": {}, + "source": [ + "Finally, the functions converted to LLM tools using the `tool` decorator can again be tested using the `invoke()` function. It takes a python dictionary as input to address the different arguments individually. 
With this test, the correct implementation of the python functions is validated:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "cf1a636e-3507-42db-b9e0-761053dfeff4", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " Step Time Energy fmax\n", + "LBFGS: 0 07:48:58 -0.001502 0.161794\n", + "LBFGS: 1 07:48:58 -0.002533 0.135471\n", + "LBFGS: 2 07:48:58 -0.004879 0.005452\n", + "LBFGS: 3 07:48:58 -0.004883 0.000157\n", + "LBFGS: 4 07:48:58 -0.004883 0.000000\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/var/folders/9p/rztyv06d0xv4h26cyv8nrw3m0000gq/T/ipykernel_2250/2598102435.py:6: FutureWarning: Import UnitCellFilter from ase.filters\n", + " ase_optimizer_obj = LBFGS(UnitCellFilter(atoms))\n" + ] + }, + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "el = \"Al\"\n", + "structure_dict = get_equilibirum_lattice.invoke({\"chemical_symbol\": el})\n", + "plot_equation_of_state.invoke({\"atom_dict\": structure_dict})" + ] + }, + { + "cell_type": "markdown", + "id": "f3881566-c046-46b0-b016-8081ec2a2e7b", + "metadata": {}, + "source": [ + "## Agents\n", + "Following the same Langchain tutorial for [custom agents](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/) as before a LLM agent is constructed which combines the system prompt, the python functions `get_equilibirum_lattice()` and `plot_equation_of_state()` as tools and the `OpenAIToolsAgentOutputParser()`:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "6529e329-0947-443a-8325-42ef3510803d", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Enter your Anthropic Token: ········\n" + ] + } + ], + "source": [ + "ANTHROPIC_API_KEY = getpass(prompt='Enter your Anthropic Token:')\n", + "llm = ChatAnthropic(model=\"claude-3-5-sonnet-20240620\", temperature=0, anthropic_api_key=ANTHROPIC_API_KEY,)\n", + "tools = [get_equilibirum_lattice, plot_equation_of_state]\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are very powerful assistant, but don't know current events. 
For each query validate that it contains a chemical element and otherwise cancel.\",\n", + " ),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n", + " ]\n", + ")\n", + "agent = (\n", + " {\n", + " \"input\": lambda x: x[\"input\"],\n", + " \"agent_scratchpad\": lambda x: format_to_openai_tool_messages(\n", + " x[\"intermediate_steps\"]\n", + " ),\n", + " }\n", + " | prompt\n", + " | llm.bind_tools(tools)\n", + " | OpenAIToolsAgentOutputParser()\n", + ")\n", + "agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)" + ] + }, + { + "cell_type": "markdown", + "id": "30aff6cc-c41b-4346-9542-88f0d7833856", + "metadata": {}, + "source": [ + "## User Interactions\n", + "The user can then interact with the agent by asking it to `Plot the equation of state of gold`. In the background, the agent first calls `get_equilibirum_lattice()` with the parameter `Au`, the chemical symbol of gold, and afterwards calls `plot_equation_of_state()` with the atomistic structure returned by `get_equilibirum_lattice()`. " + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "c2edad58-849b-424f-8460-92507e9545ac", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_equilibirum_lattice` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': \"Certainly! I'd be happy to plot the equation of state for gold. 
To do this, we'll need to use two functions: first, we'll get the equilibrium lattice for gold, and then we'll use that to plot the equation of state.\\n\\nLet's start by getting the equilibrium lattice for gold:\", 'type': 'text', 'index': 0}, {'id': 'toolu_017dXVwAGh4D1LPEY24xDbZL', 'input': {}, 'name': 'get_equilibirum_lattice', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m Step Time Energy fmax\n", + "LBFGS: 0 07:49:04 0.002606 0.308859\n", + "LBFGS: 1 07:49:04 0.000032 0.077808\n", + "LBFGS: 2 07:49:04 -0.000135 0.003099\n", + "LBFGS: 3 07:49:04 -0.000135 0.000029\n", + "LBFGS: 4 07:49:04 -0.000135 0.000000\n", + "\u001b[36;1m\u001b[1;3mnumbers=[79] positions=[[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]] cell=[[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]] pbc=[True, True, True]\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/var/folders/9p/rztyv06d0xv4h26cyv8nrw3m0000gq/T/ipykernel_2250/2598102435.py:6: FutureWarning: Import UnitCellFilter from ase.filters\n", + " ase_optimizer_obj = LBFGS(UnitCellFilter(atoms))\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `plot_equation_of_state` with `{'atom_dict': {'numbers': [79], 'positions': [[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]], 'cell': [[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]], 'pbc': [True, True, True]}}`\n", + "responded: [{'text': \"\\n\\nGreat! 
Now that we have the equilibrium lattice for gold, let's use this information to plot the equation of state:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01FAAFetx8gyYiSeE4v2siuj', 'input': {}, 'name': 'plot_equation_of_state', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"atom_dict\": {\"numbers\": [79], \"positions\": [[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]], \"cell\": [[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]], \"pbc\": [true, true, true]}}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mAxes(0.125,0.11;0.775x0.77)\u001b[0m\u001b[32;1m\u001b[1;3m[{'text': \"\\n\\nThe equation of state for gold has been successfully plotted. The result is represented by an Axes object, which indicates that a graph has been generated.\\n\\nThis plot typically shows the relationship between the volume of the gold crystal and its energy. It helps in understanding the structural and thermodynamic properties of gold under different conditions.\\n\\nThe x-axis usually represents the volume (or sometimes the lattice constant), while the y-axis represents the total energy of the system. The minimum point on this curve corresponds to the equilibrium volume and energy of the gold crystal.\\n\\nIs there anything specific about the equation of state for gold that you'd like to know or discuss further?\", 'type': 'text', 'index': 0}]\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "lst = list(agent_executor.stream({\"input\": \"Plot the equation of state of gold\"}))" + ] + }, + { + "cell_type": "markdown", + "id": "555d589b-8607-410f-8a37-12aefd2fec59", + "metadata": {}, + "source": [ + "## Summary\n", + "The important point in this example is, that the order of execution of the Python functions is not defined by the user. Instead the LLM automatically determines that `plot_equation_of_state()` needs an `AtomsDict()` object as input and that `get_equilibirum_lattice()` returns such a `AtomsDict()` object as an output, so it makes sense to call `get_equilibirum_lattice()` first and `plot_equation_of_state()` second. The same principles apply to LLM agents with a larger number of python functions. \n", + "\n", + "The limiting point at the moment is that the LLMs are web services, so all Python objects have to be converted to `JSON` to be communicated to the LLMs. This restricts the choice of Python objects and requires the development of specialized data classes to construct those interfaces between different Python functions. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3bfcac8e-aa17-414e-a83d-fd766eefb460", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/book/2-energy-volume-curve-agent-groq.ipynb b/book/2-energy-volume-curve-agent-groq.ipynb new file mode 100644 index 0000000..6f4fe00 --- /dev/null +++ b/book/2-energy-volume-curve-agent-groq.ipynb @@ -0,0 +1,363 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a3fb43db-e147-4245-8c60-cbbb8fd349a6", + "metadata": {}, + "source": [ + "# Energy Volume Curve Agent\n", + "Based on the previous tutorial, the next step is to extend the simple crystal structure agent to address the benchmark challenge of calculating an energy-versus-volume curve using the EMT simulation code. Again, this demonstration is based on the Langchain tutorial for [custom agents](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/)." 
+ ] + }, + { + "cell_type": "markdown", + "id": "abebe249-91bd-44ad-bcec-79bf4de2e1a6", + "metadata": {}, + "source": [ + "As a first step, we import a number of Python modules: [ASE](https://wiki.fysik.dtu.dk/ase/), some standard Python libraries, and a number of tools from [langchain](https://www.langchain.com):" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "cbc2afa1-d50b-4c30-9e0a-5dbd8cac7b3b", + "metadata": {}, + "outputs": [], + "source": [ + "from ase.atoms import Atoms\n", + "from ase.build import bulk\n", + "from ase.calculators.emt import EMT\n", + "from ase.constraints import UnitCellFilter\n", + "from ase.eos import calculate_eos, plot\n", + "from ase.optimize import LBFGS" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "c2e3de94-c93d-490d-90ba-d4ba6e9e3d6d", + "metadata": {}, + "outputs": [], + "source": [ + "from getpass import getpass\n", + "from typing import List" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "b9b96907-ad13-4a4b-862f-bf3f887dd52a", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain.agents import AgentExecutor, tool\n", + "from langchain.agents.format_scratchpad.openai_tools import format_to_openai_tool_messages\n", + "from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser\n", + "from langchain_core.pydantic_v1 import BaseModel\n", + "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n", + "from langchain_groq import ChatGroq" + ] + }, + { + "cell_type": "markdown", + "id": "c0e2a114-486a-424c-9371-4b12258f06d5", + "metadata": {}, + "source": [ + "## Multiple Functions\n", + "To connect multiple Python functions, it is essential to convert all inputs and outputs to `JSON`-compatible data types, because the communication with the LLM happens in the background via web requests in the `JSON` format. This especially applies to Python functions which return Python objects. 
In the case of [ASE](https://wiki.fysik.dtu.dk/ase/) a typical python object is the `ase.atoms.Atoms()` object. To convert the `Atoms()` object to JSON format, we use a [pydantic](https://pydantic.dev) data class as suggested by [langchain](https://www.langchain.com):" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "25f967bd-f431-4780-ab45-21495f40bce3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{'title': 'AtomsDict',\n", + " 'type': 'object',\n", + " 'properties': {'numbers': {'title': 'Numbers',\n", + " 'type': 'array',\n", + " 'items': {'type': 'integer'}},\n", + " 'positions': {'title': 'Positions',\n", + " 'type': 'array',\n", + " 'items': {'type': 'array', 'items': {'type': 'number'}}},\n", + " 'cell': {'title': 'Cell',\n", + " 'type': 'array',\n", + " 'items': {'type': 'array', 'items': {'type': 'number'}}},\n", + " 'pbc': {'title': 'Pbc', 'type': 'array', 'items': {'type': 'boolean'}}},\n", + " 'required': ['numbers', 'positions', 'cell', 'pbc']}" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "class AtomsDict(BaseModel):\n", + " numbers: List[int]\n", + " positions: List[List[float]]\n", + " cell: List[List[float]]\n", + " pbc: List[bool]\n", + "\n", + "AtomsDict.schema()" + ] + }, + { + "cell_type": "markdown", + "id": "30085089-4b70-40c9-9a92-62caa4c0aa22", + "metadata": {}, + "source": [ + "In terms of functions, two functions are defined: A `get_equilibirum_lattice()` function which for a given chemical element returns the optimized equilibrium structure as `AtomsDict`. The second function is a `plot_equation_of_state()` function, which takes the already optimized structure as `AtomsDict` as input and then uses the [ASE](https://wiki.fysik.dtu.dk/ase/) internal functionality to plot the energy volume curve. 
\n", + "\n", + "The important point here is that `get_equilibirum_lattice()` would typically just return an `ase.atoms.Atoms()` object, but for compatibility with the LLM it has to be converted to an `AtomsDict` which can be converted to JSON. Finally, in the `plot_equation_of_state()` function the `AtomsDict` is again converted to an `ase.atoms.Atoms()` object to continue the calculation. This is currently a bit tedious. " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d244a7a4-d564-4611-aea4-10ff733315ef", + "metadata": {}, + "outputs": [], + "source": [ + "@tool \n", + "def get_equilibirum_lattice(chemical_symbol: str) -> AtomsDict:\n", + " \"\"\"Returns equilibrium atoms dictionary for a given chemical symbol\"\"\"\n", + " atoms = bulk(name=chemical_symbol)\n", + " atoms.calc = EMT()\n", + " ase_optimizer_obj = LBFGS(UnitCellFilter(atoms))\n", + " ase_optimizer_obj.run(fmax=0.000001)\n", + " return AtomsDict(**{k: v.tolist() for k, v in atoms.todict().items()})" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "33454233-8bff-4578-a0dc-99bf001a3fb3", + "metadata": {}, + "outputs": [], + "source": [ + "@tool\n", + "def plot_equation_of_state(atom_dict: AtomsDict) -> str:\n", + " \"\"\"Returns plot of equation of state of chemical symbol for a given atoms dictionary\"\"\"\n", + " atoms = Atoms(**atom_dict.dict())\n", + " atoms.calc = EMT()\n", + " eos = calculate_eos(atoms)\n", + " plotdata = eos.getplotdata()\n", + " return plot(*plotdata)" + ] + }, + { + "cell_type": "markdown", + "id": "8ae5f04f-b9b9-40ad-9306-de17b66c11b4", + "metadata": {}, + "source": [ + "Finally, the functions converted to LLM tools using the `tool` decorator can again be tested using the `invoke()` function. It takes a python dictionary as input to address the different arguments individually. 
With this test, the correct implementation of the python functions is validated:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "cf1a636e-3507-42db-b9e0-761053dfeff4", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " Step Time Energy fmax\n", + "LBFGS: 0 07:51:18 -0.001502 0.161794\n", + "LBFGS: 1 07:51:18 -0.002533 0.135471\n", + "LBFGS: 2 07:51:18 -0.004879 0.005452\n", + "LBFGS: 3 07:51:18 -0.004883 0.000157\n", + "LBFGS: 4 07:51:18 -0.004883 0.000000\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/var/folders/9p/rztyv06d0xv4h26cyv8nrw3m0000gq/T/ipykernel_2816/2598102435.py:6: FutureWarning: Import UnitCellFilter from ase.filters\n", + " ase_optimizer_obj = LBFGS(UnitCellFilter(atoms))\n" + ] + }, + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "el = \"Al\"\n", + "structure_dict = get_equilibirum_lattice.invoke({\"chemical_symbol\": el})\n", + "plot_equation_of_state.invoke({\"atom_dict\": structure_dict})" + ] + }, + { + "cell_type": "markdown", + "id": "f3881566-c046-46b0-b016-8081ec2a2e7b", + "metadata": {}, + "source": [ + "## Agents\n", + "Following the same Langchain tutorial for [custom agents](https://python.langchain.com/docs/modules/agents/how_to/custom_agent/) as before a LLM agent is constructed which combines the system prompt, the python functions `get_equilibirum_lattice()` and `plot_equation_of_state()` as tools and the `OpenAIToolsAgentOutputParser()`:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "6529e329-0947-443a-8325-42ef3510803d", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Enter your Groq Token: ········\n" + ] + } + ], + "source": [ + "GROQ_API_KEY = getpass(prompt='Enter your Groq Token:')\n", + "llm = ChatGroq(model=\"llama3-70b-8192\", temperature=0, groq_api_key=GROQ_API_KEY)\n", + "tools = [get_equilibirum_lattice, plot_equation_of_state]\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are very powerful assistant, but don't know current events. 
For each query validate that it contains a chemical element and otherwise cancel.\",\n", + " ),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n", + " ]\n", + ")\n", + "agent = (\n", + " {\n", + " \"input\": lambda x: x[\"input\"],\n", + " \"agent_scratchpad\": lambda x: format_to_openai_tool_messages(\n", + " x[\"intermediate_steps\"]\n", + " ),\n", + " }\n", + " | prompt\n", + " | llm.bind_tools(tools)\n", + " | OpenAIToolsAgentOutputParser()\n", + ")\n", + "agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)" + ] + }, + { + "cell_type": "markdown", + "id": "30aff6cc-c41b-4346-9542-88f0d7833856", + "metadata": {}, + "source": [ + "## User Interactions\n", + "The user can then interact with the agent by asking it to `Plot the equation of state of gold`. In the background, the agent first calls `get_equilibirum_lattice()` with the parameter `Au`, the chemical symbol of gold, and afterwards calls `plot_equation_of_state()` with the atomistic structure returned by `get_equilibirum_lattice()`. 
" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "c2edad58-849b-424f-8460-92507e9545ac", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_equilibirum_lattice` with `{'chemical_symbol': 'Au'}`\n", + "\n", + "\n", + "\u001b[0m Step Time Energy fmax\n", + "LBFGS: 0 07:51:22 0.002606 0.308859\n", + "LBFGS: 1 07:51:22 0.000032 0.077808\n", + "LBFGS: 2 07:51:22 -0.000135 0.003099\n", + "LBFGS: 3 07:51:22 -0.000135 0.000029\n", + "LBFGS: 4 07:51:22 -0.000135 0.000000\n", + "\u001b[36;1m\u001b[1;3mnumbers=[79] positions=[[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]] cell=[[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]] pbc=[True, True, True]\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/var/folders/9p/rztyv06d0xv4h26cyv8nrw3m0000gq/T/ipykernel_2816/2598102435.py:6: FutureWarning: Import UnitCellFilter from ase.filters\n", + " ase_optimizer_obj = LBFGS(UnitCellFilter(atoms))\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m{\"tool_calls\":[{\"id\":\"pending\",\"type\":\"function\",\"function\":{\"name\":\"plot_equation_of_state\"},\"parameters\":{\"atom_dict\":{\"numbers\":[79],\"positions\":[[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]],\"cell\":[[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]],\"pbc\":[True, True, True]}}}]}]}\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + } + ], + "source": [ + "lst = 
list(agent_executor.stream({\"input\": \"Plot the equation of state of gold\"}))" + ] + }, + { + "cell_type": "markdown", + "id": "555d589b-8607-410f-8a37-12aefd2fec59", + "metadata": {}, + "source": [ + "## Summary\n", + "The important point in this example is that the order of execution of the Python functions is not defined by the user. Instead, the LLM automatically determines that `plot_equation_of_state()` needs an `AtomsDict()` object as input and that `get_equilibirum_lattice()` returns such an `AtomsDict()` object as output, so it makes sense to call `get_equilibirum_lattice()` first and `plot_equation_of_state()` second. The same principles apply to LLM agents with a larger number of Python functions. \n", + "\n", + "The limiting point at the moment is that the LLMs are web services, so all Python objects have to be converted to `JSON` to be communicated to the LLMs. This restricts the choice of Python objects and requires the development of specialized data classes to construct the interfaces between different Python functions. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3bfcac8e-aa17-414e-a83d-fd766eefb460", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/book/3-hackathon-demonstration-anthropic.ipynb b/book/3-hackathon-demonstration-anthropic.ipynb new file mode 100644 index 0000000..65486a4 --- /dev/null +++ b/book/3-hackathon-demonstration-anthropic.ipynb @@ -0,0 +1,587 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "d7616c03-7a5b-463a-b2db-0e0501619993", + "metadata": {}, + "source": [ + "# Hackathon Demonstration\n", + "Beyond the two previous examples, which closely followed the Langchain [custom agent](https://python.langchain.com/v0.1/docs/modules/agents/how_to/custom_agent/) tutorial, the goal for the final presentation was to highlight the use of `LangSim` in production. For this purpose, the functionality was transferred from the Jupyter notebook to a Python module. 
As a first step, the `LangSim` module is loaded using the `%reload_ext` Jupyter magic command:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "db5f8988-a11f-4f4d-8a8c-74f419ddcd80", + "metadata": {}, + "outputs": [], + "source": [ + "%reload_ext langsim" + ] + }, + { + "cell_type": "markdown", + "id": "01688907-23ea-42fd-ab8c-0d4f6a918465", + "metadata": {}, + "source": [ + "Then the [Anthropic](https://www.anthropic.com) API key is stored in the `LANGSIM_API_KEY` environment variable so that `LangSim` can use it internally. In addition, we set the variable `LANGSIM_PROVIDER` to `anthropic` and the variable `LANGSIM_MODEL` to `claude-3-5-sonnet-20240620`:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "a743c1f8-63a7-49a7-b77a-4cdf85d49b84", + "metadata": {}, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Enter your API key ········\n" + ] + } + ], + "source": [ + "import os, getpass\n", + "os.environ[\"LANGSIM_PROVIDER\"] = \"anthropic\" \n", + "os.environ[\"LANGSIM_API_KEY\"] = getpass.getpass(\"Enter your API key\")\n", + "os.environ[\"LANGSIM_MODEL\"] = \"claude-3-5-sonnet-20240620\"" + ] + }, + { + "cell_type": "markdown", + "id": "aa95152b-aa79-4a9b-900d-28a0b3b067fe", + "metadata": {}, + "source": [ + "## Dialog\n", + "From this point on, the user can chat with the `LangSim` interface using the `chat` Jupyter magic. As a first challenge, the user states their interest in finding a material with a bulk modulus in the range of 200 GPa and then clarifies this request by asking the LLM to calculate the bulk modulus for the noble metals. In contrast to the previous example, the LLM is tasked with repeating the same calculation multiple times and then identifying the chemical element that most closely matches the targeted bulk modulus of 200 GPa. 
None of the required logic is implemented explicitly; rather, the LLM only has access to a `get_equilibrium_lattice()` function to calculate the equilibrium crystal structure and a `get_bulk_modulus()` function, which calculates the bulk modulus for a given equilibrium crystal structure." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "2b698ffd-3b16-4937-ae97-8e9f44a95bb7", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m[{'text': \"Certainly! I'd be happy to help you compute the bulk modulus for the noble metals. The noble metals typically include gold (Au), silver (Ag), platinum (Pt), palladium (Pd), rhodium (Rh), iridium (Ir), and sometimes copper (Cu) is included as well.\\n\\nTo calculate the bulk modulus, we need to use one of the available models: either EMT (Effective Medium Theory) or MACE (Machine Learning Force Field). Before we proceed, I need to know which model you'd prefer to use for these calculations. \\n\\nWould you like to use:\\n1. EMT (Effective Medium Theory)\\n2. MACE (Machine Learning Force Field)\\n\\nPlease let me know your preference, and then I can proceed with the calculations for the noble metals.\", 'type': 'text', 'index': 0}]\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/markdown": [ + "Certainly! I'd be happy to help you compute the bulk modulus for the noble metals. The noble metals typically include gold (Au), silver (Ag), platinum (Pt), palladium (Pd), rhodium (Rh), iridium (Ir), and sometimes copper (Cu) is included as well.\n", + "\n", + "To calculate the bulk modulus, we need to use one of the available models: either EMT (Effective Medium Theory) or MACE (Machine Learning Force Field). Before we proceed, I need to know which model you'd prefer to use for these calculations. 
\n", + "\n", + "Would you like to use:\n", + "1. EMT (Effective Medium Theory)\n", + "2. MACE (Machine Learning Force Field)\n", + "\n", + "Please let me know your preference, and then I can proceed with the calculations for the noble metals." + ], + "text/plain": [ + "" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%chat\n", + "I am looking for a material with a bulk modulus in the range of 200 GPa.\n", + "Can you compute the bulk modulus for the noble metals for me?" + ] + }, + { + "cell_type": "markdown", + "id": "837750e4-d51d-4ea9-bb7e-be3fd7b1f63c", + "metadata": {}, + "source": [ + "The LLM is instructed by the system prompt to clarify which simulation method it should use to calculate a given material property. At the time of the demonstration, two simulation methods were implemented, [EMT](https://wiki.fysik.dtu.dk/ase/ase/calculators/emt.html) and [MACE](https://mace-docs.readthedocs.io). As the user in this demonstration is not aware of the technical details of these simulation methods, they ask for clarification. This highlights how LLMs can use the context of the previous conversation to answer a given question:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "d96cabcc-a36b-4d40-9999-808ecdc746dc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m[{'text': \"I apologize for not providing more information about the methods earlier. Let me explain the two available models for calculating the bulk modulus:\\n\\n1. EMT (Effective Medium Theory):\\n EMT is a computationally efficient, analytical model that describes the macroscopic properties of composite materials. It's a simpler and faster method, but it may be less accurate for some systems, especially those with complex electronic structures.\\n\\n2. 
MACE (Machine Learning Force Field):\\n MACE is a more advanced machine learning-based force field for predicting many-body atomic interactions. It covers the entire periodic table and is generally more accurate than EMT, especially for a wider range of elements and structures. However, it may be computationally more intensive.\\n\\nFor noble metals, both methods can provide reasonable results, but MACE might be more accurate. However, the choice depends on your specific needs:\\n\\n- If you need quick estimates and computational efficiency is a priority, EMT might be preferable.\\n- If you need higher accuracy and don't mind slightly longer computation times, MACE would be the better choice.\\n\\nWhich method would you prefer to use for calculating the bulk modulus of noble metals? Once you decide, I can proceed with the calculations.\", 'type': 'text', 'index': 0}]\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/markdown": [ + "I apologize for not providing more information about the methods earlier. Let me explain the two available models for calculating the bulk modulus:\n", + "\n", + "1. EMT (Effective Medium Theory):\n", + " EMT is a computationally efficient, analytical model that describes the macroscopic properties of composite materials. It's a simpler and faster method, but it may be less accurate for some systems, especially those with complex electronic structures.\n", + "\n", + "2. MACE (Machine Learning Force Field):\n", + " MACE is a more advanced machine learning-based force field for predicting many-body atomic interactions. It covers the entire periodic table and is generally more accurate than EMT, especially for a wider range of elements and structures. However, it may be computationally more intensive.\n", + "\n", + "For noble metals, both methods can provide reasonable results, but MACE might be more accurate. 
However, the choice depends on your specific needs:\n", + "\n", + "- If you need quick estimates and computational efficiency is a priority, EMT might be preferable.\n", + "- If you need higher accuracy and don't mind slightly longer computation times, MACE would be the better choice.\n", + "\n", + "Which method would you prefer to use for calculating the bulk modulus of noble metals? Once you decide, I can proceed with the calculations." + ], + "text/plain": [ + "" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%chat \n", + "What are those methods?" + ] + }, + { + "cell_type": "markdown", + "id": "ba002cb9-b406-49dd-a716-c57904d48472", + "metadata": {}, + "source": [ + "Based on the explanation of the LLM, the user chooses the [EMT](https://wiki.fysik.dtu.dk/ase/ase/calculators/emt.html) method, primarily because its lower computational cost makes it possible to run a large number of calculations in a short amount of time. The LLM then executes two calculations for each of the three noble metals silver (Ag), gold (Au) and platinum (Pt). The first is the calculation of the equilibrium structure with `get_equilibrium_lattice()`, which in addition to the chemical symbol now also takes the name of the chosen calculator as an input. The second is the calculation of the bulk modulus with `get_bulk_modulus()`, which takes the optimized structure as an `AtomsDict()` object in addition to the chosen calculator as a string." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "3caea2fc-a3de-400c-91cb-ae53c616ece1", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': \"Certainly! 
We'll use the EMT (Effective Medium Theory) method to calculate the bulk modulus for the noble metals. I'll compute this for gold (Au), silver (Ag), platinum (Pt), palladium (Pd), rhodium (Rh), and iridium (Ir). Let's start with these calculations.\\n\\nFor each metal, we need to first get the bulk structure, then equilibrate it, and finally calculate the bulk modulus. I'll do this for each metal one by one.\\n\\nLet's begin with gold (Au):\", 'type': 'text', 'index': 0}, {'id': 'toolu_01G3s6u3824duQkUYC2S48vR', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[79] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_equilibrated_structure` with `{'atom_dict': {'numbers': [79], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'text': \"\\n\\nNow that we have the bulk structure for gold, let's equilibrate it using the EMT calculator:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01BUiM5erRnzLRsAhQ6hEC1m', 'input': {}, 'name': 'get_atom_dict_equilibrated_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"atom_dict\": {\"numbers\": [79], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m Step Time Energy fmax\n", + "LBFGS: 0 07:56:23 0.002606 0.308859\n", + "LBFGS: 1 07:56:23 0.000032 0.077808\n", + "LBFGS: 2 07:56:23 -0.000135 0.003099\n", + "LBFGS: 3 07:56:23 -0.000135 0.000029\n", + "LBFGS: 4 07:56:23 -0.000135 0.000000\n", + "\u001b[38;5;200m\u001b[1;3mnumbers=[79] positions=[[4.761270571021482e-17, -3.44317321286765e-17, 
-2.0729599738876008e-16]] cell=[[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [79], 'positions': [[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]], 'cell': [[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'text': \"\\n\\nNow that we have the equilibrated structure for gold, let's calculate its bulk modulus:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01BuZFY3zvENzYWh9Jasus7M', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"atom_dict\": {\"numbers\": [79], \"positions\": [[4.761270571021482e-17, -3.44317321286765e-17, -2.0729599738876008e-16]], \"cell\": [[7.040904860568557e-17, 2.028082809705617, 2.0280828097056176], [2.028082809705617, 1.0384771021574885e-16, 2.0280828097056176], [2.028082809705617, 2.028082809705618, 4.963320342464553e-17]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m173.8383045764123\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Ag'}`\n", + "responded: [{'text': \"\\n\\nThe bulk modulus of gold (Au) calculated using EMT is approximately 173.84 GPa.\\n\\nNow, let's repeat this process for the other noble metals: silver (Ag), platinum (Pt), palladium (Pd), rhodium (Rh), and iridium (Ir).\", 'type': 'text', 'index': 0}, {'id': 'toolu_01RFH3aZTKbthvQDzM5U6ttV', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': 
'{\"chemical_symbol\": \"Ag\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[47] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_equilibrated_structure` with `{'atom_dict': {'numbers': [47], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_01HsAhBjQsREjnh5x7SRFEot', 'input': {}, 'name': 'get_atom_dict_equilibrated_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [47], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m Step Time Energy fmax\n", + "LBFGS: 0 07:56:31 0.001584 0.198735\n", + "LBFGS: 1 07:56:31 0.000254 0.113272\n", + "LBFGS: 2 07:56:31 -0.000366 0.004325\n", + "LBFGS: 3 07:56:31 -0.000367 0.000089\n", + "LBFGS: 4 07:56:31 -0.000367 0.000000\n", + "\u001b[38;5;200m\u001b[1;3mnumbers=[47] positions=[[3.632034897593557e-17, -7.643082441891503e-17, -4.3481411845850515e-16]] cell=[[9.839176801279394e-17, 2.031776305380803, 2.0317763053808036], [2.031776305380803, -6.293048194449085e-17, 2.0317763053808036], [2.031776305380803, 2.031776305380803, -2.945414531846451e-17]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [47], 'positions': [[3.632034897593557e-17, -7.643082441891503e-17, -4.3481411845850515e-16]], 'cell': [[9.839176801279394e-17, 2.031776305380803, 2.0317763053808036], [2.031776305380803, -6.293048194449085e-17, 2.0317763053808036], [2.031776305380803, 2.031776305380803, -2.945414531846451e-17]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 
'toolu_01SjbgmFaXuiJAr2e3pzc4w9', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [47], \"positions\": [[3.632034897593557e-17, -7.643082441891503e-17, -4.3481411845850515e-16]], \"cell\": [[9.839176801279394e-17, 2.031776305380803, 2.0317763053808036], [2.031776305380803, -6.293048194449085e-17, 2.0317763053808036], [2.031776305380803, 2.031776305380803, -2.945414531846451e-17]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m100.1580099647698\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Pt'}`\n", + "responded: [{'id': 'toolu_01SAFc88U6jiHVaeKAGUo6AM', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pt\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[78] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_equilibrated_structure` with `{'atom_dict': {'numbers': [78], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_014fK8mbWdbCc6QHFxSPf6Ui', 'input': {}, 'name': 'get_atom_dict_equilibrated_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [78], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m Step Time Energy fmax\n", + "LBFGS: 0 07:56:38 -0.000125 0.036183\n", + "LBFGS: 1 07:56:38 -0.000149 0.004500\n", + "LBFGS: 2 07:56:38 -0.000150 0.000013\n", + "LBFGS: 3 07:56:38 -0.000150 0.000000\n", + "\u001b[38;5;200m\u001b[1;3mnumbers=[78] 
positions=[[1.7335088009500337e-17, 1.1611829961508788e-16, 1.1092203033984381e-16]] cell=[[4.60232654901218e-18, 1.9609007393798525, 1.960900739379854], [1.9609007393798532, -3.193849078504324e-16, 1.9609007393798543], [1.9609007393798532, 1.9609007393798528, 1.704915243852484e-16]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [78], 'positions': [[1.7335088009500337e-17, 1.1611829961508788e-16, 1.1092203033984381e-16]], 'cell': [[4.60232654901218e-18, 1.9609007393798525, 1.960900739379854], [1.9609007393798532, -3.193849078504324e-16, 1.9609007393798543], [1.9609007393798532, 1.9609007393798528, 1.704915243852484e-16]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_01EkTZs8zu44MA7mAASvd2hM', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [78], \"positions\": [[1.7335088009500337e-17, 1.1611829961508788e-16, 1.1092203033984381e-16]], \"cell\": [[4.60232654901218e-18, 1.9609007393798525, 1.960900739379854], [1.9609007393798532, -3.193849078504324e-16, 1.9609007393798543], [1.9609007393798532, 1.9609007393798528, 1.704915243852484e-16]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m278.2966136639589\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Pd'}`\n", + "responded: [{'id': 'toolu_01BGCFWyDttt9iN8q4NNyrcj', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pd\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[46] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_equilibrated_structure` with `{'atom_dict': {'numbers': [46], 'positions': 
[[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_014tysKpH352XUFXQZfuQg8p', 'input': {}, 'name': 'get_atom_dict_equilibrated_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [46], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m Step Time Energy fmax\n", + "LBFGS: 0 07:56:45 0.000342 0.140218\n", + "LBFGS: 1 07:56:45 -0.000211 0.043682\n", + "LBFGS: 2 07:56:45 -0.000270 0.000742\n", + "LBFGS: 3 07:56:45 -0.000270 0.000004\n", + "LBFGS: 4 07:56:45 -0.000270 0.000000\n", + "\u001b[38;5;200m\u001b[1;3mnumbers=[46] positions=[[-1.6927811771902953e-17, -1.2503517079340123e-16, -7.690125956122276e-17]] cell=[[-2.451382079594262e-16, 1.9393703009539969, 1.9393703009539975], [1.939370300953997, 2.8937020560458515e-17, 1.9393703009539973], [1.9393703009539969, 1.9393703009539969, 2.7783509637994643e-16]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [46], 'positions': [[-1.6927811771902953e-17, -1.2503517079340123e-16, -7.690125956122276e-17]], 'cell': [[-2.451382079594262e-16, 1.9393703009539969, 1.9393703009539975], [1.939370300953997, 2.8937020560458515e-17, 1.9393703009539973], [1.9393703009539969, 1.9393703009539969, 2.7783509637994643e-16]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_01MC9td1UE2qdbPnW9PPD5mr', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [46], \"positions\": [[-1.6927811771902953e-17, -1.2503517079340123e-16, -7.690125956122276e-17]], \"cell\": [[-2.451382079594262e-16, 1.9393703009539969, 1.9393703009539975], [1.939370300953997, 
2.8937020560458515e-17, 1.9393703009539973], [1.9393703009539969, 1.9393703009539969, 2.7783509637994643e-16]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m179.20795176571716\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Rh'}`\n", + "responded: [{'id': 'toolu_01NogAQ8aHYZMn9Gr59R1a4D', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Rh\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[45] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.9, 1.9], [1.9, 0.0, 1.9], [1.9, 1.9, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_equilibrated_structure` with `{'atom_dict': {'numbers': [45], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.9, 1.9], [1.9, 0.0, 1.9], [1.9, 1.9, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_019s65ryZK38VAk2TzvPudX4', 'input': {}, 'name': 'get_atom_dict_equilibrated_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [45], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.9, 1.9], [1.9, 0.0, 1.9], [1.9, 1.9, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[38;5;200m\u001b[1;3mAn exception occurred: {}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Stopping agent prematurely due to triggering stop condition\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Ir'}`\n", + "responded: [{'text': \"\\n\\nI apologize for the error. It seems that the EMT calculator doesn't have parameters for rhodium (Rh). 
Let's skip rhodium and move on to iridium (Ir).\", 'type': 'text', 'index': 0}, {'id': 'toolu_01CjHj3icBnVDpTuqSurEy4x', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Ir\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[77] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.92, 1.92], [1.92, 0.0, 1.92], [1.92, 1.92, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/markdown": [ + "Agent stopped due to max iterations." + ], + "text/plain": [ + "" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%chat\n", + "Let's take EMT then" + ] + }, + { + "cell_type": "markdown", + "id": "6aab7bc7-026a-412b-be3c-89391f07cfd3", + "metadata": {}, + "source": [ + "The results are summarized in a short list. This demonstration highlights that `LangSim` can be used without prior technical experience in calculating material properties. From students to teachers and professors, it is a handy tool for everybody who wants quick access to material properties computed with one of the standard methods like [EMT](https://wiki.fysik.dtu.dk/ase/ase/calculators/emt.html). The [MACE](https://mace-docs.readthedocs.io) method is generally more accurate than the [EMT](https://wiki.fysik.dtu.dk/ase/ase/calculators/emt.html) method, but also computationally more expensive.\n", + "\n", + "To further highlight the real-life application of a tool like `LangSim`, the next step after the completion of these calculations is the comparison to an experimental reference.
For this purpose, the `get_experimental_elastic_property_wikipedia()` function is implemented, which downloads the experimental values for the elastic properties from the Wikipedia data page on the [elastic properties of the elements](https://en.wikipedia.org/wiki/Elastic_properties_of_the_elements_(data_page))." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "bfae3bb2-6ad1-4e1f-bab4-b1c51fc7044a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': \"Certainly! I'll calculate the bulk modulus for the noble metals using the EMT (Effective Medium Theory) method as you've requested. Then, we'll compare these results to experimental values. Let's start with the calculations for gold (Au), silver (Ag), platinum (Pt), palladium (Pd), copper (Cu), and iridium (Ir). 
\\n\\nLet's begin with the calculations:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01Xa7Td1GhBnm4iFmXE4KKs9', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[79] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [79], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'text': \"\\n\\nNow that we have the bulk structure for gold, let's calculate its bulk modulus:\", 'type': 'text', 'index': 0}, {'id': 'toolu_0158ZDyvXUJ1HDrjGtrhn5YL', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"atom_dict\": {\"numbers\": [79], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m173.84249328565886\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Ag'}`\n", + "responded: [{'text': \"\\n\\nThe calculated bulk modulus for gold (Au) using EMT is approximately 173.84 GPa.\\n\\nNow, let's repeat this process for the other noble metals:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01RGKXnHKrrNPEo1utQsidX6', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Ag\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[47] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': 
{'numbers': [47], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_016QSFTz8QzQC6RfjfWfseWt', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [47], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m100.1601184287953\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Pt'}`\n", + "responded: [{'id': 'toolu_0117HVu9RSpUFL9aigQ7sPYS', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pt\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[78] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [78], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_011uYUfGxXMg3BnsiXFVuQ7L', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [78], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m278.296219466303\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Pd'}`\n", + "responded: [{'id': 'toolu_01SB7ycqtfzN1PZUYAS75nW1', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 
'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pd\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[46] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [46], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_014anyQJeJSbzVpEkARPMyYT', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [46], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m179.24432817947203\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Cu'}`\n", + "responded: [{'id': 'toolu_01Da3sTehBt7hSKHrQBnoBkr', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Cu\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[29] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.805, 1.805], [1.805, 0.0, 1.805], [1.805, 1.805, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [29], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.805, 1.805], [1.805, 0.0, 1.805], [1.805, 1.805, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_017st1WeGeimCrWdp8SoRKmv', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [29], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.805, 1.805], [1.805, 0.0, 1.805], [1.805, 1.805, 0.0]], \"pbc\": 
[true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m134.48011624554215\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': \"\\n\\nNow, let's compare these calculated values with experimental data:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01VVVf344F93n8C9cxidWiz5', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 78.0, 'poissons_ratio': 0.44, 'bulk_modulus': 180.0, 'shear_modulus': 27.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Ag'}`\n", + "responded: [{'id': 'toolu_016hTX9eUKGQ6yCCa8Jvf2n3', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Ag\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 83.0, 'poissons_ratio': 0.37, 'bulk_modulus': 100.0, 'shear_modulus': 30.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Pt'}`\n", + "responded: [{'id': 'toolu_014gvyswXvWuxLYXiDoaJZC4', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pt\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 168.0, 'poissons_ratio': 0.38, 'bulk_modulus': 230.0, 'shear_modulus': 61.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Pd'}`\n", + "responded: [{'id': 'toolu_016VnJRPHqK7nAmGnxB2dvmX', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': 
'{\"chemical_symbol\": \"Pd\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 121.0, 'poissons_ratio': 0.39, 'bulk_modulus': 180.0, 'shear_modulus': 44.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Cu'}`\n", + "responded: [{'id': 'toolu_01MEacCFNGgw5pkp2239HSH3', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Cu\"}'}]\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Stopping agent prematurely due to triggering stop condition\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33;1m\u001b[1;3m{'youngs_modulus': 130.0, 'poissons_ratio': 0.34, 'bulk_modulus': 140.0, 'shear_modulus': 48.0}\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/markdown": [ + "Agent stopped due to max iterations." + ], + "text/plain": [ + "" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%chat\n", + "How does this compare to experiment?" + ] + }, + { + "cell_type": "markdown", + "id": "9f9cb1aa-cf38-49be-8654-7b77cc6ef207", + "metadata": {}, + "source": [ + "As the last step, the user asks the LLM to compare the theoretical predictions with the experimental results to give an indication of the trustworthiness of these calculations. Because EMT is optimized for computational efficiency, its result for platinum, with an error of over 20% (about 278 GPa calculated versus 230 GPa experimental), is insufficient for any real-life application. Still, this can be addressed by switching to the [MACE](https://mace-docs.readthedocs.io) model.
" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "2a6c69a8-73bb-40af-a7f1-500d97f6279b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': \"Certainly! I'll calculate the bulk modulus for the noble metals using the EMT (Effective Medium Theory) method and then compare it to experimental values. After that, I'll show you the deviation in percent. Let's go through this step by step.\\n\\nFirst, let's calculate the bulk modulus for each noble metal using EMT. Then, we'll fetch the experimental values and calculate the deviation.\\n\\nLet's start with gold (Au), silver (Ag), copper (Cu), platinum (Pt), and palladium (Pd). We'll do this one by one:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01DBmt6HcKaVxRAf2cjG7LUK', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[79] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [79], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'text': \"\\n\\nNow that we have the bulk structure for gold, let's calculate its bulk modulus:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01VtFGEKMWWWBqxYNnzgKjaa', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"atom_dict\": {\"numbers\": [79], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 2.04, 2.04], [2.04, 0.0, 2.04], [2.04, 2.04, 0.0]], 
\"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m173.84249328565886\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Ag'}`\n", + "responded: [{'text': \"\\n\\nGreat, we have the calculated bulk modulus for gold using EMT. Let's do the same for the other noble metals:\", 'type': 'text', 'index': 0}, {'id': 'toolu_01Jj29VPfeaMeVERUyyR7jYx', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Ag\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[47] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [47], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_01Cd7piaoBh2HPF9eBP9jGus', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [47], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 2.045, 2.045], [2.045, 0.0, 2.045], [2.045, 2.045, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m100.1601184287953\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Cu'}`\n", + "responded: [{'id': 'toolu_01Bfb7yN9J1EXHFSpMGssfWj', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Cu\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[29] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.805, 1.805], [1.805, 0.0, 1.805], [1.805, 1.805, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: 
`get_bulk_modulus` with `{'atom_dict': {'numbers': [29], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.805, 1.805], [1.805, 0.0, 1.805], [1.805, 1.805, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_01GjJzbKhRdC7f7vvjZMG56x', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [29], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.805, 1.805], [1.805, 0.0, 1.805], [1.805, 1.805, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m134.48011624554215\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Pt'}`\n", + "responded: [{'id': 'toolu_01YKLLjGYvLgeLKH2QGPmiXK', 'input': {}, 'name': 'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pt\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[78] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [78], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_014w76fEpgzKJSGWH66SLuUy', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [78], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.96, 1.96], [1.96, 0.0, 1.96], [1.96, 1.96, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m278.296219466303\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_atom_dict_bulk_structure` with `{'chemical_symbol': 'Pd'}`\n", + "responded: [{'id': 'toolu_01T7jyw2zwZz4TPDyaNDUgRt', 'input': {}, 'name': 
'get_atom_dict_bulk_structure', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pd\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3mnumbers=[46] positions=[[0.0, 0.0, 0.0]] cell=[[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]] pbc=[True, True, True]\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_bulk_modulus` with `{'atom_dict': {'numbers': [46], 'positions': [[0.0, 0.0, 0.0]], 'cell': [[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]], 'pbc': [True, True, True]}, 'calculator_str': 'emt'}`\n", + "responded: [{'id': 'toolu_015KxgZAkvoScYbpNwkijzq6', 'input': {}, 'name': 'get_bulk_modulus', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"atom_dict\": {\"numbers\": [46], \"positions\": [[0.0, 0.0, 0.0]], \"cell\": [[0.0, 1.945, 1.945], [1.945, 0.0, 1.945], [1.945, 1.945, 0.0]], \"pbc\": [true, true, true]}, \"calculator_str\": \"emt\"}'}]\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m179.24432817947203\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Au'}`\n", + "responded: [{'text': \"\\n\\nNow that we have the calculated values, let's fetch the experimental values for comparison:\", 'type': 'text', 'index': 0}, {'id': 'toolu_014Vy9kzBSAESKGiLQTQ6qbp', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 1, 'partial_json': '{\"chemical_symbol\": \"Au\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 78.0, 'poissons_ratio': 0.44, 'bulk_modulus': 180.0, 'shear_modulus': 27.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Ag'}`\n", + "responded: [{'id': 'toolu_01DvPXNBzLede1ufX8bFswkV', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Ag\"}'}]\n", + "\n", + 
"\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 83.0, 'poissons_ratio': 0.37, 'bulk_modulus': 100.0, 'shear_modulus': 30.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Cu'}`\n", + "responded: [{'id': 'toolu_013Z68xM7RrHa1963srGfoPr', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Cu\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 130.0, 'poissons_ratio': 0.34, 'bulk_modulus': 140.0, 'shear_modulus': 48.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Pt'}`\n", + "responded: [{'id': 'toolu_01UgV48Ju7BFYcJeUCgzgzCE', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pt\"}'}]\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'youngs_modulus': 168.0, 'poissons_ratio': 0.38, 'bulk_modulus': 230.0, 'shear_modulus': 61.0}\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_experimental_elastic_property_wikipedia` with `{'chemical_symbol': 'Pd'}`\n", + "responded: [{'id': 'toolu_011Tr2F4H2EewvwooKHGAkty', 'input': {}, 'name': 'get_experimental_elastic_property_wikipedia', 'type': 'tool_use', 'index': 0, 'partial_json': '{\"chemical_symbol\": \"Pd\"}'}]\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Stopping agent prematurely due to triggering stop condition\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33;1m\u001b[1;3m{'youngs_modulus': 121.0, 'poissons_ratio': 0.39, 'bulk_modulus': 180.0, 'shear_modulus': 44.0}\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n" + ] + }, + { + "data": { + "text/markdown": [ + "Agent stopped due to max iterations." 
+ ], + "text/plain": [ + "" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "%%chat\n", + "Can you show me the deviation in percent?" + ] + }, + { + "cell_type": "markdown", + "id": "a48a2446-186b-4b7b-b76d-6994301df696", + "metadata": {}, + "source": [ + "## Summary\n", + "The jury was convinced that this demonstration of the capabilities of `LangSim` points to a future direction for applying LLMs in materials science and chemistry, and awarded the `LangSim` team the [first prize](https://medium.com/@blaiszik/llms-to-accelerate-discovery-in-materials-science-and-chemistry-refections-on-a-hackathon-b8364ca32242) sponsored by [RadicalAI](https://www.radical-ai.com). \n", + "\n", + "Compared to the previous examples, the highlight of this presentation, besides hiding the technical complexity in a Python module, was the ability of the LLM to repeat a workflow consisting of two calculations, the `get_equilibrium_lattice()` function and the `get_bulk_modulus()` function, over three elements without the need to define any additional functionality. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dabd4ab1-f3df-4081-b95a-a3c091103fcc", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.5" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}