Updated eval set
dividor committed Jul 10, 2024
1 parent 1824b49 commit 8a9fa3d
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions flows/chainlit-ui-evaluation/data.jsonl
@@ -1,6 +1,6 @@
{"test_scenario": "Assistant smalltalk, salutation", "query": "Hi", "chat_history": "[]", "context": "Hello! How can I assist you today?", "output": "['Hello! How can I assist you today?', '']"}
{"test_scenario": "Text answer from memory", "query": "What is the total population of Mali", "chat_history": "[{\"author\": \"user\",\"content\": \"Hi!\"}]", "context": "The answer is:\n\n \n The answer is: **17,907,114.0**\n\n Metadata for the answer:\n {\"params\": {\"country_code\": \"MLI\"}, \"attribution\": \"https://data.humdata.org/dataset/ce21c7db-d8f0-40f8-adc2-452d2d2d105c\", \"data_url\": \"https://data.humdata.org/dataset/ce21c7db-d8f0-40f8-adc2-452d2d2d105c/resource/6f243ba2-4d4a-4663-a7c4-e917dbbde73a/download/mli_pop_adm0_v2.csv\", \"time_period\": {\"start\": \"2018-01-01\", \"end\": \"2018-12-31T23:59:59\"}}", "output": "['The answer is: 17,907,114.0', '\u2705 A human approved this data recipe; Source; Raw data; 2018-01-01 to 2018-12-31']"}
{"test_scenario": "Image answer from memory", "query": "plot a line chart of fatalities by month for Chad using HDX data as an image", "chat_history": "[]", "context": "The answer is:\n\n Image cksum: 6a410014fde98dc5bde69c24e6d64cc1\nImage description: {'content': 'The image is a line graph titled \"Fatalities by Month for Chad.\" It displays the total number of fatalities over time, with the x-axis representing the months from January 2008 to January 2024, and the y-axis representing the total number of fatalities, ranging from 0 to 500.\\n\\nKey observations from the graph:\\n- There are several peaks indicating months with high fatalities.\\n- Notable spikes occur around mid-2008, early 2009, mid-2015, early 2021, and mid-2021.\\n- The highest peak appears to be around early 2021, reaching close to 500 fatalities.\\n- There are periods of relatively low fatalities, particularly between 2010 and 2014.\\n\\nTo determine if this image is relevant to the user query, more context about the query is needed. If the query pertains to historical data on fatalities in Chad, trends in violence or conflict, or similar topics, then this image is highly relevant.'}\n \n\n Metadata for the answer:\n {\"params\": {\"country_code\": \"TCD\"}, \"attribution\": \"https://data.humdata.org/dataset/b009f9b0-aa65-49c5-b188-a33daade0f4a\", \"data_url\": \"https://data.humdata.org/dataset/b009f9b0-aa65-49c5-b188-a33daade0f4a/resource/bb78c035-ec19-4503-b325-0673749c2eb4/download/chad_hrp_political_violence_events_and_fatalities_by_month-year_as-of-29may2024.xlsx\"}", "output": "['*AN IMAGE WAS OUTPUT, HERE IS ITS LLM-GENERATED DESCRIPTION* ... The image is a line graph titled \"Fatalities by Month for Chad,\" depicting the total number of fatalities per month from January 2008 to January 2024. 
The x-axis represents the months, while the y-axis represents the total fatalities.\\n\\nKey observations:\\n- There are several peaks indicating months with high fatalities.\\n- Notable spikes occur around mid-2008, early 2009, mid-2015, early 2021, and mid-2021.\\n- The highest peak is observed around early 2021, with fatalities exceeding 500.\\n- There are periods of relatively low fatalities, particularly between 2010 and 2014.\\n- Overall, the graph shows fluctuating fatality numbers with occasional significant spikes.', '\u2705 A human approved this data recipe; Source; Raw data']"}
{"test_scenario": "Image answer from recipe", "query": "Plot population pyramids for Nigeria", "chat_history": "[]", "context": "The answer is:\n\n Image cksum: 7940162caf0e79eba9caae30c2955a6e\nImage description: {'content': \"The image is a population pyramid for Nigeria (NGA). It is a bar chart that displays the distribution of various age groups in the population, divided by gender. The x-axis represents the population in millions, with males on the left side (in blue) and females on the right side (in pink). The y-axis represents the age range, divided into 5-year intervals from 0-4 up to 80+.\\n\\nKey features of the population pyramid:\\n- The base of the pyramid (0-4 age range) is the widest, indicating a high number of young children.\\n- As the age range increases, the width of the bars decreases, showing a tapering effect typical of a youthful population.\\n- The population decreases steadily with age, with the smallest population in the 80+ age range.\\n- The pyramid shows a relatively balanced distribution between males and females across most age groups.\\n\\nThis image is relevant to a user query related to demographic analysis, population studies, or understanding the age and gender distribution of Nigeria's population.\"}\n \n\n Metadata for the answer:\n {'params': {'adm0_code': 'NGA'}, 'attribution': 'https://data.humdata.org/dataset/a7c3de5e-ff27-4746-99cd-05f2ad9b1066', 'data_url': 'https://data.humdata.org/dataset/a7c3de5e-ff27-4746-99cd-05f2ad9b1066/resource/562e7757-0683-4d61-87bd-a7c94af2ee38/download/nga_admpop_adm2_2020.csv', 'time_period': {'start': '2020-01-01', 'end': '2020-12-31T23:59:59'}}", "output": "['*AN IMAGE WAS OUTPUT, HERE IS ITS LLM-GENERATED DESCRIPTION* ... The image is a population pyramid for Nigeria (NGA). It displays the population distribution by age and gender. The horizontal axis represents the population in millions, with males on the left (in blue) and females on the right (in pink). 
The vertical axis represents different age ranges, starting from 0-4 at the bottom to 80+ at the top.\\n\\nKey observations:\\n- The pyramid has a broad base, indicating a large number of young people (0-14 age range).\\n- The population decreases steadily with age, showing a typical pyramid shape.\\n- There are more males than females in the younger age groups, but the difference is not very pronounced.\\n- The population in older age groups (60 and above) is significantly smaller compared to the younger age groups.', '\u2705 A human approved this data recipe; Source; Raw data; 2020-01-01 to 2020-12-31']"}
{"test_scenario": "Assistant smalltalk, salutation", "query": "Hi", "chat_history": "[]", "context": "Hello! How can I assist you today?", "output": "['Hello! How can I assist you today?']"}
{"test_scenario": "Text answer from memory", "query": "What is the total population of Mali", "chat_history": "[]", "context": "The answer is:\n\n \n The answer is: **17,907,114.0**\n\n Metadata for the answer:\n {\"params\": {\"country_code\": \"MLI\"}, \"attribution\": \"https://data.humdata.org/dataset/ce21c7db-d8f0-40f8-adc2-452d2d2d105c\", \"data_url\": \"https://data.humdata.org/dataset/ce21c7db-d8f0-40f8-adc2-452d2d2d105c/resource/6f243ba2-4d4a-4663-a7c4-e917dbbde73a/download/mli_pop_adm0_v2.csv\", \"time_period\": {\"start\": \"2018-01-01\", \"end\": \"2018-12-31T23:59:59\"}}", "output": "['The answer is: 17,907,114.0', '\u2705 A human approved this data recipe; Source; Raw data; 2018-01-01 to 2018-12-31']"}
{"test_scenario": "Image answer from memory", "query": "plot a line chart of fatalities by month for Chad using HDX data as an image", "chat_history": "[]", "context": "The answer is:\n\n Image cksum: 6a410014fde98dc5bde69c24e6d64cc1\nImage description: {'content': 'The image is a line graph titled \"Fatalities by Month for Chad.\" It displays the total number of fatalities over time, with the x-axis representing the months from January 2008 to January 2024, and the y-axis representing the total number of fatalities, ranging from 0 to 500.\\n\\nKey observations from the graph:\\n- There are several peaks indicating months with high fatalities.\\n- Notable spikes occur around mid-2008, early 2009, mid-2015, early 2021, and mid-2021.\\n- The highest peak appears to be around early 2021, reaching close to 500 fatalities.\\n- There are periods of relatively low fatalities, particularly between 2010 and 2014.\\n\\nTo determine if this image is relevant to the user query, more context about the query is needed. If the query pertains to historical data on fatalities in Chad, trends in violence or conflict, or similar topics, then this image is highly relevant.'}\n \n\n Metadata for the answer:\n {\"params\": {\"country_code\": \"TCD\"}, \"attribution\": \"https://data.humdata.org/dataset/b009f9b0-aa65-49c5-b188-a33daade0f4a\", \"data_url\": \"https://data.humdata.org/dataset/b009f9b0-aa65-49c5-b188-a33daade0f4a/resource/bb78c035-ec19-4503-b325-0673749c2eb4/download/chad_hrp_political_violence_events_and_fatalities_by_month-year_as-of-29may2024.xlsx\"}", "output": "['*AN IMAGE WAS OUTPUT, HERE IS ITS LLM-GENERATED DESCRIPTION* ... The image is a line graph titled \"Fatalities by Month for Chad,\" depicting the total number of fatalities per month from January 2008 to January 2024. 
The x-axis represents the months, while the y-axis represents the total fatalities.\\n\\nKey observations:\\n- There are several peaks indicating months with high fatalities.\\n- Notable spikes occur around mid-2008, early 2009, mid-2015, early 2021, and mid-2021.\\n- The highest peak is observed around early 2021, with fatalities exceeding 500.\\n- There are periods of relatively low fatalities, particularly between 2010 and 2014.\\n- Overall, the graph shows fluctuations in fatalities over the years, with some months experiencing significant increases.', '\u2705 A human approved this data recipe; Source; Raw data']"}
{"test_scenario": "Image answer from recipe", "query": "Plot population pyramids for Nigeria", "chat_history": "[]", "context": "The answer is:\n\n Image cksum: 7940162caf0e79eba9caae30c2955a6e\nImage description: {'content': \"The image is a population pyramid for Nigeria (NGA). It is a bar chart that displays the distribution of various age groups in the population, divided by gender. The x-axis represents the population in millions, with males on the left side (in blue) and females on the right side (in pink). The y-axis represents the age range, divided into 5-year intervals from 0-4 up to 80+.\\n\\nKey features of the population pyramid:\\n- The base of the pyramid (0-4 age range) is the widest, indicating a high number of young children.\\n- As the age range increases, the width of the bars decreases, showing a tapering effect typical of a youthful population.\\n- The population decreases steadily with age, with the smallest population in the 80+ age range.\\n- The pyramid shows a relatively balanced distribution between males and females across most age groups.\\n\\nThis image is relevant to a user query related to demographic analysis, population studies, or understanding the age and gender distribution of Nigeria's population.\"}\n \n\n Metadata for the answer:\n {'params': {'adm0_code': 'NGA'}, 'attribution': 'https://data.humdata.org/dataset/a7c3de5e-ff27-4746-99cd-05f2ad9b1066', 'data_url': 'https://data.humdata.org/dataset/a7c3de5e-ff27-4746-99cd-05f2ad9b1066/resource/562e7757-0683-4d61-87bd-a7c94af2ee38/download/nga_admpop_adm2_2020.csv', 'time_period': {'start': '2020-01-01', 'end': '2020-12-31T23:59:59'}}", "output": "['*AN IMAGE WAS OUTPUT, HERE IS ITS LLM-GENERATED DESCRIPTION* ... The image is a population pyramid for Nigeria (NGA). It displays the population distribution by age and gender. The horizontal axis represents the population in millions, with males on the left (in blue) and females on the right (in pink). 
The vertical axis represents different age ranges, starting from 0-4 at the bottom to 80+ at the top.\\n\\nKey observations:\\n1. The pyramid has a broad base, indicating a large number of young people (0-4 age range).\\n2. The population decreases steadily with increasing age, forming a typical pyramid shape.\\n3. There are more males than females in the younger age groups, but the difference is not very pronounced.\\n4. The population of both genders decreases significantly in the older age groups (60+).', '\u2705 A human approved this data recipe; Source; Raw data; 2020-01-01 to 2020-12-31']"}
{"test_scenario": "Assistant on-the-fly SQL, text answer", "query": "How many rows does the population table have for Nigeria", "chat_history": "[]", "context": "There are **43,794** rows of data in the population table for Nigeria.", "output": "['The population table has 43,794 rows for Nigeria.']"}
{"test_scenario": "Assistant created image (simple)", "query": "Plot f{x}=10", "chat_history": "[]", "context": "Image cksum: 3f4dafc66e68dc03e3ef6d2f02a85bc7\nImage description: {'content': 'The image is a plot of the function \\\\( f(x) = 10 \\\\). Here are the details of the plot:\\n\\n- The title of the plot is \"Plot of f(x) = 10\".\\n- The x-axis ranges from -10 to 10.\\n- The y-axis ranges from 0 to 10.\\n- The function \\\\( f(x) = 10 \\\\) is represented by a horizontal orange line at \\\\( y = 10 \\\\).\\n- There is a legend in the plot that labels the orange line as \"f(x) = 10\".\\n- The x-axis is labeled \"x\" and the y-axis is labeled \"f(x)\".\\n- The plot has grid lines for better readability.\\n\\nThe plot is relevant if the user query is about visualizing or understanding the function \\\\( f(x) = 10 \\\\), which is a constant function.'}", "output": "['*AN IMAGE WAS OUTPUT, HERE IS ITS LLM-GENERATED DESCRIPTION* ... The image is a plot of the function \\\\( f(x) = 10 \\\\). The graph is a horizontal line at \\\\( y = 10 \\\\) that extends from \\\\( x = -10 \\\\) to \\\\( x = 10 \\\\). The x-axis ranges from -10 to 10, and the y-axis ranges from 0 to 10. The plot includes a legend indicating that the line represents \\\\( f(x) = 10 \\\\).', '\ud83e\udd16 Caution: LLM Analysis']"}
{"test_scenario": "Assistant created image (simple)", "query": "Plot f{x}=10", "chat_history": "[]", "context": "Image cksum: 3f4dafc66e68dc03e3ef6d2f02a85bc7\nImage description: {'content': 'The image is a plot of the function \\\\( f(x) = 10 \\\\). Here are the details of the plot:\\n\\n- The title of the plot is \"Plot of f(x) = 10\".\\n- The x-axis ranges from -10 to 10.\\n- The y-axis ranges from 0 to 10.\\n- The function \\\\( f(x) = 10 \\\\) is represented by a horizontal orange line at \\\\( y = 10 \\\\).\\n- There is a legend in the plot that labels the orange line as \"f(x) = 10\".\\n- The x-axis is labeled \"x\" and the y-axis is labeled \"f(x)\".\\n- The plot has grid lines for better readability.\\n\\nThe plot is relevant if the user query is about visualizing or understanding the function \\\\( f(x) = 10 \\\\), which is a constant function.'}", "output": "['*AN IMAGE WAS OUTPUT, HERE IS ITS LLM-GENERATED DESCRIPTION* ... The image is a plot of the function \\\\( f(x) = 10 \\\\). The graph shows a horizontal line at \\\\( f(x) = 10 \\\\) across the range of \\\\( x \\\\) values from -10 to 10. The x-axis is labeled \"x\" and the y-axis is labeled \"f(x)\". There is a legend in the plot indicating the line represents \\\\( f(x) = 10 \\\\). The plot title is \"Plot of f(x) = 10\".']"}
