Evaluating ChatGPT’s Ability to Answer Questions in Natural Science and Engineering through Empirical Testing

Evaluation of ChatGPT’s Answering Capabilities in Natural Science and Engineering Domains: A Study at Delft University of Technology

In our recent study, we examined the capabilities of ChatGPT in the natural science and engineering domains. Participants came from different faculties at Delft University of Technology and included assistant professors, associate professors, full professors, lecturers, Ph.D. students, postdoctoral researchers, and others.

Our evaluation assessed ChatGPT’s answering capabilities across skill categories and educational levels. The results, depicted in Figure 1, highlight several key findings. ChatGPT received higher scores for basic and scientific skills than for skills beyond scientific knowledge. Participants rated the question relatedness of the answers and the level of English highly. However, the model’s critical attitude scored lowest among the assessment criteria, which suggests that its answers need further verification.

Moreover, the assessment of scientific correctness revealed that ChatGPT can provide mostly correct answers for Bachelor level questions and partly correct answers for Master and Ph.D. level questions. Participants also mentioned various potential impacts of the answers generated by ChatGPT, ranging from environmental to safety concerns.

Further analysis of the study variables showed that skill category and educational level significantly influenced the assessment scores. Scientific skills were rated higher than skills beyond scientific knowledge, and answers for lower educational levels received better ratings. The participant's faculty, however, did not show a significant influence on the assessment rating.

The study also included free-text comments from participants, providing additional insights into the perceived quality of ChatGPT’s answers. Comments ranged from critiques about lack of detail to comparisons with student answers. Some participants raised concerns about the sources of training data used by ChatGPT and their implications for the generated answers. Emotional reactions were also observed, with a mix of neutral, positive, and negative sentiments expressed in the comments.

Overall, our study sheds light on the strengths and weaknesses of ChatGPT in answering questions related to natural science and engineering. While the model demonstrates competence in certain areas, further improvements are needed, especially in critical thinking and ensuring scientific correctness. As AI continues to shape the future of education and research, studies like ours provide valuable insights for enhancing the capabilities of AI-powered tools in academic settings.
