AI-Assisted Testing

Software generated with the support of artificial intelligence (AI) is now achieving a level of quality and precision that is fundamentally changing development processes. This creates opportunities to redefine traditional software development, from code and testing to documentation.

The AI Revolution Does Not Stop at Testing

Software testing has always been a central component of software development, as systematic testing is the only way to ensure that the developed software runs reliably and robustly. The emergence of modern AI technologies is opening up enormous potential, particularly in this area. Today, AI can not only detect anomalies but also analyze entire codebases and write code itself. This makes testing a natural fit for AI, because it is where large volumes of data, clear processes, and recurring patterns come together.

But despite the impressive possibilities, the use of AI is not an end in itself. Simply replacing traditional automation with AI does not automatically make sense and does not always deliver long-term added value. Rather, it is crucial to understand exactly which fields of application can benefit from the use of AI in testing. Where AI is used in a targeted manner, development cycles can be shortened considerably while the quality of the tested software increases at the same time.

But which fields of application in testing are best suited for the use of AI?

Traditional, Manual Test Processes

To understand where AI delivers real added value in software testing, it is worth first taking a look at how traditional testing processes work. Traditionally, test cases are derived manually from requirements. The human testers define scenarios, write step sequences, maintain test case libraries, and then evaluate the results.

The traditional approach comes with some essential challenges:

Manual effort: Manual work becomes a bottleneck, above all when the software is continuously changed and developed further.
Maintenance effort: Changes to the software quickly make existing test cases obsolete.
Human error: People overlook complex edge cases or draw the wrong conclusions from the results.

Fields of Application for AI in the Testing Process

Artificial intelligence opens up new possibilities here. It does not replace systematic test processes themselves, but improves the speed, depth, and precision of the process steps. The most important fields of application:

Test case generation

AI can use requirements, code, or historical data as a basis to automatically derive test cases. Existing test case libraries can be automatically adapted and test case templates can be modified to create a wide range of variants. Artificial intelligence can also be the key to identifying edge cases and minimizing unknown risks. Natural language processing (NLP) enables AI to generate testable scenarios directly from requirements documents.

In CI/CT pipelines, AI can also decide which tests are relevant for a given code change to accelerate feedback cycles.

Result analysis

Instead of just reporting pass/fail, AI can recognize patterns in error messages and logs. For example, failures can be grouped for quick root cause analysis. In addition, AI can make recommendations for test case changes based on the test results and implement them automatically if necessary.

Areas of code that are frequently changed can be identified to help avoid past errors. Last but not least, so-called "flaky tests", which sporadically deliver different results with identical code, can be identified.
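
Two of the result analysis ideas above, grouping failures and spotting flaky tests, can be illustrated with a small rule-based baseline. This is only a sketch and not an AI model: the input formats (pairs of test name and error message, and tuples of revision, test name, and outcome) as well as the function names are assumptions made for illustration. An AI-based analysis would replace the simple masking rules with learned similarity, but the shape of the output would be comparable.

```python
import re
from collections import defaultdict


def normalize(message: str) -> str:
    """Reduce an error message to a signature by masking volatile details."""
    # Mask hex values first so their digits are not caught by the number rule.
    signature = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)
    signature = re.sub(r"\d+(\.\d+)?", "<num>", signature)
    return signature.strip().lower()


def group_failures(failures):
    """Group (test_name, message) pairs by their normalized signature."""
    groups = defaultdict(list)
    for test_name, message in failures:
        groups[normalize(message)].append(test_name)
    return dict(groups)


def find_flaky(runs):
    """Return tests that both passed and failed on the same code revision.

    runs: iterable of (revision, test_name, passed) tuples.
    """
    outcomes = defaultdict(set)
    for revision, test_name, passed in runs:
        outcomes[(revision, test_name)].add(passed)
    # A test is flagged if one revision produced both True and False.
    return sorted({name for (_, name), seen in outcomes.items() if len(seen) == 2})
```

Failures that share a signature are likely to share a root cause, so a tester can start with the largest group. A test is only flagged as flaky if its outcome differs for an identical revision, which matches the definition given above.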

But how do I actually use AI in my testing tool chain and what do I need for that?

Prerequisites for the Efficient Use of AI

Many testing teams face a similar starting point. Over the years, proprietary test tools have accumulated, often with components developed in-house that are poorly documented or contain specialized domain-specific languages (DSL). These tool chains are usually manageable for humans, but often not for AI. LLMs need to understand syntax in order to interpret concepts and recognize patterns. The most obvious solution is often to try to train an LLM specifically on the proprietary syntax. However, this is exactly the wrong approach, because a dedicated model ultimately has to be operated, maintained, and continuously retrained. In the long term, this approach leads to a technical dead end with great dependency on the proprietary solution, while the full potential of AI is not exploited.

Figure: If AI is a native component of the development environment, your own prompts to the AI (right-hand side) are converted directly into code (left-hand side). Changes can also be specified and recognized directly.

How can AI support the testing process?

The future-proof approach is not to adapt AI to the tool chain, but to adapt the tool chain to AI. Established AI assistants such as ChatGPT, Claude, or Copilot are already in use at many companies. These AI systems are powerful, scalable, and continuously updated. Adapting or converting the tool chain can initially mean greater effort. However, this is necessary in order to create a good basis for the long-term use of different AIs. To enable these AIs to operate the testing tool chain, several approaches lead to success:

Generic test language: Test cases should be written in widely used, standardized languages. Python has long proven to be very suitable for this purpose. Not only because Python, as a scripting language, is good at mapping test sequences, but also because it is one of the most widely used languages and is therefore handled well by LLMs.
Documentation: AI needs information in order to work contextually and correctly. Requirements, interface descriptions, and tool documentation must be accessible to the AI. NLP-based AI in particular benefits greatly from being able to analyze rich technical artifacts. Documentation should be provided in a format that is optimized for processing by AI.
Open source: Open-source tools generally use common standards, are openly accessible, and therefore offer AI significantly more points of contact than proprietary black box software. This approach has already proven successful with open-source software (OSS) such as pytest or Robot Framework.
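
To make the idea of a generic test language concrete, here is a minimal sketch of a test written in plain Python in the style that pytest discovers (functions prefixed with `test_` that use bare `assert` statements). The function under test, `clamp_torque`, the 250 Nm limit, and the `LimitCase` template are hypothetical and only serve as an illustration. The sketch also shows how one test case template can be expanded into several boundary-value variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitCase:
    """One variant of the 'value stays within limits' test template."""
    request: float
    expected: float


def clamp_torque(request_nm: float, limit_nm: float = 250.0) -> float:
    """Hypothetical function under test: limit a torque request symmetrically."""
    return max(-limit_nm, min(limit_nm, request_nm))


def boundary_cases(limit: float, step: float = 1.0):
    """Expand the template into variants just below, at, and above each limit."""
    return [
        LimitCase(-limit - step, -limit),        # below lower limit: clamped
        LimitCase(-limit, -limit),               # exactly at lower limit
        LimitCase(-limit + step, -limit + step), # inside the range: unchanged
        LimitCase(limit - step, limit - step),   # inside the range: unchanged
        LimitCase(limit, limit),                 # exactly at upper limit
        LimitCase(limit + step, limit),          # above upper limit: clamped
    ]


def test_torque_is_clamped_at_boundaries():
    """Plain-assert test that a pytest-style runner can collect."""
    for case in boundary_cases(limit=250.0):
        assert clamp_torque(case.request) == case.expected, case
```

Because nothing here depends on a proprietary syntax, an LLM can read, extend, and regenerate such a test without special training. In a pytest project, the loop would typically be replaced by the `pytest.mark.parametrize` decorator so that each variant is reported as a separate test.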

All of these approaches are combined in the dSPACE Test Automation SDK, which additionally provides API-based interfaces to different test benches. This makes it possible to execute test cases written in generic OSS frameworks (pytest, Robot Framework, etc.) on proprietary test benches. The Test Automation SDK is the key to enabling LLMs to execute test cases on common simulation hardware and software (VEOS, SCALEXIO, etc.) and interpret the results.
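
The general pattern of running generic test code against different test benches can be sketched as a thin adapter interface. The following example is purely illustrative and does not show the API of the Test Automation SDK: `BenchAdapter`, `FakeBench`, the signal names, and the toy brake light logic are all assumptions.

```python
from typing import Protocol


class BenchAdapter(Protocol):
    """Hypothetical minimal interface that a test bench adapter would offer."""

    def write(self, signal: str, value: float) -> None: ...

    def read(self, signal: str) -> float: ...


class FakeBench:
    """In-memory stand-in so the test logic can run without any hardware."""

    def __init__(self) -> None:
        self._signals: dict[str, float] = {}

    def write(self, signal: str, value: float) -> None:
        self._signals[signal] = value
        # Toy plant model: the brake light follows the pedal position.
        if signal == "brake_pedal_pct":
            self._signals["brake_light_on"] = 1.0 if value > 5.0 else 0.0

    def read(self, signal: str) -> float:
        return self._signals.get(signal, 0.0)


def check_brake_light(bench: BenchAdapter) -> bool:
    """Test logic that only depends on the adapter interface, not on a bench."""
    bench.write("brake_pedal_pct", 40.0)
    return bench.read("brake_light_on") == 1.0
```

The design choice is that the test logic knows only the interface. Swapping the in-memory stand-in for an adapter to a real simulator or hardware-in-the-loop system would then not require changes to the test itself.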

Is that all or what potential does AI still have in testing in the future?

Use of Agentic AI

Current developments in testing clearly show that AI can speed up traditional test processes, improve quality at the same time, and automate repetitive tasks. The keys to a future-proof testing tool chain are generic test languages, good documentation, and open frameworks. This is the only way that AI can really develop its strengths in test case generation and result analysis without running into proprietary dead ends.

But we are only at the beginning. The next evolutionary step is agentic AI systems, i.e., AI agents that not only generate test cases, but can also act autonomously, execute tools, and orchestrate complex processes. The Model Context Protocol (MCP) is a standardized approach that enables AI agents to interact with test frameworks and existing tools in a structured manner. This makes a future tangible in which AI not only suggests test cases, but also performs complete test tasks autonomously: analyzing requirements, generating test cases, changing code, initiating pipelines, evaluating results, and suggesting improvements. Open tools with MCP are therefore the key building block for shift-left in software development with AI agents. Only if agentic AI has access to robust, deterministic test tools during autonomous software development can it be ensured that the resulting software is not only developed quickly, but above all is also of high quality.
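
How an agent might trigger a test tool in a structured way can be sketched as follows. MCP is based on JSON-RPC 2.0, and the sketch mimics the shape of a `tools/call` request and its result as far as the protocol is commonly described. It is a simplified illustration and not a conformant MCP server. The tool `run_tests`, its `suite` argument, and the canned result are hypothetical. A real integration would build on an official MCP SDK.

```python
import json


def run_tests(suite: str) -> dict:
    """Hypothetical tool: a real one would invoke the test runner."""
    # Canned result so the sketch stays deterministic and self-contained.
    return {"suite": suite, "passed": 12, "failed": 1}


# Registry of tools that the agent is allowed to call.
TOOLS = {"run_tests": run_tests}


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC request of type 'tools/call' to a registered tool."""
    request = json.loads(raw)
    response = {"jsonrpc": "2.0", "id": request.get("id")}

    if request.get("method") != "tools/call":
        response["error"] = {"code": -32601, "message": "Method not found"}
        return json.dumps(response)

    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        response["error"] = {"code": -32602, "message": "Unknown tool"}
        return json.dumps(response)

    result = tool(**params.get("arguments", {}))
    # The tool output is returned as text content, as in MCP tool results.
    response["result"] = {
        "content": [{"type": "text", "text": json.dumps(result)}],
        "isError": False,
    }
    return json.dumps(response)
```

The agent receives a machine-readable result instead of free-form console output, which is what allows it to decide on the next step, for example analyzing the one failed test. The deterministic test tool stays in charge of execution, while the agent only orchestrates.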