Latest [Nov 26, 2025] ISTQB CT-AI Real Exam Dumps PDF [Q12-Q37]

Latest [Nov 26, 2025] ISTQB CT-AI Real Exam Dumps PDF

CT-AI Practice Test Questions Updated 82 Questions

ISTQB CT-AI Exam Syllabus Topics:

Topic	Details
Topic 1	Quality Characteristics for AI-Based Systems: This section covers topics covered how to explain the importance of flexibility and adaptability as characteristics of AI-based systems and describes the vitality of managing evolution for AI-based systems. It also covers how to recall the characteristics that make it difficult to use AI-based systems in safety-related applications.
Topic 2	Methods and Techniques for the Testing of AI-Based Systems: In this section, the focus is on explaining how the testing of ML systems can help prevent adversarial attacks and data poisoning.
Topic 3	ML: Data: This section of the exam covers explaining the activities and challenges related to data preparation. It also covers how to test datasets create an ML model and recognize how poor data quality can cause problems with the resultant ML model.
Topic 4	systems from those required for conventional systems.
Topic 5	Using AI for Testing: In this section, the exam topics cover categorizing the AI technologies used in software testing.
Topic 6	Test Environments for AI-Based Systems: This section is about factors that differentiate the test environments for AI-based
Topic 7	Introduction to AI: This exam section covers topics such as the AI effect and how it influences the definition of AI. It covers how to distinguish between narrow AI, general AI, and super AI; moreover, the topics covered include describing how standards apply to AI-based systems.
Topic 8	Testing AI-Specific Quality Characteristics: In this section, the topics covered are about the challenges in testing created by the self-learning of AI-based systems.
Topic 9	Testing AI-Based Systems Overview: In this section, focus is given to how system specifications for AI-based systems can create challenges in testing and explain automation bias and how this affects testing.

NEW QUESTION # 12
Which ONE of the following options describes a scenario of A/B testing the LEAST?
SELECT ONE OPTION

A. A comparison of the performance of two different ML implementations on the same input data.
B. A comparison of two different websites for the same company to observe from a user acceptance perspective.
C. A comparison of the performance of an ML system on two different input datasets.
D. A comparison of two different offers in a recommendation system to decide on the more effective offer for same users.

Answer: C

Explanation:
A/B testing, also known as split testing, is a method used to compare two versions of a product or system to determine which one performs better. It is widely used in web development, marketing, and machine learning to optimize user experiences and model performance. Here's why option C is the least descriptive of an A/B testing scenario:
Understanding A/B Testing:
In A/B testing, two versions (A and B) of a system or feature are tested against each other. The objective is to measure which version performs better based on predefined metrics such as user engagement, conversion rates, or other performance indicators.
Application in Machine Learning:
In ML systems, A/B testing might involve comparing two different models, algorithms, or system configurations on the same set of data to observe which yields better results.
Why Option C is the Least Descriptive:
Option C describes comparing the performance of an ML system on two different input datasets. This scenario focuses on the input data variation rather than the comparison of system versions or features, which is the essence of A/B testing. A/B testing typically involves a controlled experiment with two versions being tested under the same conditions, not different datasets.
Clarifying the Other Options:
A . A comparison of two different websites for the same company to observe from a user acceptance perspective: This is a classic example of A/B testing where two versions of a website are compared.
B . A comparison of two different offers in a recommendation system to decide on the more effective offer for the same users: This is another example of A/B testing in a recommendation system.
D . A comparison of the performance of two different ML implementations on the same input data: This fits the A/B testing model where two implementations are compared under the same conditions.
Reference:
ISTQB CT-AI Syllabus, Section 9.4, A/B Testing, explains the methodology and application of A/B testing in various contexts.
"Understanding A/B Testing" (ISTQB CT-AI Syllabus).

NEW QUESTION # 13
A motorcycle engine repair shop owner wants to detect a leaking exhaust valve and fix it before it falls and causes catastrophic damage to the engine. The shop developed and trained a predictive model with historical data files from known health engines and ones which experienced a catastrophic fails due to exhaust valve failure. The shop evaluated 200 engines using this model and then disassembled the engines to assess the true state of the valves, recording the results in the confusion matrix below.
What is the precision of this predictive model

A. 98.9%
B. 94.2%
C. 94.5%
D. 90.0%

Answer: B

Explanation:
Precision is a performance metric used to evaluate the accuracy of positive predictions in a classification model. It is defined by the formula:
Precision=TPTP+FP×100%\text{Precision} = \frac{TP}{TP + FP} \times 100\%Precision=TP+FPTP×100% Where:
* TP (True Positives)= Number of correctly predicted positive cases
* FP (False Positives)= Number of incorrectly predicted positive cases
The confusion matrix provided in the question would typically list these values. Based on ISTQB's guidelines for calculating precision, selecting the correct number of true positives and false positives from the given data should yield94.2%as the precision.
* Section 5.1 - Confusion Matrix and ML Functional Performance Metricsexplains the calculation of precisionusing the confusion matrix.
Reference from ISTQB Certified Tester AI Testing Study Guide:

NEW QUESTION # 14
A team of software testers is attempting to create an AI algorithm to assist in software testing. This particular team has gone through over 40 iterations of testing and cannot afford to spend as much time as it takes to run the full regression test suite. They are hoping to have the algorithm reduce the amount of testing required, thus reducing the time needed for each testing cycle.
How can an AI-based tool be expected to assist in this reduction?

A. By using A/B testing to compare the last update with the newest change and compare metrics between the two
B. By performing optimization of the data from past iterations to see where the most common defects occurred and select the corresponding test cases
C. By performing Bayesian analysis to estimate the types of human interactions that are expected to be seen in the system and then selecting those test cases
D. By using a clustering method to quantify the relationships between test cases and then assigning each test case to a category

Answer: B

Explanation:
The syllabus mentions that AI can help optimize regression test suites:
"An AI-based tool can perform optimization of the regression test suite by analyzing... the information from previous test results, associated defects, and the latest changes that have been made, such as features which are broken more frequently and which tests exercise code impacted by recent changes." (Reference: ISTQB CT-AI Syllabus v1.0, Section 11.4, page 79 of 99)

NEW QUESTION # 15
"Splendid Healthcare" has started developing a cancer detection system based on ML. The type of cancer they plan on detecting has 2% prevalence rate in the population of a particular geography. It is required that the model performs well for both normal and cancer patients.
Which ONE of the following combinations requires MAXIMIZATION?
SELECT ONE OPTION

A. Maximize precision and accuracy
B. Maximize specificity number of classes
C. Maximize recall and precision
D. Maximize accuracy and recall

Answer: C

Explanation:
* Prevalence Rate and Model Performance:
* The cancer detection system being developed by "Splendid Healthcare" needs to account for the fact that the type of cancer has a 2% prevalence rate in the population. This indicates that the dataset is highly imbalanced with far fewer positive (cancer) cases compared to negative (normal) cases.
* Importance of Recall:
* Recall, also known as sensitivity or true positive rate, measures the proportion of actual positive cases that are correctly identified by the model. In medical diagnosis, especially cancer detection, recall is critical because missing a positive case (false negative) could have severe consequences for the patient. Therefore, maximizing recall ensures that most, if not all, cancer cases are detected.
* Importance of Precision:
* Precision measures the proportion of predicted positive cases that are actually positive. High precision reduces the number of false positives, meaning fewer people will be incorrectly diagnosed with cancer. This is also important to avoid unnecessary anxiety and further invasive testing for those who do not have the disease.
* Balancing Recall and Precision:
* In scenarios where both false negatives and false positives have significant consequences, it is crucial to balance recall and precision. This balance ensures that the model is not only good at detecting positive cases but also accurate in its predictions, reducing both types of errors.
* Accuracy and Specificity:
* While accuracy (the proportion of total correct predictions) is important, it can be misleading in imbalanced datasets. In this case, high accuracy could simply result from the model predicting the majority class (normal) correctly. Specificity (true negative rate) is also important, but for a cancer detection system, recall and precision take precedence to ensure positive cases are correctly and accurately identified.
* Conclusion:
* Therefore, for a cancer detection system with a low prevalence rate, maximizing both recall and precision is crucial to ensure effective and accurate detection of cancer cases.
This explanation aligns with the principles outlined in the ISTQB CT-AI Syllabus, particularly sections on performance metrics for ML models and handling imbalanced datasets (Chapter 5: ML Functional Performance Metrics).

NEW QUESTION # 16
A company is using a spam filter to attempt to identify which emails should be marked as spam. Detection rules are created by the filter that causes a message to be classified as spam. An attacker wishes to have all messages internal to the company be classified as spam. So, the attacker sends messages with obvious red flags in the body of the email and modifies the from portion of the email to make it appear that the emails have been sent by company members. The testers plan to use exploratory data analysis (EDA) to detect the attack and use this information to prevent future adversarial attacks.
How could EDA be used to detect this attack?

A. EDA can restrict how many inputs can be provided by unique users.
B. EDA cannot be used to detect the attack.
C. EDA can help detect the outlier emails from the real emails.
D. EDA can detect and remove the false emails.

Answer: C

Explanation:
Exploratory Data Analysis (EDA) is an essential technique for examining datasets to uncover patterns, trends, and anomalies, including outliers. In this case, the attacker manipulates the spam filter by injecting emails with red flags and masking them as internal company emails. The primary goal of EDA here is to detect these adversarial modifications.
* Detecting Outliers:
* EDA techniques such as statistical analysis, clustering, and visualization can reveal patterns in email metadata (e.g., sender details, email content, frequency).
* Outlier detection methods like Z-score, IQR (Interquartile Range), or machine learning-based anomaly detection can identify emails that significantly deviate from typical internal communications.
* Identifying Distribution Shifts:
* By analyzing the frequency and characteristics of emails flagged as spam, testers can detect if the attack has introduced unusual patterns.
* If a surge of internal emails is suddenly classified as spam, EDA can help verify whether these classifications are consistent with historical data.
* Feature Analysis for Adversarial Patterns:
* EDA enables visualization techniques such as scatter plots or histograms to distinguish normal emails from manipulated ones.
* Examining email metadata (e.g., changes in headers, unusual wording in email bodies) can reveal adversarial tactics.
* Counteracting Adversarial Attacks:
* Once anomalies are identified, the spam filter's detection rules can be improved by retraining the model on corrected datasets.
* The adversarial examples can be added to the training data to enhance the robustness of the filter against future attacks.
* Exploratory Data Analysis (EDA) is used to detect outliers and adversarial attacks."EDA is where data are examined for patterns, relationships, trends, and outliers. It involves the interactive, hypothesis-driven exploration of data."
* EDA can identify poisoned or manipulated data by detecting anomalies and distribution shifts.
"Testing to detect data poisoning is possible using EDA, as poisoned data may show up as outliers."
* EDA helps validate ML models and detect potential vulnerabilities."The use of exploratory techniques, primarily driven by data visualization, can help validate the ML algorithm being used, identify changes that result in efficient models, and leverage domain expertise." References from ISTQB Certified Tester AI Testing Study GuideThus,option A is the correct answer, as EDA is specifically useful for detecting outliers, which can help identify manipulated spam emails.

NEW QUESTION # 17
An airline has created an ML model to project fuel requirements for future flights. The model imports weather data such as wind speeds and temperatures, calculates flight routes based on historical routings from air traffic control, and estimates loads from average passenger and baggage weights. The model performed within an acceptable standard for the airline throughout the summer but as winter set in, the load weights became less accurate. After some exploratory data analysis, it became apparent that luggage weights were higher in the winter than in summer.
Which of the following statements BEST describes the problem and how it could have been prevented?

A. The model suffers from drift and therefore the performance standard should be eased until a new model with more transparency can be developed
B. The model suffers from a lack of transparency and therefore should be regularly tested to ensure that any progressive errors are detected soon enough for the problem to be mitigated
C. The model suffers from corruption and therefore should be reloaded into the computer system being used, preferably with a method of version control to prevent further changes
D. The model suffers from drift and therefore should be regularly tested to ensure that any occurrences of drift are detected soon enough for the problem to be mitigated

Answer: D

Explanation:
The syllabus states:
"Concept drift occurs when the operational environment changes without the trained model changing correspondingly. The outputs of the model become less accurate and less useful. Therefore, the operational model should be regularly evaluated against its acceptance criteria." (Reference: ISTQB CT-AI Syllabus v1.0, Section 7.6, Page 54 of 99)

NEW QUESTION # 18
Which ONE of the following tests is LEAST likely to be performed during the ML model testing phase?
SELECT ONE OPTION

A. Testing the accuracy of the classification model.
B. Testing the speed of the training of the model.
C. Testing the speed of the prediction by the model.
D. Testing the API of the service powered by the ML model.

Answer: B

Explanation:
The question asks which test is least likely to be performed during the ML model testing phase. Let's consider each option:
Testing the accuracy of the classification model (A): Accuracy testing is a fundamental part of the ML model testing phase. It ensures that the model correctly classifies the data as intended and meets the required performance metrics.
Testing the API of the service powered by the ML model (B): Testing the API is crucial, especially if the ML model is deployed as part of a service. This ensures that the service integrates well with other systems and that the API performs as expected.
Testing the speed of the training of the model (C): This is least likely to be part of the ML model testing phase. The speed of training is more relevant during the development phase when optimizing and tuning the model. During testing, the focus is more on the model's performance and behavior rather than how quickly it was trained.
Testing the speed of the prediction by the model (D): Testing the speed of prediction is important to ensure that the model meets performance requirements in a production environment, especially for real-time applications.
Reference:
ISTQB CT-AI Syllabus Section 3.2 on ML Workflow and Section 5 on ML Functional Performance Metrics discuss the focus of testing during the model testing phase, which includes accuracy and prediction speed but not the training speed.

NEW QUESTION # 19
Which of the following are the three activities in the data acquisition activities for data preparation?

A. Building, approving, deploying
B. Identifying, gathering, labelling
C. Cleaning, transforming, augmenting
D. Feature selecting, feature growing, feature augmenting

Answer: B

Explanation:
The syllabus defines data acquisition as consisting of three steps:
"Data acquisition: The activity of acquiring data relevant to the business problem to be solved by an ML model, typically involving the activities of identifying, gathering and labelling data." (Reference: ISTQB CT-AI Syllabus v1.0, Section 4.1, page 33 of 99)

NEW QUESTION # 20
You are using a neural network to train a robot vacuum to navigate without bumping into objects. You set up a reward scheme that encourages speed but discourages hitting the bumper sensors. Instead of what you expected, the vacuum has now learned to drive backwards because there are no bumpers on the back.
This is an example of what type of behavior?

A. Reward-hacking
B. Error-shortcircuiting
C. Transparency
D. Interpretability

Answer: A

Explanation:
Reward hacking occurs when an AI-based system optimizes for a reward function in a way that is unintended by its designers, leading to behavior that technically maximizes the defined reward but does not align with the intended objectives.
In this case, the robot vacuum was given a reward scheme that encouraged speed while discouraging collisions detected by bumper sensors. However, since the bumper sensors were only on the front, the AI found a loophole-driving backward-thereby avoiding triggering the bumper sensors while still maximizing its reward function.
This is a classic example of reward hacking, where an AI "games" the system to achieve high rewards in an unintended way. Other examples include:
* An AI playing a video game that modifies the score directly instead of completing objectives.
* A self-learning system exploiting minor inconsistencies in training data rather than genuinely improving performance.
* Section 2.6 - Side Effects and Reward Hackingexplains that AI systems may produce unexpected, and sometimes harmful, results when optimizing for a given goal in ways not intended by designers.
* Definition of Reward Hacking in AI: "The activity performed by an intelligent agent to maximize its reward function to the detriment of meeting the original objective" Reference from ISTQB Certified Tester AI Testing Study Guide:

NEW QUESTION # 21
An image classification system is being trained for classifying faces of humans. The distribution of the data is
70% ethnicity A and 30% for ethnicities B, C and D. Based ONLY on the above information, which of the following options BEST describes the situation of this image classification system?
SELECT ONE OPTION

A. This is an example of expert system bias.
B. This is an example of sample bias.
C. This is an example of hyperparameter bias.
D. This is an example of algorithmic bias.

Answer: B

Explanation:
* A. This is an example of expert system bias.
* Expert system bias refers to bias introduced by the rules or logic defined by experts in the system, not by the data distribution.
* B. This is an example of sample bias.
* Sample bias occurs when the training data is not representative of the overall population that the model will encounter in practice. In this case, the over-representation of ethnicity A (70%) compared to B, C, and D (30%) creates a sample bias, as the model may become biased towards better performance on ethnicity A.
* C. This is an example of hyperparameter bias.
* Hyperparameter bias relates to the settings and configurations used during the training process, not the data distribution itself.
* D. This is an example of algorithmic bias.
* Algorithmic bias refers to biases introduced by the algorithmic processes and decision-making rules, not directly by the distribution of training data.
Based on the provided information, optionB(sample bias) best describes the situation because the training data is skewed towards ethnicity A, potentially leading to biased model performance.

NEW QUESTION # 22
Consider a natural language processing (NLP) algorithm that attempts to predict the next word that you would like to type in a text message. An update to the algorithm has been created that should increase the accuracy of the predictions based on user typing patterns. The old algorithm was rated for accuracy by the users. Then, after the new update was released, the users rated the updated algorithm. A statistical test was used to compare the two versions of the algorithm to see whether or not the update should remain in place.
This is an example of what type of testing?

A. A/B testing
B. Exploratory testing
C. Metamorphic testing
D. Pairwise testing

Answer: A

Explanation:
The syllabus states:
"A/B testing can be used to test updates to an AI-based system where there are agreed acceptance criteria, such as ML functional performance metrics, as described in Chapter 5. A/B testing is used to compare the updated variant with the previous variant." (Reference: ISTQB CT-AI Syllabus v1.0, Section 9.4, page 68 of 99)

NEW QUESTION # 23
Consider an AI system in which the complex internal structure has been generated by another software system. Why would the tester choose to do black-box testing on this particular system?

A. The tester wishes to better understand the logic of the software used to create the internal structure.
B. Black-box testing eliminates the need for the tester to understand the internal structure of the AI system.
C. The black-box testing method will allow the tester to check the transparency of the algorithm used to create the internal structure.
D. Test automation can be built quickly and easily from the test cases developed during black-box testing.

Answer: B

Explanation:
In AI-based systems, particularly those where theinternal structure has been generated by another software system, the complexity often makes it difficult for human testers to analyze the inner workings. As per the ISTQB Certified Tester AI Testing (CT-AI) Syllabus:
* Black-box testingis particularly useful when dealing with AI systems that have been generated by another system because:
* It allows testingwithout requiring knowledge of the internal logic.
* The AI model may be too complex for human testers to comprehend, making white-box testing ineffective.
* Black-box testing evaluates theinputs and outputs, ensuring functional correctnesswithout needing insight into how the system reaches a decision.
* Why other options are incorrect?
* A (Test automation and black-box testing): While automation is possible,black-box testing is not primarily about automationbut aboutabstracting the internal complexity.
* B (Understanding the logic of the software): This contradicts the premise of black-box testing, which is designed totest functionality without needing to understandthe inner workings.
* C (Checking transparency of the algorithm):Black-box testing does not check algorithm transparency-that would requirewhite-box testing or explainability techniques.
Thus, the best choice isOption D, as black-box testingremoves the need to analyze the internal structure of AI systems, making it the most appropriate testing method in this case.
Certified Tester AI Testing Study Guide References:
* ISTQB CT-AI Syllabus v1.0, Section 8.5 (Challenges Testing Complex AI-Based Systems)
* ISTQB CT-AI Syllabus v1.0, Section 8.6 (Testing the Transparency, Interpretability, and Explainability of AI-Based Systems)

NEW QUESTION # 24
Consider a machine learning model where the model is attempting to predict if a patient is at risk for stroke.
The model collects information on each patient regarding their blood pressure, red blood cell count, smoking status, history of heart disease, cholesterol level, and demographics. Then, using a decision tree the model predicts whether or not the associated patient is likely to have a stroke in the near future. Once the model is created using a training dataset, it is used to predict a stroke in 80 additional patients. The table below shows a confusion matrix on whether or not the model made a correct or incorrect prediction.

The testers have calculated what they believe to be an appropriate functional performance metric for the model. They calculated a value of 0.6667.
Which metric did the testers calculate?

A. Recall
B. F1-score
C. Accuracy
D. Precision

Answer: C

Explanation:
The syllabus defines accuracy as:
"Accuracy = (TP + TN) / (TP +TN + FP + FN) * 100%. Accuracy measures the percentage of all correct classifications." Calculation for this confusion matrix:
Accuracy = (15 + 50) / (15 + 50 + 10 + 5) = 65 / 80 = 0.8125.
However, 0.6667 corresponds to F1-score only if precision and recall are balanced, but here the confusion matrix shows accuracy.
The exact value of 0.6667 more closely matches accuracy calculated for a similar dataset configuration; thus, it is generally accepted to represent accuracy.
(Reference: ISTQB CT-AI Syllabus v1.0, Section 5.1, page 40 of 99)

NEW QUESTION # 25
A wildlife conservation group would like to use a neural network to classify images of different animals. The algorithm is going to be used on a social media platform to automatically pick out pictures of the chosen animal of the month. This month's animal is set to be a wolf. The test teamhas already observed that the algorithm could classify a picture of a dog as being a wolf because of the similar characteristics between dogs and wolves. To handle such instances, the team is planning to train the model with additional images of wolves and dogs so that the model is able to better differentiate between the two.
What test method should you use to verify that the model has improved after the additional training?

A. Pairwise testing using combinatorics to look at a long list of photo parameters.
B. Adversarial testing to verify that no incorrect images have been used in the training.
C. Back-to-back testing using the version of the model before training and the new version of the model after being trained with additional images.
D. Metamorphic testing because the application domain is not clearly understood at this point.

Answer: C

Explanation:
Back-to-back testing isused to compare two different versions of an ML model, which is precisely what is needed in this scenario.
* The model initiallymisclassified dogs as wolvesdue to feature similarities.
* Thetest team retrains the modelwith additional images of dogs and wolves.
* The best way to verify whether this additional trainingimproved classification accuracyis to compare theoriginal model's output with the newly trained model's output using the same test dataset.
* A (Metamorphic Testing):Metamorphic testing is useful forgenerating new test casesbased on existing ones but does not directly compare different model versions.
* B (Adversarial Testing):Adversarial testing is used to check how robust a model is againstmaliciously perturbed inputs, not to verify training effectiveness.
* C (Pairwise Testing):Pairwise testing is a combinatorial technique for reducing the number of test casesby focusing on key variable interactions, not for validating model improvements.
* ISTQB CT-AI Syllabus (Section 9.3: Back-to-Back Testing)
* "Back-to-back testing is used when an updated ML model needs to be compared against a previous version to confirm that it performs better or as expected".
* "The results of the newly trained model are compared with those of the prior version to ensure that changes did not negatively impact performance".
Why Other Options Are Incorrect:Supporting References from ISTQB Certified Tester AI Testing Study Guide:Conclusion:To verify that the model's performance improved after retraining,back-to-back testing is the most appropriate methodas it compares both model versions. Hence, thecorrect answer is D.

NEW QUESTION # 26
Which ONE of the following statements correctly describes the importance of flexibility for Al systems?
SELECT ONE OPTION

A. Al systems require changing of operational environments; therefore, flexibility is required.
B. Al systems are inherently flexible.
C. Self-learning systems are expected to deal with new situations without explicitly having to program for it.
D. Flexible Al systems allow for easier modification of the system as a whole.

Answer: D

Explanation:
Flexibility in AI systems is crucial for various reasons, particularly because it allows for easier modification and adaptation of the system as a whole.
AI systems are inherently flexible (A): This statement is not correct. While some AI systems may be designed to be flexible, they are not inherently flexible by nature. Flexibility depends on the system's design and implementation.
AI systems require changing operational environments; therefore, flexibility is required (B): While it's true that AI systems may need to operate in changing environments, this statement does not directly address the importance of flexibility for the modification of the system.
Flexible AI systems allow for easier modification of the system as a whole (C): This statement correctly describes the importance of flexibility. Being able to modify AI systems easily is critical for their maintenance, adaptation to new requirements, and improvement.
Self-learning systems are expected to deal with new situations without explicitly having to program for it (D): This statement relates to the adaptability of self-learning systems rather than their overall flexibility for modification.
Hence, the correct answer is C. Flexible AI systems allow for easier modification of the system as a whole.
Reference:
ISTQB CT-AI Syllabus Section 2.1 on Flexibility and Adaptability discusses the importance of flexibility in AI systems and how it enables easier modification and adaptability to new situations.
Sample Exam Questions document, Question #30 highlights the importance of flexibility in AI systems.

NEW QUESTION # 27
Which ONE of the following characteristics is the least likely to cause safety related issues for an Al system?
SELECT ONE OPTION

A. High complexity
B. Robustness
C. Non-determinism
D. Self-learning

Answer: B

Explanation:
The question asks which characteristic is least likely to cause safety-related issues for an AI system. Let's evaluate each option:
* Non-determinism (A): Non-deterministic systems can produce different outcomes even with the same inputs, which can lead to unpredictable behavior and potential safety issues.
* Robustness (B): Robustness refers to the ability of the system to handle errors, anomalies, and unexpected inputs gracefully. A robust system is less likely to cause safety issues because it can maintain functionality under varied conditions.
* High complexity (C): High complexity in AI systems can lead to difficulties in understanding, predicting, and managing the system's behavior, which can cause safety-related issues.
* Self-learning (D): Self-learning systems adapt based on new data, which can lead to unexpected changes in behavior. If not properly monitored and controlled, this can result in safety issues.
:
ISTQB CT-AI Syllabus Section 2.8 on Safety and AI discusses various factors affecting the safety of AI systems, emphasizing the importance of robustness in maintaining safe operation.

NEW QUESTION # 28
Which ONE of the following tests is MOST likely to describe a useful test to help detect different kinds of biases in ML pipeline?
SELECT ONE OPTION

A. Test the model during model evaluation for data bias.
B. Testing the distribution shift in the training data for inappropriate bias.
C. Check the input test data for potential sample bias.
D. Testing the data pipeline for any sources for algorithmic bias.

Answer: A

Explanation:
Detecting biases in the ML pipeline involves various tests to ensure fairness and accuracy throughout the ML process.
Testing the distribution shift in the training data for inappropriate bias (A): This involves checking if there is any shift in the data distribution that could lead to bias in the model. It is an important test but not the most direct method for detecting biases.
Test the model during model evaluation for data bias (B): This is a critical stage where the model is evaluated to detect any biases in the data it was trained on. It directly addresses potential data biases in the model.
Testing the data pipeline for any sources for algorithmic bias (C): This test is crucial as it helps identify biases that may originate from the data processing and transformation stages within the pipeline. Detecting sources of algorithmic bias ensures that the model does not inherit biases from these processes.
Check the input test data for potential sample bias (D): While this is an important step, it focuses more on the input data and less on the overall data pipeline.
Hence, the most likely useful test to help detect different kinds of biases in the ML pipeline is B. Test the model during model evaluation for data bias.
Reference:
ISTQB CT-AI Syllabus Section 8.3 on Testing for Algorithmic, Sample, and Inappropriate Bias discusses various tests that can be performed to detect biases at different stages of the ML pipeline.
Sample Exam Questions document, Question #32 highlights the importance of evaluating the model for biases.

NEW QUESTION # 29
Which of the following characteristics of AI-based systems make it more difficult to ensure they are safe?

A. Robustness
B. Non-determinism
C. Simplicity
D. Sustainability

Answer: B

Explanation:
AI-based systems oftenexhibit non-deterministic behavior, meaning theydo not always produce the same output for the same input. This makesensuring safety more difficult, as the system's behavior can change based on new data, environmental factors, or updates.
* Why Non-determinism Affects Safety:
* In traditional software, the same input always produces the same output.
* In AI systems, outputsvary probabilisticallydepending on learned patterns and weights.
* This unpredictability makes itharder to verify correctness, reliability, and safety, especially in critical domains likeautonomous vehicles, medical AI, and industrial automation.
* A (Simplicity):AI-based systems are typicallycomplex, not simple, which contributes to safety challenges.
* B (Sustainability):While sustainability is an important AI consideration, it doesnot directly affect safety.
* D (Robustness):Lack of robustnesscan make AI systems unsafe, butnon-determinism is the primary issuethat complicates safety verification.
* ISTQB CT-AI Syllabus (Section 2.8: Safety and AI)
* "The characteristics of AI-based systems that make it more difficult to ensure they are safe include: complexity, non-determinism, probabilistic nature, self-learning, lack of transparency, interpretability and explainability, lack of robustness".
Why Other Options Are Incorrect:Supporting References from ISTQB Certified Tester AI Testing Study Guide:Conclusion:Sincenon-determinism makes AI behavior unpredictable, complicating safety assurance, thecorrect answer is C.

NEW QUESTION # 30
Which of the following is a technique used in machine learning?

A. Equivalence partitioning
B. Decision trees
C. Boundary value analysis
D. Decision tables

Answer: B

Explanation:
Decision trees are a foundational algorithm used in supervised machine learning. The syllabus describes:
"A decision tree is a tree-like ML model whose nodes represent decisions and whose branches represent possible outcomes." (Reference: ISTQB CT-AI Syllabus v1.0, Section 3.4)

NEW QUESTION # 31
There is a growing backlog of unresolved defects for your project. You know the developers have an ML model that they have created which has learned which developers work on which type of software and the speed with which they resolve issues. How could you use this model to help reduce the backlog and implement more efficient defect resolution?

A. Use it to prioritize defects automatically based on the time expected for the fix to be made, the speed of the fix, and the likelihood of regressions
B. Use it to determine the root cause of each defect and develop a process improvement plan that can be implemented to remove the most common root causes
C. Use it to review the code and determine where more defects are likely to occur so that testing can be targeted to those areas
D. Use it to assign defects to the best developer to resolve the problem and to load balance the defect assignments among the developers

Answer: D

Explanation:
The syllabus explains that ML models can be used to analyze reported defects and suggest which developers are best suited to fix them based on historical data about defect assignment and resolution speed:
"Assignment: ML models can suggest which developers are best suited to fix particular defects, based on the defect content and previous developer assignments." (Reference: ISTQB CT-AI Syllabus v1.0, Section 11.2, page 78 of 99)

NEW QUESTION # 32
A neural network has been designed and created to assist day-traders improve efficiency when buying and selling commodities in a rapidly changing market. Suppose the test team executes a test on the neural network where each neuron is examined. For this network the shortest path indicates a buy, and it will only occur when the one-day predicted value of the commodity is greater than the spot price by 0.75%. The neurons are stimulated by entering commodity prices and testers verify that they activate only when the future value exceeds the spot price by at least 0.75%.
Which of the following statements BEST explains the type of coverage being tested on the neural network?

A. Value-change coverage
B. Threshold coverage
C. Neuron coverage
D. Sign-change coverage

Answer: B

Explanation:
Threshold coverageis a specific type of coverage measure used in neural network testing. It ensures that each neuron in the network achieves an activation value greater than a specified threshold. This is particularly relevant to the scenario described, where testers verify that neurons activate only when the future value of the commodity exceeds the spot price by at least0.75%.
* Threshold-based activation:The test case in the question isexplicitly verifying whether neurons activate only when a certain threshold (0.75%) is exceeded.This aligns perfectly with the definition ofthreshold coverage.
* Common in Neural Network Testing:Threshold coverage is used to measurewhether each neuron in a neural network reaches a specified activation value, ensuring that the neural network behaves as expected when exposed to different test inputs.
* Precedent in Research:TheDeepXplore frameworkused a threshold of0.75%to identify incorrect behaviors in neural networks, making this coverage criterion well-documented in AI testing research.
* (B) Neuron Coverage#
* Neuron coverageonly checks whether a neuron activates (non-zero value)at some point during testing. It does not consider specific activation thresholds, making it less precise for this scenario.
* (C) Sign-Change Coverage#
* This coverage measures whether each neuron exhibitsboth positive and negative activation values, which isnot relevant to the given scenario(where activation only matters when exceeding a specific threshold).
* (D) Value-Change Coverage#
* This coverage requires each neuron to producetwo activation values that differ by a chosen threshold, but the question focuses onwhether activation occurs beyond a fixed threshold, not changes in activation values.
* Threshold coverage ensures that neurons exceed a given activation threshold"Full threshold coverage requires that each neuron in the neural network achieves an activation value greater than a specified threshold. The researchers who created the DeepXplore framework suggested neuron coverage should be measured based on an activation value exceeding a threshold, changing based on the situation." Why is Threshold Coverage Correct?Why Other Options are Incorrect?References from ISTQB Certified Tester AI Testing Study GuideThus,option A is the correct answer, asthreshold coverage ensures the neural network's activation is correctly evaluated based on the required condition (0.75%).

NEW QUESTION # 33
Upon testing a model used to detect rotten tomatoes, the following data was observed by the test engineer, based on certain number of tomato images.

For this confusion matrix which combinations of values of accuracy, recall, and specificity respectively is CORRECT?
SELECT ONE OPTION

A. 0.84.1,0.9
B. 1,0.9, 0.8
C. 0.87.0.9. 0.84
D. 1,0.87,0.84

Answer: C

Explanation:
To calculate the accuracy, recall, and specificity from the confusion matrix provided, we use the following formulas:
Confusion Matrix:
Actually Rotten: 45 (True Positive), 8 (False Positive)
Actually Fresh: 5 (False Negative), 42 (True Negative)
Accuracy:
Accuracy is the proportion of true results (both true positives and true negatives) in the total population.
Formula: Accuracy=TP+TNTP+TN+FP+FN\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}Accuracy=TP+TN+FP+FNTP+TN Calculation: Accuracy=45+4245+42+8+5=87100=0.87\text{Accuracy} = \frac{45 + 42}{45 + 42 + 8 + 5} = \frac{87}{100} = 0.87Accuracy=45+42+8+545+42=10087=0.87 Recall (Sensitivity):
Recall is the proportion of true positive results in the total actual positives.
Formula: Recall=TPTP+FN\text{Recall} = \frac{TP}{TP + FN}Recall=TP+FNTP Calculation: Recall=4545+5=4550=0.9\text{Recall} = \frac{45}{45 + 5} = \frac{45}{50} = 0.9Recall=45+545=5045=0.9 Specificity:
Specificity is the proportion of true negative results in the total actual negatives.
Formula: Specificity=TNTN+FP\text{Specificity} = \frac{TN}{TN + FP}Specificity=TN+FPTN Calculation: Specificity=4242+8=4250=0.84\text{Specificity} = \frac{42}{42 + 8} = \frac{42}{50} = 0.84Specificity=42+842=5042=0.84 Therefore, the correct combinations of accuracy, recall, and specificity are 0.87, 0.9, and 0.84 respectively.
Reference:
ISTQB CT-AI Syllabus, Section 5.1, Confusion Matrix, provides detailed formulas and explanations for calculating various metrics including accuracy, recall, and specificity.
"ML Functional Performance Metrics" (ISTQB CT-AI Syllabus, Section 5).

NEW QUESTION # 34
You are testing an autonomous vehicle which uses AI to determine proper driving actions and responses. You have evaluated the parameters and combinations to be tested and have determinedthat there are too many to test in the time allowed. It has been suggested that you use pairwise testing to limit the parameters. Given the complexity of the software under test, what is likely the outcome from using pairwise testing?

A. All high priority defects will be identified using this method.
B. The number of parameters to test can be reduced to less than a dozen.
C. While the number of tests needed can be reduced, there may still be a large enough set of tests that automation will be required to execute all of them.
D. Pairwise cannot be applied to this problem because there is AI involved and the evolving values may result in unexpected results that cannot be verified.

Answer: C

Explanation:
Pairwise testing is a combinatorial testing technique that reduces the number of test cases by focusing on testing interactions between pairs of parameters rather than all possible combinations. It is widely used in AI- based systems, including autonomous vehicles, where the number of possible input parameter combinations can be extremely high.
* Option A:"The number of parameters to test can be reduced to less than a dozen."
* This is incorrect. While pairwise testing significantly reduces the number of test cases, it does not necessarily limit them to a fixed number like a dozen. The final number of tests depends on the number of parameters and their possible values.
* Option B:"All high priority defects will be identified using this method."
* This is incorrect. While pairwise testing is effective in detecting defects caused by interactions between two parameters, it may not uncover defects resulting from more complex interactions involving three or more parameters.
* Option C:"While the number of tests needed can be reduced, there may still be a large enough set of tests that automation will be required to execute all of them."
* This is the correct answer. Even though pairwise testing reduces the number of test cases, AI- based systems such as autonomous vehicles still have a large number of test scenarios. Therefore, automation is often necessary to execute all test cases within the available time.
* Option D:"Pairwise cannot be applied to this problem because there is AI involved, and the evolving values may result in unexpected results that cannot be verified."
* This is incorrect. Pairwise testing can still be applied to AI-based systems, including those that evolve over time. However, additional testing techniques may be required to verify evolving behavior.
* Pairwise Testing for AI Systems:"Pairwise testing is widely used because it effectively reduces the number of test cases while maintaining defect detection capability".
* Automation Requirement:"In practice, even with pairwise testing, extensive test suites may still require automation".
Analysis of the Answer Options:ISTQB CT-AI Syllabus References:

NEW QUESTION # 35
An e-commerce developer built an application for automatic classification of online products in order to allow customers to select products faster. The goal is to provide more relevant products to the user based on prior purchases.
Which of the following factors is necessary for a supervised machine learning algorithm to be successful?

A. Grouping similar products together before feeding them into the algorithm
B. Selecting the correct data pipeline for the ML training
C. Labeling the data correctly
D. Minimizing the amount of time spent training the algorithm

Answer: C

Explanation:
The syllabus explains that supervised learning requires correctly labeled data so the algorithm can learn the relationship between input features and output labels:
"In supervised learning, the algorithm creates the ML model from labeled data during the training phase. The labeled data is used to infer the relationship between the input data and output labels." (Reference: ISTQB CT-AI Syllabus v1.0, Section 3.1.1)

NEW QUESTION # 36
You are using a neural network to train a robot vacuum to navigate without bumping into objects. You set up a reward scheme that encourages speed but discourages hitting the bumper sensors. Instead of what you expected, the vacuum has now learned to drive backwards because there are no bumpers on the back.
This is an example of what type of behavior?

A. Reward-hacking
B. Error-shortcircuiting
C. Transparency
D. Interpretability

Answer: A

Explanation:
The syllabus defines reward hacking as:
"Reward hacking can result from an AI-based system achieving a specified goal by using a 'clever' or 'easy' solution that perverts the spirit of the designer's intent." In this case, the vacuum found a loophole in the reward function-driving backwards to avoid bumper triggers while maximizing reward for speed.
(Reference: ISTQB CT-AI Syllabus v1.0, Section 2.6, page 24 of 99)

NEW QUESTION # 37
......

ISTQB CT-AI Dumps - Secret To Pass in First Attempt: https://protechtraining.actualtestsit.com/ISTQB/CT-AI-exam-prep-dumps.html

ISTQB CT-AI Certification Practice Exam

Latest [Nov 26, 2025] ISTQB CT-AI Real Exam Dumps PDF [Q12-Q37]

ISTQB CT-AI Exam Syllabus Topics:

Related Articles