# Use the Statistical Tests Function

The Statistical Tests Function helps you determine how well a set of data matches a given distribution, most commonly a normal distribution. It does this by analyzing the data's characteristics and calculating the corresponding test statistics and p-values.

You can use the following types of statistical tests.

The Jarque-Bera test measures how a dataset's skewness and kurtosis compare with those of a normal distribution. Larger values of the statistic indicate stronger evidence of non-normality.

**Formula:** JB = (n/6) * (skewness^2 + (1/4) * (kurtosis - 3)^2)
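As a minimal sketch of the formula above, the following Python function computes the statistic from the sample's central moments. The function names `jarque_bera` and `jarque_bera_p_value` are illustrative and are not part of the product. The p-value uses the asymptotic chi-squared approximation with 2 degrees of freedom, which is an assumption about how a p-value could be derived, not a statement of how the processor calculates it.

```python
import math


def jarque_bera(values):
    """Return the Jarque-Bera statistic for a sequence of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (dividing by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return (n / 6) * (skewness ** 2 + 0.25 * (kurtosis - 3) ** 2)


def jarque_bera_p_value(jb):
    """Asymptotic p-value (assumption): chi-squared with 2 degrees of freedom.

    For 2 degrees of freedom the survival function is exp(-x / 2).
    """
    return math.exp(-jb / 2)


jb = jarque_bera([1, 2, 3, 4, 5])
print(round(jb, 4), round(jarque_bera_p_value(jb), 4))
```

The sketch does not guard against a constant sample, where the variance is zero and the skewness and kurtosis are undefined.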

The Cramer-von Mises test assesses the fit of a data sample to a specified cumulative distribution by summing the squared differences between the empirical and the specified distribution functions. A low p-value indicates a poor fit.

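**Formula:** cramerVonMises = (1/(12*n)) + sum(((2*i - 1)/(2*n) - Phi(Z_i))^2), where Z_i is the i-th smallest standardized value and Phi is the standard normal cumulative distribution function.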
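The formula can be sketched in Python as follows. This is an illustration under stated assumptions: the data is standardized with the sample mean and sample standard deviation, the values are sorted in ascending order before the sum is taken, and the function name `cramer_von_mises` is hypothetical. The processor may standardize the data differently.

```python
from statistics import NormalDist, mean, stdev


def cramer_von_mises(values):
    """Return the Cramer-von Mises statistic against a normal distribution."""
    n = len(values)
    mu, sigma = mean(values), stdev(values)
    # Standardize and sort so that z[i - 1] is the i-th smallest value.
    z = sorted((x - mu) / sigma for x in values)
    phi = NormalDist().cdf  # standard normal CDF
    return 1 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - phi(z_i)) ** 2
        for i, z_i in enumerate(z, start=1)
    )


print(cramer_von_mises([20.1, 22.4, 25.0, 25.3, 27.8, 29.9]))
```

Because every squared term is non-negative, the statistic can never fall below 1/(12*n); values close to that floor indicate a close fit.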

The Anderson-Darling test evaluates whether the data conforms to a specified distribution, giving more weight to observations in the tails of the distribution.

**Formula:** A^2 = -n - (1/n) * sum((2*i - 1) * log(Phi(Z_i)) + (2*(n - i) + 1) * log(1 - Phi(Z_i)))
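A minimal Python sketch of the Anderson-Darling statistic follows. It uses the standard form in which both logarithms are evaluated at the same i-th sorted value. The standardization with the sample mean and standard deviation, the clamping constant `EPS`, and the function name `anderson_darling` are assumptions made for this illustration.

```python
import math
from statistics import NormalDist, mean, stdev

EPS = 1e-12  # illustrative guard so that log() never receives 0


def anderson_darling(values):
    """Return the Anderson-Darling A^2 statistic against a normal distribution."""
    n = len(values)
    mu, sigma = mean(values), stdev(values)
    z = sorted((x - mu) / sigma for x in values)
    total = 0.0
    for i, z_i in enumerate(z, start=1):
        # Clamp the CDF away from 0 and 1 to keep both logarithms finite.
        p = min(max(NormalDist().cdf(z_i), EPS), 1 - EPS)
        total += (2 * i - 1) * math.log(p) + (2 * (n - i) + 1) * math.log(1 - p)
    return -n - total / n


print(anderson_darling([20.1, 22.4, 25.0, 25.3, 27.8, 29.9]))
```

The logarithms grow rapidly as the CDF approaches 0 or 1, which is why observations in the tails contribute more to this statistic.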

The D'Agostino-Pearson test combines skewness and kurtosis to examine whether the shape of a dataset's distribution aligns with a normal distribution, looking for deviations in symmetry and in the sharpness of the peak.

The Kolmogorov-Smirnov test, particularly the Lilliefors modification, evaluates the normality of a dataset by comparing the empirical distribution function with the cumulative normal distribution.

**Formula:** K = max(D+, D-) * sqrt(n)
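The following Python sketch shows one way to compute this statistic. It assumes that D+ and D- are the largest distances by which the empirical distribution function lies above and below the normal CDF, and that the mean and standard deviation are estimated from the sample, as in the Lilliefors modification. The function name `ks_lilliefors_statistic` is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_lilliefors_statistic(values):
    """Return K = max(D+, D-) * sqrt(n) against a fitted normal distribution."""
    n = len(values)
    mu, sigma = mean(values), stdev(values)
    z = sorted((x - mu) / sigma for x in values)
    cdf = [NormalDist().cdf(v) for v in z]
    # D+: empirical CDF above the normal CDF; D-: empirical CDF below it.
    d_plus = max(i / n - p for i, p in enumerate(cdf, start=1))
    d_minus = max(p - (i - 1) / n for i, p in enumerate(cdf, start=1))
    return max(d_plus, d_minus) * math.sqrt(n)


print(ks_lilliefors_statistic([20.1, 22.4, 25.0, 25.3, 27.8, 29.9]))
```

Because the parameters are estimated from the same data, the standard Kolmogorov-Smirnov critical values do not apply; the Lilliefors modification uses its own table of critical values.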

Review the following scenario for the Statistical Tests Function. Then, you will simulate PLC data and calculate the corresponding test values for the collected data.

In a chemical manufacturing plant, quality control engineers use the Statistical Tests Function to ensure that the mixture ratios of raw materials are consistent with the required standards for product batches. By analyzing characteristics such as consistency and concentration, the tests determine whether the batch data deviates from a normal distribution, which is critical for product quality.

Follow the steps to Connect a Device and configure the following parameters:

- **Device Type**: Simulator
- **Driver Name**: Generator
- **Enable Alias Topics**: Select the checkbox.

After connecting the device, add the following tags. See Add Tags to learn more.

- **Name**: Select **S - Random value generator**
- **Value Type**: Select **float64**
- **Polling Interval**: Enter **1**
- **Tag Name**: Enter **input1**
- **Min_value**: Enter **20**
- **Max_value**: Enter **30**
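To experiment with comparable data outside the product, the tag can be approximated with a short Python sketch. It assumes the random value generator draws values uniformly between the minimum and maximum; the documentation does not state the generator's actual distribution. The function name `simulate_input1` and the `seed` parameter are hypothetical.

```python
import random


def simulate_input1(count, min_value=20.0, max_value=30.0, seed=None):
    """Return `count` floats between min_value and max_value.

    Assumption: values are drawn uniformly; the real generator may differ.
    """
    rng = random.Random(seed)
    return [rng.uniform(min_value, max_value) for _ in range(count)]


samples = simulate_input1(100, seed=42)
print(len(samples), min(samples), max(samples))
```

Each element stands in for one polled value of the **input1** tag.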

You can now create the analytics flows using data from the device and tag you previously created.

**To create an analytics flow with the Statistical Function Processor:**

- In Manufacturing Connect Edge, navigate to **Analytics**.
- On the analytics canvas, click **Add processor**. The *Create a processor* dialog box displays.
- Select **DataHub Subscribe**.
- In the *Topic* field, click the **Search** icon, select the device you previously created, and then select the alias topic for the **input1** tag.
- Click **Save**.
- Click **Add processor** again and select the **Statistical Function** processor. The *Edit a Processor* dialog box appears.
  - **Window Size:** Enter a value that represents the range to apply the statistical tests. For this example, we input a value of **100**.
  - Select the **Jarque Bera**, **Anderson Darling**, **Cramer Von Mises**, **D Agostino Pearson**, and **Kolmogorov Smirnov** checkboxes.
- Click **Save**.
- Connect the *DataHub Subscribe* processor (tag: *input1*) to the *Statistical Function* processor with a wire and use the **events** connection.
- On the analytics canvas, click **Save**.

The configured analytics flows should look like the following:

Click the **View** icon in the *Statistical Function* processor to view the output values.

The tests suggest the sample data may not fit a normal distribution, as indicated by the p-values and test statistics provided.
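The following Python sketch reproduces the idea of the flow offline for one of the tests. It rests on several assumptions: the simulated values are uniform between 20 and 30, the window slides by one value at a time, and the p-value comes from the asymptotic chi-squared approximation. None of these details are confirmed for the processor, and the names `WINDOW_SIZE`, `jarque_bera`, and `windowed_jarque_bera` are illustrative.

```python
import math
import random
from collections import deque

WINDOW_SIZE = 100  # same value as the Window Size used in this example


def jarque_bera(values):
    """Return the Jarque-Bera statistic for the values in one window."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (n / 6) * (skewness ** 2 + 0.25 * (kurtosis - 3) ** 2)


def windowed_jarque_bera(stream, window_size=WINDOW_SIZE):
    """Yield (statistic, p_value) for every full sliding window (assumption)."""
    window = deque(maxlen=window_size)  # oldest value drops out automatically
    for value in stream:
        window.append(value)
        if len(window) == window_size:
            jb = jarque_bera(window)
            # Asymptotic chi-squared (2 degrees of freedom) survival function.
            yield jb, math.exp(-jb / 2)


rng = random.Random(0)
stream = [rng.uniform(20.0, 30.0) for _ in range(150)]  # assumed uniform tag data
results = list(windowed_jarque_bera(stream))
print(len(results))  # 150 values and a window of 100 give 51 full windows
jb, p_value = results[-1]
print(f"JB={jb:.3f}, p={p_value:.4f}")
```

A uniform distribution has a kurtosis of 1.8, compared with 3 for a normal distribution, so tests that are sensitive to kurtosis tend to flag data like this as non-normal once the window holds enough values.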