Use the Statistical Tests Function

The Statistical Tests function helps you determine how well a set of data matches a given distribution, most commonly the normal distribution. It does this by analyzing the data's characteristics and calculating the corresponding p-values.

Understanding Different Types of Statistical Tests

You can use the following types of statistical tests. In the formulas, $n$ is the number of samples, $\Phi$ is the standard normal cumulative distribution function, and $z_i$ is the $i$-th smallest standardized sample value.

Jarque-Bera Test
The Jarque-Bera test measures how a dataset's skewness and kurtosis compare to those of a normal distribution. Larger values of the statistic indicate stronger evidence of non-normality.
Formula: $JB = \frac{n}{6}\left(S^2 + \frac{1}{4}(K - 3)^2\right)$, where $S$ is the sample skewness and $K$ is the sample kurtosis.

Cramer-von Mises Test
This test assesses the fit of a data sample to a specified cumulative distribution by analyzing the squared differences between the two. A low p-value indicates a poor fit.
Formula: $W^2 = \frac{1}{12n} + \sum_{i=1}^{n}\left(\frac{2i - 1}{2n} - \Phi(z_i)\right)^2$

Anderson-Darling Test
The Anderson-Darling test evaluates whether the data conforms to a specified distribution. It focuses on the tails of the distribution, so it typically gives more weight to outliers.
Formula: $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}\left[(2i - 1)\log\Phi(z_i) + (2(n - i) + 1)\log\left(1 - \Phi(z_i)\right)\right]$

D'Agostino-Pearson Test
This test combines skewness and kurtosis to examine whether the shape of a dataset's distribution aligns with a normal distribution. It looks for deviations in symmetry and in the sharpness of the peak.

Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov test, particularly the Lilliefors modification, evaluates the normality of a dataset by comparing the empirical distribution function with the cumulative normal distribution.
Formula: $K = \sqrt{n}\,\max(D^+, D^-)$, where $D^+$ and $D^-$ are the largest deviations of the empirical distribution function above and below the cumulative normal distribution.

User Scenario

Review the following scenario for the Statistical Tests function. Then, you will simulate PLC data and calculate the corresponding test values for the collected data.

In a chemical manufacturing plant, quality control engineers use the Statistical Tests function to ensure that the mixture ratios of raw materials are consistent with the required standards for product batches. By analyzing characteristics such as consistency and concentration, the tests determine whether the batch data deviates from a normal distribution, which is critical for product quality.

Step 1: Add a Device

Follow the steps in Connect a Device and configure the following parameters:
- Device Type: Simulator
- Driver Name: Generator
- Enable Alias Topics: Select the checkbox.

Step 2: Add Tags

After connecting the device, add the following tag. See Add Tags to learn more.

Tag 1: input1
- Name: Select s random value generator.
- Value Type: Select float64.
- Polling Interval: Enter 1.
- Tag Name: Enter input1.
- Min Value: Enter 20.
- Max Value: Enter 30.

Step 3: Create Analytics Flows

You can now create the analytics flow using data from the device and tag you previously created.

To create an analytics flow with the Statistical Function processor:
1. In Manufacturing Connect Edge, navigate to Analytics.
2. On the analytics canvas, click Add Processor. The Create a Processor dialog box displays.
3. Select DataHub Subscribe.
4. In the Topic field, click the search icon, select the device you previously created, and then select the alias topic for the input1 tag.
5. Click Save.
6. Click Add Processor again and select the Statistical Function processor. The Edit a Processor dialog box appears.
7. Window Size: Enter a value that represents the range over which to apply the statistical tests. For this example, enter 100.
8. Select the Jarque Bera, Anderson Darling, Cramer Von Mises, D'Agostino Pearson, and Kolmogorov Smirnov checkboxes.
9. Click Save.
10. Connect the DataHub Subscribe processor (tag input1) to the Statistical Function processor with a wire, using the events connection.
11. On the analytics canvas, click Save.

The configured analytics flow now consists of the DataHub Subscribe processor wired to the Statistical Function processor.

Step 4: View Output of Processor

Click the view icon in the Statistical Function processor to view the output values. In this example, the p-values and test statistics suggest that the sample data may not fit a normal distribution.
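
To get a feel for the numbers the processor reports, the test statistics described earlier can be approximated offline. The following is a minimal Python sketch, not the product's implementation. It assumes that the simulated tag behaves like a uniform random value between 20 and 30, that the window holds 100 samples, and that skewness and kurtosis are computed from population moments. All function names are hypothetical. Only the Jarque-Bera p-value is computed, using the asymptotic chi-square distribution with 2 degrees of freedom; the other functions return the statistic alone, and the D'Agostino-Pearson test is omitted.

```python
import math
import random
from statistics import NormalDist, fmean

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def standardize(window):
    """Return the sorted z-scores of the window (population standard deviation).

    Assumes the window is not constant, otherwise the deviation is zero.
    """
    n = len(window)
    mean = fmean(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    return sorted((x - mean) / std for x in window)


def jarque_bera(window):
    """Return (JB, p) with JB = (n/6) * (S^2 + (K - 3)^2 / 4)."""
    z = standardize(window)
    n = len(z)
    skewness = sum(v ** 3 for v in z) / n
    kurtosis = sum(v ** 4 for v in z) / n
    jb = (n / 6) * (skewness ** 2 + 0.25 * (kurtosis - 3) ** 2)
    # Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    return jb, math.exp(-jb / 2)


def cramer_von_mises(window):
    """Return 1/(12n) + sum(((2i - 1)/(2n) - Phi(z_i))^2)."""
    z = standardize(window)
    n = len(z)
    return 1 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - PHI(v)) ** 2 for i, v in enumerate(z, start=1)
    )


def anderson_darling(window):
    """Return A^2 = -n - (1/n) * sum of the weighted log terms."""
    z = standardize(window)
    n = len(z)
    total = sum(
        (2 * i - 1) * math.log(PHI(v)) + (2 * (n - i) + 1) * math.log(1 - PHI(v))
        for i, v in enumerate(z, start=1)
    )
    return -n - total / n


def kolmogorov_smirnov(window):
    """Return K = sqrt(n) * max(D+, D-) against the fitted normal distribution."""
    z = standardize(window)
    n = len(z)
    d_plus = max(i / n - PHI(v) for i, v in enumerate(z, start=1))
    d_minus = max(PHI(v) - (i - 1) / n for i, v in enumerate(z, start=1))
    return math.sqrt(n) * max(d_plus, d_minus)


# Assumed stand-in for the simulated tag: 100 uniform values between 20 and 30.
random.seed(0)
window = [random.uniform(20, 30) for _ in range(100)]

jb, p_value = jarque_bera(window)
print(f"Jarque-Bera:        {jb:.3f} (p = {p_value:.4f})")
print(f"Cramer-von Mises:   {cramer_von_mises(window):.3f}")
print(f"Anderson-Darling:   {anderson_darling(window):.3f}")
print(f"Kolmogorov-Smirnov: {kolmogorov_smirnov(window):.3f}")
```

Because uniform data has flatter tails than a normal distribution, the kurtosis term usually dominates the Jarque-Bera statistic here. The exact values depend on the random sample and may differ from what the processor displays if it uses different estimators or p-value tables.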