MIS352 Business Process Management Assignment
The purpose of this review is to explain how business process drift can be identified and detected accurately. The review describes the ways in which a business process is prone to change and the techniques used to identify those changes. Process workers may start executing a process differently in order to adjust to changes in workload, season, guidelines or the regulations governing the business. By using early detection methods based on event logs, known as business process drift detection, analysts are able to act on these changes (Bevilacqua, Ciarapica & Giacchetta, 2009). The methods used in business process drift detection are mostly based on extracting features from traces. When a business process behaves differently in two periods of time, analysts can reach a suitable level of accuracy by tracing the recorded behaviour and the frequency with which particular patterns occur, and then adjust the process accordingly.
The review looks at several important aspects of business processes: the drift detection methods, the disadvantages of those methods and their relevance. It also points to related articles that are relevant to the topic. The review is structured as follows: a brief description of business processes, the different methods of analyzing and synthesizing these processes, their relevance, and how to perform a statistical analysis (Harmon & Trends, 2010).
The main aim of business process drift detection is to formulate techniques that help to identify a drift in a business process. It entails different methodologies that are able to reveal the drift before or after it has occurred. A drift may be due to a competitive environment, supply, demand or the regulatory environment. The existing methods are based mainly on extracting patterns from traces, as stated before. For example, one event S occurs more frequently than another event R, or R occurs more than once in a pattern (Houy, Fettke & Loos, 2010).
The article under review proposes an automated and scalable method for detecting concept drift in business processes. The method rests on the following aspects. First, it identifies the point at which there is a significant difference between the processes observed before and after that point. In doing so it addresses the question of when two process executions are the same, that is, when there is trace equivalence. Second, it shows how to detect drift in a stream of runs by monitoring statistically significant changes in the distribution of runs, which are divided into a reference and a detection population. Third, it introduces the use of an adaptive window, and finally it evaluates the method on synthetic logs (Röglinger, Pöppelbuß & Becker 2012).
Drift detection methods
These drift detection methods rest mainly on a statistical viewpoint and work by turning event logs into partial order runs. An event log consists of a set of traces, each capturing a sequence of events. The method relies on the following definitions:
a) The event log trace: let B be an event log, C a set of event occurrences and µ: C → L a labelling function. An event trace Φ ∈ B is defined in terms of an order f ∈ [0, n-1] over its n events. Because a trace defines a total order of events, it has to be combined with a concurrency relation to obtain a partial order run (Hallerbach, Bauer & Reichert, 2010).
b) The alpha concurrency: this is a symmetric relation that is defined over labels and applied to event occurrences. As stated before, the concurrency relation is used to transform a trace into a partial order representation of events.
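The two definitions above can be illustrated with a minimal Python sketch. It assumes the usual alpha-algorithm reading of concurrency (two labels are concurrent when each is directly followed by the other somewhere in the log), which the review does not spell out. The function names `alpha_concurrency` and `trace_to_run` are hypothetical, and the run is kept as a plain set of ordered event pairs without any transitive reduction.

```python
from itertools import combinations


def alpha_concurrency(traces):
    """Return the set of label pairs that are concurrent in the log.

    Assumption: labels a and b are concurrent when a is directly followed
    by b in some trace and b is directly followed by a in some trace.
    """
    follows = set()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            follows.add((a, b))
    return {frozenset((a, b)) for (a, b) in follows
            if a != b and (b, a) in follows}


def trace_to_run(trace, concurrent):
    """Turn a totally ordered trace into a partial order run.

    Each event is identified by (label, occurrence number) so that repeated
    labels stay distinct. The order between two events is kept unless their
    labels are concurrent.
    """
    seen = {}
    events = []
    for label in trace:
        seen[label] = seen.get(label, 0) + 1
        events.append((label, seen[label]))
    return frozenset(
        (events[i], events[j])
        for i, j in combinations(range(len(events)), 2)
        if frozenset((trace[i], trace[j])) not in concurrent
    )


if __name__ == "__main__":
    log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
    concurrent = alpha_concurrency(log)
    runs = [trace_to_run(t, concurrent) for t in log]
    # Both traces differ only in the order of the concurrent events b and c,
    # so they map to the same run.
    print(runs[0] == runs[1])
```

In this toy log the two traces collapse into a single run, which is the reason a stream of runs can be more compact than a stream of traces.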
Methods that do not use this representation have the following limitations:
1. Methods that work on patterns extracted directly from traces are limited in that they do not capture event concurrency.
2. Bose et al., "Dealing with concept drifts in process mining" (2015): detecting process drift using statistical testing over feature vectors makes the method unable to identify certain types of drift, such as the insertion of a conditional move.
Statistical testing over runs
This method is used mainly to detect drift in a stream of runs. The test is performed on two populations of the same size, built from the latest runs. These runs are divided into a reference population and a detection population, which gives two adjacent windows called the reference window and the detection window. For every new run observed, both windows are slid to the right in order to read the new run, and a new statistical test is performed.
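The sliding of the two windows can be sketched as follows. The review does not name the statistical test, so this sketch assumes a chi-square statistic computed over the frequencies of distinct runs in the two windows. The function names are hypothetical, runs are represented as hashable values such as strings, and the conversion of the statistic into a p-value is left out.

```python
from collections import Counter


def chi_square_statistic(reference, detection):
    """Chi-square statistic of the 2 x k table (window x distinct run)."""
    ref_counts = Counter(reference)
    det_counts = Counter(detection)
    n_ref, n_det = len(reference), len(detection)
    total = n_ref + n_det
    statistic = 0.0
    for run in set(ref_counts) | set(det_counts):
        column_total = ref_counts[run] + det_counts[run]
        for observed, row_total in ((ref_counts[run], n_ref),
                                    (det_counts[run], n_det)):
            expected = row_total * column_total / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def sliding_window_statistics(runs, window_size):
    """Slide the reference and detection windows over the stream of runs.

    Returns (position, statistic) pairs, one per newly observed run once
    both windows are full.
    """
    results = []
    for end in range(2 * window_size, len(runs) + 1):
        reference = runs[end - 2 * window_size:end - window_size]
        detection = runs[end - window_size:end]
        results.append((end, chi_square_statistic(reference, detection)))
    return results


if __name__ == "__main__":
    stream = ["x"] * 10 + ["y"] * 10  # behaviour changes after run 10
    for position, statistic in sliding_window_statistics(stream, 5):
        print(position, round(statistic, 2))
```

The statistic peaks when the boundary between the two windows lines up with the change point. In practice it would be compared with a significance threshold before a drift is reported.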
The main disadvantage of window-based detection, noted for the method of Accorsi ("Discovering workflow changes with time-based trace clustering. In: Data-Driven Process Discovery and Analysis." Springer (2012) 154–168), is that drift detection based on trace clustering is heavily dependent on the choice of window size. A small window size causes false positives, while a large window size causes false negatives, since drifts happening inside the window go undetected (Weske 2012).
Evaluation on synthetic logs
A tool that can read complete event logs was built so that the statistical test could be applied. To assess how accurately a drift is detected, both recall and delay are measured. The delay is expressed as an average number of log traces, and the drift is identified from these results. The quality of the method is measured in terms of accuracy and scalability in a variety of settings. The tool reads complete event traces, which are then used to update the alpha relations dynamically for each pair of events. Each trace is then transformed into a partial order run, resulting in a stream of runs (Van Der Aalst 2013).
To assess accuracy, two measurements are used: the F-score, measured as the harmonic mean of recall and precision, and the mean delay. The latter is the average number of log traces between the point where a drift happens and the point where it is detected. It therefore measures how late a drift is detected, expressed as a number of log traces.
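As a rough illustration of these two measurements, the sketch below computes an F-score and a mean delay from lists of actual and detected drift positions, both given as trace numbers. The matching rule is an assumption, since the review does not define one: here a detection counts as correct when it falls at or after an actual drift and within `max_delay` traces of it. The function name `evaluate_detections` and the parameter `max_delay` are hypothetical.

```python
def evaluate_detections(actual, detected, max_delay):
    """Return (f_score, mean_delay) for a list of detected drift positions.

    Assumption: each actual drift is matched with the earliest unused
    detection that occurs no earlier than the drift and at most max_delay
    traces after it.
    """
    remaining = sorted(detected)
    delays = []
    for drift in sorted(actual):
        match = next((d for d in remaining if 0 <= d - drift <= max_delay),
                     None)
        if match is not None:
            delays.append(match - drift)
            remaining.remove(match)

    true_positives = len(delays)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        f_score = 0.0
    else:
        # Harmonic mean of precision and recall.
        f_score = 2 * precision * recall / (precision + recall)
    mean_delay = sum(delays) / true_positives if true_positives else None
    return f_score, mean_delay


if __name__ == "__main__":
    # Two real drifts, three detections (the last one is a false positive).
    print(evaluate_detections([1000, 2000], [1040, 2060, 2500], max_delay=100))
```

With these made-up positions, precision is 2/3, recall is 1, and the detections arrive 40 and 60 traces late.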
Impact of window size on accuracy
In this experiment, fixed window sizes ranging from 30 to 160 traces, in increments of 30, are used against each of the 72 logs. The F-score is obtained for four log sizes (2500 to 10000 traces), and for each log size it is averaged over the 18 change patterns. The F-score increases as the window size grows until it reaches a plateau at a size of 150, as shown in the line graphs below.
From these results it is evident that the more data points are included in the reference and detection windows, the more accurate the method becomes, leading to the detection of all concept drifts with few or no false positives (Zur Muehlen & Recker, 2013).
An interesting feature of the findings is that, after an initially high mean delay, the mean delay grows very slowly as the window size increases. This shows that the method is resilient, in terms of mean delay, to increases in window size, with a relatively low delay of around 40 traces when the window size is 50 or above.
The conclusion from these findings is that the accuracy obtained with the trace representation is consistently lower than that obtained with runs, which confirms the initial intuition.
Impact of adaptive window size on accuracy
The impact of the adaptive window on the F-score and mean delay was assessed by comparing the results obtained with a fixed window with those obtained with an adaptive window, averaged over the three log sizes of 5000, 7500 and 10000 traces. For example, the results for a fixed window size of 25 traces were compared with those for an adaptive window initialized to 25 traces. The log size of 2500 traces was not used, to avoid the interplay between window size and the number of drifts observed in the logs.
From the results obtained it is evident that the adaptive window outperforms the fixed window in terms of both F-score and mean delay. The ability to change the window size dynamically based on the variation in the runs keeps the number of runs in the reference and detection windows adequate for the statistical test, i.e. neither too large nor too small.
To conclude, the adaptive window leads to a lower mean delay because the window is shrunk when the variation is low, since in these cases a low number of runs is sufficient to perform the statistical test. The main advantage is that the adaptive window method overcomes the low accuracy obtained when the window size is as low as 25 traces, with a mean delay of 28. It also follows that keeping the mean delay as low as possible is essential for detecting as many drifts as possible.
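The idea of resizing the window according to the variation in the runs can be sketched as below. The review does not state the actual update rule, so the proportional rule used here is purely an assumption for illustration: variation is taken to be the number of distinct runs in a window, and the window size is scaled by the ratio between the current and the initial variation. The names `distinct_runs`, `adapt_window_size` and `min_size` are hypothetical.

```python
def distinct_runs(window):
    """Variation of a window, taken here as its number of distinct runs."""
    return len(set(window))


def adapt_window_size(initial_size, initial_variation, current_variation,
                      min_size=5):
    """Scale the window size with the observed variation (assumed rule).

    The window shrinks when fewer distinct runs are observed and grows when
    more are observed, but never drops below min_size.
    """
    if initial_variation <= 0:
        raise ValueError("initial_variation must be positive")
    scaled = round(initial_size * current_variation / initial_variation)
    return max(min_size, scaled)


if __name__ == "__main__":
    initial_window = ["r1", "r2", "r3", "r4", "r5"] * 5   # 25 runs, 5 distinct
    calm_window = ["r1", "r2"] * 12 + ["r1"]              # only 2 distinct runs
    size = adapt_window_size(25,
                             distinct_runs(initial_window),
                             distinct_runs(calm_window))
    print(size)  # smaller window, because the variation dropped
```

A rule of this kind reflects the behaviour described above: when variation is low, fewer runs are needed for the test, so the window and with it the detection delay can shrink.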
Accuracy change per pattern
To test the accuracy further, the F-score and mean delay are checked for each of the 12 simple change patterns and 6 composite change patterns. The window size is fixed to 100 traces, which provides the best trade-off between F-score and mean delay, and the results are compared with those obtained with adaptive windows initialized to 100 traces, averaged over the three log sizes of 5000, 7500 and 10000 traces.
The method shows a noticeably lower F-score, for both fixed and adaptive windows, on the frequency change pattern (fr). This pattern modifies the frequency of certain event relations in the log. The low F-score is due to low precision (Rao, Mansingh & Osei-Bryson 2012).
Data set generation
For data set generation, 18 altered logs, one for each change pattern, are each combined with the base log in an interleaving manner. To simulate gradual changes, the base log is combined with an altered log in a different manner. Sampling starts with traces from the base log only. Then, as the number of traces increases, the probability of sampling from the altered log increases until only traces from the altered log are sampled. This is known as probabilistic gradual drift.
To sample the behaviour of the two logs, a linear probability function with a slope of 0.2% is used. In other words, the probability of sampling from one log starts at 1 (respectively 0) and decreases (respectively increases) by 0.002 every time a new trace is sampled, reaching 0 (respectively 1) after 500 traces. Each gradual drift interval includes 25% of the behaviour of each log, so that the exact portion of each drift is known (Trkman 2010).
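A minimal sketch of this sampling scheme follows. It assumes a per-trace decrement of 1/500 = 0.002, which is what a 0.2% slope reaching zero after 500 traces implies. The function names and the fixed random seed are hypothetical choices made only to keep the example reproducible.

```python
import random


def base_probability(step, interval=500):
    """Probability of sampling from the base log at a given step.

    Starts at 1 and decreases linearly by 1/interval per sampled trace,
    so with interval=500 the slope is 0.002 (0.2%).
    """
    return max(0.0, 1.0 - step / interval)


def sample_gradual_drift(base_log, altered_log, interval=500, seed=0):
    """Build one gradual drift interval by mixing traces from two logs."""
    rng = random.Random(seed)
    sampled = []
    for step in range(interval):
        if rng.random() < base_probability(step, interval):
            source = base_log
        else:
            source = altered_log
        sampled.append(rng.choice(source))
    return sampled


if __name__ == "__main__":
    base = [("a", "b", "c")]
    altered = [("a", "c", "b")]
    mixed = sample_gradual_drift(base, altered)
    first_half = sum(1 for t in mixed[:250] if t in altered)
    second_half = sum(1 for t in mixed[250:] if t in altered)
    # Traces from the altered log become more frequent towards the end.
    print(first_half, second_half)
```

Because the probability is linear, the altered behaviour is rare at the start of the interval and dominant at the end, which is what distinguishes a gradual drift from a sudden one.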
Criteria for evaluating accuracy
To assess the accuracy of the gradual drift detection method, the F-score and the mean delay are defined in a slightly different way. To compute precision and recall for the F-score, a detected drift is counted as a true positive if its interval includes the central point of the interval of the actual drift. Otherwise it is counted as a false positive. For example, if the actual drift happens between trace numbers 751 and 1250, the central point is trace number 1000. In this case, a detected gradual drift is a true positive only if its interval contains trace number 1000 (McCormack & Johnson, 2016).
However, compared with sudden drift detection, the method achieves an F-score below 0.7 and a relatively higher mean delay for three patterns. In particular, the lowest accuracy is obtained with the fr pattern, which is in line with the results for sudden drift detection (Scheer 2012).
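The true positive criterion for gradual drifts can be written as a short check. This is a sketch under the assumption that intervals are given as inclusive (start, end) pairs of trace numbers and that the central point is obtained by integer division, which gives 1000 for the interval from 751 to 1250. The function name `is_true_positive` is hypothetical.

```python
def is_true_positive(detected, actual):
    """Check whether a detected gradual drift covers the actual drift's centre.

    Both arguments are (start, end) pairs of trace numbers, ends inclusive.
    """
    actual_start, actual_end = actual
    central_point = (actual_start + actual_end) // 2
    detected_start, detected_end = detected
    return detected_start <= central_point <= detected_end


if __name__ == "__main__":
    actual_drift = (751, 1250)  # central point: trace number 1000
    print(is_true_positive((900, 1100), actual_drift))   # covers trace 1000
    print(is_true_positive((1100, 1300), actual_drift))  # misses trace 1000
```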
To conclude, the review has outlined an automated method for detecting sudden and gradual drift in business processes from traces. The evaluation over synthetic logs showed that the method accurately discovers typical process changes (Neiger, Rotaru & Churilov, 2009). The gradual drift detection method relies on the assumption that a gradual drift is delimited by two consecutive sudden drifts, such that the distribution of runs in between is a linear mixture of the distributions of runs before the first drift and after the second drift. The accuracy achieved indicates that this assumption generally holds. Designing a more sophisticated method is an avenue for future research and analysis (Papageorgiou 2009).
It also holds that accuracy should be evaluated for any data trace that is used to detect drift. The different methods outlined in the review provide relevant techniques for determining drift in business processes. This analysis is important in business process management since it allows one to predict changes in a business process and to determine the legitimacy and outcome of the business process environment. Such changes could be in terms of supply, demand or the regulatory environment.
1. Bevilacqua, M., Ciarapica, F.E. and Giacchetta, G., 2009. Business process reengineering of a supply chain and a traceability system: A case study. Journal of Food Engineering, 93(1), pp.13-22.
2. Harmon, P. and Trends, B.P., 2010. Business process change: A guide for business managers and BPM and Six Sigma professionals. Elsevier.
3. Houy, C., Fettke, P. and Loos, P., 2010. Empirical research in business process management–analysis of an emerging field of research. Business Process Management Journal, 16(4), pp.619-661.
4. Hallerbach, A., Bauer, T. and Reichert, M., 2010. Capturing variability in business process models: the Provop approach. Journal of Software Maintenance and Evolution: Research and Practice, 22(6-7), pp.519-546.
5. McCormack, K.P. and Johnson, W.C., 2016. Supply chain networks and business process orientation: advanced strategies and best practices. CRC Press.
6. Neiger, D., Rotaru, K. and Churilov, L., 2009. Supply chain risk identification with value-focused process engineering. Journal of operations management, 27(2), pp.154-168.
7. Papageorgiou, L.G., 2009. Supply chain optimisation for the process industries: Advances and opportunities. Computers & Chemical Engineering, 33(12), pp.1931-1938.
8. Rao, L., Mansingh, G. and Osei-Bryson, K.M., 2012. Building ontology based knowledge maps to assist business process re-engineering. Decision Support Systems, 52(3), pp.577-589.
9. Röglinger, M., Pöppelbuß, J. and Becker, J., 2012. Maturity models in business process management. Business Process Management Journal, 18(2), pp.328-346.
10. Scheer, A.W., 2012. Business process engineering: reference models for industrial enterprises. Springer Science & Business Media.
11. Trkman, P., 2010. The critical success factors of business process management. International journal of information management, 30(2), pp.125-134.
12. Van Der Aalst, W.M., 2013. Business process management: a comprehensive survey. ISRN Software Engineering, 2013.
13. Weske, M., 2012. Business process management architectures. In Business Process Management (pp. 333-371). Springer, Berlin, Heidelberg.
14. Zur Muehlen, M. and Recker, J., 2013. How much language is enough? Theoretical and practical use of the business process modeling notation. In Seminal Contributions to Information Systems Engineering (pp. 429-443). Springer, Berlin, Heidelberg.