-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathmodel.xml
138 lines (131 loc) · 6.24 KB
/
model.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
<project name='project' pubsub='auto' threads='1' use-tagged-token='true'>
<description>
This project is a basic demonstration of using the Change Detection
(ChangeDetection) algorithm. With change detection, a stream of measures
is monitored and an alert is raised when values deviate from what is
expected. Often point estimates of the measures are calculated. Then,
differences between consecutive measures are inspected to determine
whether they deviate from a pre-defined threshold. However, point
estimates do not capture changes in the underlying distribution of the
tracked measures. To capture changes in the underlying distribution,
histogram intersection is used.
SAS Event Stream Processing uses Kullback-Leibler divergence (KL divergence)
to detect changes through histogram intersection. KL divergence measures
how one probability distribution is different from a second, reference
distribution over the same variable. It can be used in the fields of
applied statistics and machine learning.
This project contains a single continuous query consisting of the following:
1. A source window that receives input data
2. a calculate window that runs the change detection algorithm on that
data
</description>
<contqueries>
<contquery name='contquery' trace='w_calculate'>
<windows>
<window-source name='w_data' insert-only='true' index='pi_EMPTY'>
<description>
This source window receives the included example input data file via
a file socket connector. The input data stream is placed into two fields
for each observation: an ID that acts as the data stream's key, named id;
and an x coordinate of data, named x.
</description>
<schema>
<fields>
<field name='id' type='int64' key='true'/>
<field name='x' type='double'/>
</fields>
</schema>
<connectors>
<connector class='fs' name='publisher'>
<properties>
<property name='type'>pub</property>
<property name='fstype'>csv</property>
<property name='fsname'>input.csv</property>
<property name='transactional'>true</property>
<property name='blocksize'>1</property>
</properties>
</connector>
</connectors>
</window-source>
<window-calculate name='w_calculate' algorithm='ChangeDetection' index='pi_EMPTY'>
<description>
This calculate window receives data from the source window and applies the
ChangeDetection algorithm to compute KL divergence. In this example, the
following algorithm parameters govern the ChangeDetection algorithm:
maxBins: Specifies the maximum number of bins in the histogram
for both the reference window and the sliding window.
slidingAlpha: Specifies the fading factor for the sliding window.
The acceptable value range is between 0 and 1
(including 1).
refWindowSize: Specifies the size of the reference window.
maxEvalSteps: Specifies the maximum number of steps before performing
a new evaluation.
adaptiveEval: Specifies whether to use the adaptive evaluation step
size or not.
showEval: Specifies whether to show evaluation events regardless
of whether a change is detected.
showAll: Specifies whether to show all events, regardless of
whether an evaluation occurs.
Additionally, the following input-map and output-map define this calculate
window:
input: Specifies the input variable for change detection.
evaluatedOut: Specifies the name of the output variable that indicates
whether an evaluation occurred.
changeValueOut: Specifies the name of the output variable that contains
the change value (the difference in the two KL divergence
values).
changeDetectedOut: Specifies the name of the output variable that indicates
whether a change has been detected.
The resulting output is written to the file result.out via a file socket connector.
</description>
<schema>
<fields>
<field name='id' type='int64' key='true'/>
<field name='x' type='double'/>
<field name='changeVal' type='double'/>
<field name='eval' type='int32'/>
<field name='changeDetected' type='int32'/>
</fields>
</schema>
<parameters>
<properties>
<property name="maxBins">100</property>
<property name="slidingAlpha">0.997</property>
<property name="refWindowSize">500</property>
<property name="maxEvalSteps">300</property>
<property name="adaptiveEval">1</property>
<property name="showEval">1</property>
<property name="showAll">0</property>
</properties>
</parameters>
<input-map>
<properties>
<property name="input">x</property>
</properties>
</input-map>
<output-map>
<properties>
<property name='evaluatedOut'>eval</property>
<property name='changeValueOut'>changeVal</property>
<property name='changeDetectedOut'>changeDetected</property>
</properties>
</output-map>
<connectors>
<connector class='fs' name='subscriber'>
<properties>
<property name='type'>sub</property>
<property name='fstype'>csv</property>
<property name='fsname'>result.out</property>
<property name='snapshot'>true</property>
<property name='header'>true</property>
</properties>
</connector>
</connectors>
</window-calculate>
</windows>
<edges>
<edge source='w_data' target='w_calculate' role='data'/>
</edges>
</contquery>
</contqueries>
</project>