@@ -292,9 +292,14 @@ image::nifi-kafka-druid-water-level-data/nifi_1.png[]

Log in with the username `admin` and password `adminadmin`.

+ image::nifi-kafka-druid-water-level-data/nifi_13.png[]
+
+ The NiFi workflow contains the `IngestWaterLevelsToKafka` Process Group. Enter the Process Group by double-clicking on it or
+ right-clicking and selecting `Enter Group`.
+
image::nifi-kafka-druid-water-level-data/nifi_2.png[]

- As you can see, the NiFi workflow consists of lots of components. It is split into three main components:
+ As you can see, the Process Group consists of lots of components. It is split into three main components:

. On the left is the first part bulk-loading all the known stations
. In the middle is the second part bulk-loading the historical data for the last 30 days
@@ -318,19 +323,19 @@ The left workflows works as follows:
measurements of the last 30 days.
. `Add station_uuid` will add the attribute `station_uuid` to the JSON list of measurements returned from the
{pegelonline}[PEGELONLINE web service], which is missing.
- . `PublishKafkaRecord_2_6` finally emits every measurement as a Kafka record to the topic `measurements`.
+ . `PublishKafka` finally emits every measurement as a Kafka record to the topic `measurements`.

The right side works similarly but is executed in an endless loop to stream the data in near-realtime. Double-click on
the `Get station list` processor to show the processor details.

image::nifi-kafka-druid-water-level-data/nifi_5.png[]

- Head over to the tab `PROPERTIES`.
+ Head over to the `Properties` tab.

image::nifi-kafka-druid-water-level-data/nifi_6.png[]

Here, you can see the setting `HTTP URL`, which specifies the download URL from where the JSON file containing the
- stations is retrieved. Close the processor details popup by clicking `OK`. You can also have a detailed view of the
+ stations is retrieved. Close the processor details popup by clicking `Close`. You can also have a detailed view of the
`Produce station records` processor by double-clicking it.

image::nifi-kafka-druid-water-level-data/nifi_7.png[]
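As an aside, the `Add station_uuid` step changed in this hunk can be sketched outside NiFi. The sketch below is illustrative, not the processor's actual implementation: the sample payload and the station UUID are made up, but the shape mirrors what the text describes — a JSON list of measurements that lacks the station identifier, which gets injected into every record before publishing to the `measurements` topic.

```python
import json

def add_station_uuid(measurements_json, station_uuid):
    """Inject the missing station identifier into every measurement record."""
    measurements = json.loads(measurements_json)
    for record in measurements:
        # the attribute the PEGELONLINE response is missing
        record["station_uuid"] = station_uuid
    return measurements

# Illustrative payload and UUID, not real API output.
sample = '[{"timestamp": "2024-05-01T12:00:00+02:00", "value": 256.0}]'
enriched = add_station_uuid(sample, "example-station-uuid")
print(enriched[0]["station_uuid"])
```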
@@ -347,11 +352,11 @@ image::nifi-kafka-druid-water-level-data/nifi_9.png[]

The `HTTP URL` does contain the `$\{station_uuid\}` placeholder, which gets replaced for every station.

- Double-click the `PublishKafkaRecord_2_6` processor.
+ Double-click the `PublishKafka` processor.

image::nifi-kafka-druid-water-level-data/nifi_10.png[]

- You can also see the number of produced records by right-clicking on `PublishKafkaRecord_2_6` and selecting
+ You can also see the number of produced records by right-clicking on `PublishKafka` and selecting
`View status history`.

image::nifi-kafka-druid-water-level-data/nifi_11.png[]
@@ -385,13 +390,14 @@ By clicking on `Supervisors` at the top you can see the running ingestion jobs.
image::nifi-kafka-druid-water-level-data/druid_2.png[]

After clicking on the magnification glass to the right side of the `RUNNING` supervisor, you can see additional
- information (here, the supervisor `measurements` was chosen). On the tab `Statistics` on the left, you can see the
+ information (here, the supervisor `measurements` was chosen). On the `Task stats` tab on the left, you can see the
number of processed records as well as the number of errors.

image::nifi-kafka-druid-water-level-data/druid_3.png[]

- The statistics show that Druid ingested `2435` records during the last minute and has already ingested ~30 million records in total. All records
- have been ingested successfully, indicated by having no `processWithError`, `thrownAway` or `unparseable` records.
+ The statistics show that Druid ingested `594` records during the last minute and has already ingested ~1.3 million records in total. All records
+ have been ingested successfully, indicated by having no `processWithError`, `thrownAway` or `unparseable` records in the output of the `View raw`
+ button at the top right.
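The health check added in this hunk can also be done programmatically. Below is a hedged sketch: given a supervisor stats payload like the one shown by `View raw`, it sums the error counters named in the text and reports whether every record was ingested cleanly. The payload is a made-up example shaped after the fields mentioned above; a real Druid response carries more detail.

```python
import json

# Made-up stats payload modeled on the counters named in the text;
# not an actual Druid response.
raw_stats = json.dumps({
    "totals": {
        "buildSegments": {
            "processed": 1300000,
            "processWithError": 0,
            "thrownAway": 0,
            "unparseable": 0,
        }
    }
})

totals = json.loads(raw_stats)["totals"]["buildSegments"]
# Any non-zero counter here means some records failed to ingest.
error_count = (totals["processWithError"]
               + totals["thrownAway"]
               + totals["unparseable"])
print("ok" if error_count == 0 else "ingestion errors found")  # → ok
```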
=== Query the Data Source