MINIFICPP-2166 Remove unused properties and update documentation
Closes #1622

Signed-off-by: Marton Szasz <[email protected]>
lordgamez authored and szaszm committed Aug 8, 2023
1 parent 034ff6c commit bed452e
Showing 10 changed files with 198 additions and 693 deletions.
26 changes: 25 additions & 1 deletion C2.md
@@ -75,11 +75,17 @@ be requested via C2 DESCRIBE manifest command.
nifi.c2.agent.protocol.class=RESTSender
# nifi.c2.agent.protocol.class=CoapProtocol

# The CoAP protocol can also be defined in the flow configuration
# nifi.c2.coap.connector.service=MyCoapProtocol

# control c2 heartbeat interval
nifi.c2.agent.heartbeat.period=30 sec

# enable reporter classes
nifi.c2.agent.heartbeat.reporter.class=RESTReciver
nifi.c2.agent.heartbeat.reporter.classes=RESTReceiver
# If RESTReceiver is configured its listener port and optional SSL certificate can also be configured
nifi.c2.rest.listener.port=<port>
nifi.c2.rest.listener.cacert=<SSL Cert path>

# specify the rest URIs if using RESTSender
nifi.c2.rest.url=http://<your-c2-server>/<c2-api-path>/c2-protocol/heartbeat
@@ -98,6 +104,24 @@ be requested via C2 DESCRIBE manifest command.
# specify encoding strategy for c2 requests (gzip, none)
#nifi.c2.rest.request.encoding=none

# minimize REST heartbeat updates
#nifi.c2.rest.heartbeat.minimize.updates=true

#### Flow Id and URL

Flow id and URL are usually retrieved from the C2 server. These identify the last updated flow version and where the flow was downloaded from. These properties are persisted in the minifi.properties file.

# in minifi.properties
nifi.c2.flow.id=8da5de7f-dcdb-4f6b-aa2f-6f162a7f9dc4
nifi.c2.flow.url=http://localhost:10090/efm/api/flows/8da5de7f-dcdb-4f6b-aa2f-6f162a7f9dc4/content?aid=efmtest

#### Agent Identifier Fallback

It is possible to set a persistent fallback agent id. This is needed so that the C2 server can identify the same agent after a restart, even if nifi.c2.agent.identifier is not specified.

# in minifi.properties
nifi.c2.agent.identifier.fallback=my_fallback_id

### Metrics

Command and Control metrics can be used to send metrics through the heartbeat or via the DESCRIBE
186 changes: 170 additions & 16 deletions CONFIGURE.md
@@ -70,31 +70,126 @@ It's recommended to create your configuration in YAML format or configure the ag
max concurrent tasks: 1
Properties:

### Configuring flow configuration format

MiNiFi supports YAML and JSON flow configuration formats. The format can be set in the minifi.properties file; by default it is identified automatically. The default value is `adaptiveconfiguration`; setting `yamlconfiguration` forces YAML.

# in minifi.properties
nifi.flow.configuration.class.name=adaptiveconfiguration

### Scheduling strategies
Currently Apache NiFi MiNiFi C++ supports TIMER_DRIVEN, EVENT_DRIVEN, and CRON_DRIVEN. TIMER_DRIVEN uses periods to execute your processor(s) at given intervals.
The EVENT_DRIVEN strategy waits for data to become available or for some other notification mechanism to trigger execution. CRON_DRIVEN executes at the desired intervals
based on the CRON periods. Apache NiFi MiNiFi C++ supports standard CRON expressions without intervals ( */5 * * * * ).

### Configuring encryption for flow configuration

To encrypt the flow configuration, set the following property to true.

# in minifi.properties
nifi.flow.configuration.encrypt=true

### Configuring additional sensitive properties

It is possible to set a comma-separated list of configuration options to encrypt in addition to the default sensitive property list.

# in minifi.properties
nifi.sensitive.props.additional.keys=nifi.flow.configuration.file, nifi.rest.api.password
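As a rough illustration of how such a comma-separated value could be broken into individual property names, here is a minimal sketch. `splitSensitiveKeys` is a hypothetical helper, not a MiNiFi function, and the whitespace trimming is an assumption suggested by the `", "` separator used in the example value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of MiNiFi): splits a comma-separated
// property value into trimmed, non-empty key names.
std::vector<std::string> splitSensitiveKeys(const std::string& value) {
  std::vector<std::string> keys;
  std::size_t start = 0;
  while (start <= value.size()) {
    std::size_t comma = value.find(',', start);
    if (comma == std::string::npos) {
      comma = value.size();
    }
    const std::string item = value.substr(start, comma - start);
    // Trim spaces and tabs around the item (assumed behaviour).
    const auto first = item.find_first_not_of(" \t");
    if (first != std::string::npos) {
      const auto last = item.find_last_not_of(" \t");
      keys.push_back(item.substr(first, last - first + 1));
    }
    start = comma + 1;
  }
  return keys;
}
```

Calling it with the value from the example above would yield the two keys `nifi.flow.configuration.file` and `nifi.rest.api.password`.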

### Backup previous flow configuration on flow update

It is possible to back up the previous flow configuration file with a `.bak` extension in case of a flow update (e.g. through C2 or controller socket protocol).

# in minifi.properties
nifi.flow.configuration.backup.on.update=true

### Set number of flow threads

The number of threads used by the flow scheduler can be set in the MiNiFi configuration. The default value is 5.

# in minifi.properties
nifi.flow.engine.threads=5

### OnTrigger runtime alert

MiNiFi writes warning logs in case a processor has been running for too long. The period for these alerts can be set in the configuration file with the default being 5 seconds.

# in minifi.properties
nifi.flow.engine.alert.period=5 sec

### Event driven processor time slice

The maximum time the flow scheduler allocates to event driven processors can be configured. A processor is triggered as long as it has work to do, but for no longer than the configured time slice. The default value is 500 milliseconds.

# in minifi.properties
nifi.flow.engine.event.driven.time.slice=500 millis

### Administrative yield duration

In case an uncaught exception is thrown while running a processor, the processor will yield for the configured administrative yield time. The default yield duration is 30 seconds.

# in minifi.properties
nifi.administrative.yield.duration=30 sec

### Bored yield duration

If a processor is triggered but has no work available, it will yield for the configured bored yield time. The default yield duration is 100 milliseconds.

# in minifi.properties
nifi.bored.yield.duration=100 millis

### Graceful shutdown period

It is possible to configure a graceful shutdown period, which is the time the flow controller will wait to unload the flow configuration and stop running processors.

# in minifi.properties
nifi.flowcontroller.graceful.shutdown.period=30 sec

### FlowController drain timeout

Timeout period for finishing the processing of in-progress flow files when shutting down the flow controller. When not set, MiNiFi does not wait for flow files to finish processing.

# in minifi.properties
nifi.flowcontroller.drain.timeout=500 millis

### SiteToSite Security Configuration

in minifi.properties
# in minifi.properties

enable tls
# enable tls
nifi.remote.input.secure=true

if you want to enable client certificate base authorization
# if you want to enable client certificate base authorization
nifi.security.need.ClientAuth=true
setup the client certificate and private key PEM files
# setup the client certificate and private key PEM files
nifi.security.client.certificate=./conf/client.pem
nifi.security.client.private.key=./conf/client.pem
setup the client private key passphrase file
# setup the client private key passphrase file
nifi.security.client.pass.phrase=./conf/password
setup the client CA certificate file
# setup the client CA certificate file
nifi.security.client.ca.certificate=./conf/nifi-cert.pem

if you do not want to enable client certificate base authorization
# if you do not want to enable client certificate base authorization
nifi.security.need.ClientAuth=false

It can also be configured to use the system certificate store.

# in minifi.properties
nifi.security.use.system.cert.store=true

Windows specific certificate options with the following default values:

# in minifi.properties
nifi.security.windows.cert.store.location=LocalMachine
nifi.security.windows.server.cert.store=ROOT
nifi.security.windows.client.cert.store=MY

# The CN that the client certificate is required to match; default: use the first available client certificate in the store
# nifi.security.windows.client.cert.cn=

# Comma-separated list of enhanced key usage values that the client certificate is required to have
nifi.security.windows.client.cert.key.usage=Client Authentication

You have the option of specifying an SSL Context Service definition for the RPGs instead of the properties above.
This will link to a corresponding SSL Context service defined in the flow.

@@ -130,6 +225,14 @@ for TCP and secure HTTPS communications.
Passphrase: <passphrase path or passphrase>
CA Certificate: <CA cert path>

If the SSL certificates are not given with an absolute path, or cannot be found at the given relative path, MiNiFi will look for them in the default directory set in the configuration file.

# in minifi.properties

# default minifi resource path
nifi.default.directory=/path/to/cert/files/
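The fallback described above can be sketched as follows. This is only one possible reading of the lookup order, not the agent's actual implementation: `resolveCertPath` is a hypothetical name, and joining the whole relative path onto the default directory is an assumption.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of MiNiFi): picks the path at which a
// certificate file would be looked up.
fs::path resolveCertPath(const fs::path& configured, const fs::path& default_dir) {
  // An absolute path is used as given.
  if (configured.is_absolute()) {
    return configured;
  }
  // A relative path is used if it resolves to an existing file system entry.
  std::error_code ec;
  if (fs::exists(configured, ec)) {
    return configured;
  }
  // Otherwise fall back to the directory from nifi.default.directory
  // (assumption: the relative path is appended to it unchanged).
  return default_dir / configured;
}
```

With `nifi.default.directory=/path/to/cert/files/`, a missing relative path such as `certs/ca.pem` would, under these assumptions, resolve to `/path/to/cert/files/certs/ca.pem`.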


### HTTP SiteToSite Configuration
To enable HTTPSiteToSite for a remote process group.
Remote Processing Groups:
@@ -150,14 +253,48 @@ To enable HTTP Proxy for a remote process group.
### Command and Control Configuration
Please see the [C2 readme](C2.md) for more information.

### State Storage

State storage is used for keeping the state of stateful processors like TailFile. By default this uses a RocksDB database, but a different state storage with custom options can be configured.

The default location of the RocksDB local state storage is the `corecomponentstate` directory under the MiNiFi root directory. This can be reconfigured if another directory is preferred.

# in minifi.properties
nifi.state.storage.local.path=/var/tmp/minifi-state/

To use a custom state storage, one option is to define it as a controller service in the flow configuration file and reference that controller service by its id in the minifi.properties file.

# in config.yml
Controller Services:
  - name: testcontroller
    id: 2438e3c8-015a-1000-79ca-83af40ec1994
    class: PersistentMapStateStorage
    Properties:
      Auto Persistence Interval:
        - value: 0 sec
      Always Persist:
        - value: true
      File:
        - value: state.txt

# in minifi.properties
nifi.state.storage.local=2438e3c8-015a-1000-79ca-83af40ec1994

Another option to define a state storage is to use the following properties in the minifi.properties file.

# in minifi.properties
nifi.state.storage.local.class.name=PersistentMapStateStorage
nifi.state.storage.local.always.persist=true
nifi.state.storage.local.auto.persistence.interval=0 sec


### Configuring Repository storage locations
Persistent repositories, such as the Flow File repository, use a configurable path to store data.
The repository locations and their defaults are defined below. By default the MINIFI_HOME env
variable is used. If this is not specified we extrapolate the path and use the root installation
folder. You may specify your own path in place of these defaults.

in minifi.properties
# in minifi.properties
nifi.provenance.repository.directory.default=${MINIFI_HOME}/provenance_repository
nifi.flowfile.repository.directory.default=${MINIFI_HOME}/flowfile_repository
nifi.database.content.repository.directory.default=${MINIFI_HOME}/content_repository
@@ -167,16 +304,15 @@ folder. You may specify your own path in place of these defaults.
Rocksdb has an option to set compression type for its database to use less disk space.
If the content repository or flow file repository is set to use the rocksdb database as its storage, that repository can be compressed. MiNiFi C++ supports the `zlib`, `bzip2`, `zstd`, `lz4` and `lz4hc` compression types on Unix operating systems and the `xpress` compression type on Windows. If the property is set to `auto`, `xpress` is used on Windows and `zstd` on Unix operating systems. These options can be set in the minifi.properties file with the following properties:

in minifi.properties
# in minifi.properties
nifi.flowfile.repository.rocksdb.compression=zlib
nifi.content.repository.rocksdb.compression=auto


### Configuring compaction for rocksdb database

Rocksdb has an option to run compaction at specific intervals not just when needed.

in minifi.properties
# in minifi.properties
nifi.flowfile.repository.rocksdb.compaction.period=2 min
nifi.database.content.repository.rocksdb.compaction.period=2 min

@@ -189,20 +325,20 @@ created into. E.g. in `minifidb:///home/user/minifi/agent_state/flowfile` a dire
`/home/user/minifi/agent_state` populated with rocksdb-specific content, and in that repository a logically
separate "subdatabase" is created under the name `"flowfile"`.

in minifi.properties
# in minifi.properties
nifi.flowfile.repository.directory.default=minifidb://${MINIFI_HOME}/agent_state/flowfile
nifi.database.content.repository.directory.default=minifidb://${MINIFI_HOME}/agent_state/content
nifi.state.management.provider.local.path=minifidb://${MINIFI_HOME}/agent_state/processor_states
nifi.state.storage.local.path=minifidb://${MINIFI_HOME}/agent_state/processor_states

We should not simultaneously use the same directory with and without the `minifidb://` scheme.
Moreover the `"default"` name is restricted and should not be used.


in minifi.properties
# in minifi.properties
nifi.flowfile.repository.directory.default=minifidb://${MINIFI_HOME}/agent_state/flowfile
nifi.database.content.repository.directory.default=${MINIFI_HOME}/agent_state
^ error: using the same database directory without the "minifidb://" scheme
nifi.state.management.provider.local.path=minifidb://${MINIFI_HOME}/agent_state/default
nifi.state.storage.local.path=minifidb://${MINIFI_HOME}/agent_state/default
^ error: "default" is restricted
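To make the directory and sub-database split concrete, the following sketch parses a `minifidb://` value into its two parts and rejects the restricted `default` name. `MinifiDbLocation` and `parseMinifiDbUri` are hypothetical names used only for illustration; splitting at the last `/` is an assumption based on the `minifidb:///home/user/minifi/agent_state/flowfile` example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical types and names, not taken from the MiNiFi code base.
struct MinifiDbLocation {
  std::string directory;  // shared rocksdb directory
  std::string name;       // logically separate "subdatabase" name
};

// Returns std::nullopt if the value does not use the minifidb:// scheme,
// has no sub-database name, or uses the restricted name "default".
std::optional<MinifiDbLocation> parseMinifiDbUri(const std::string& uri) {
  const std::string scheme = "minifidb://";
  if (uri.compare(0, scheme.size(), scheme) != 0) {
    return std::nullopt;
  }
  const std::string path = uri.substr(scheme.size());
  // Assumption: the last path segment is the sub-database name.
  const auto slash = path.rfind('/');
  if (slash == std::string::npos || slash + 1 == path.size()) {
    return std::nullopt;
  }
  MinifiDbLocation location{path.substr(0, slash), path.substr(slash + 1)};
  if (location.name == "default") {
    return std::nullopt;
  }
  return location;
}
```

For `minifidb:///home/user/minifi/agent_state/flowfile` this gives the directory `/home/user/minifi/agent_state` and the name `flowfile`, matching the description above. A real check would also need to compare directories across properties to catch the same directory being used with and without the scheme.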

### Configuring Repository encryption
@@ -232,7 +368,7 @@ Each of the repositories can be configured to be volatile ( state kept in memory

To configure the repositories:

in minifi.properties
# in minifi.properties
# For Volatile Repositories:
nifi.flowfile.repository.class.name=VolatileFlowFileRepository
nifi.provenance.repository.class.name=VolatileProvenanceRepository
@@ -265,6 +401,14 @@ Each of the repositories can be configured to be volatile ( state kept in memory

The content repository has a default option for "minimal.locking" set to true. This will attempt to use lock free structures. This may or may not be optimal as this requires additional searching of the underlying vector. This may be optimal for cases where max.count is not excessively high. In cases where object permanence is low within the repositories, minimal locking will result in better performance. If there are many processors and/or timing is such that the content repository fills up quickly, performance may be reduced. In all cases a locking cache is used to avoid the worst case complexity of O(n) for the content repository; however, this caching is more heavily used when "minimal.locking" is set to false.

### Configuring provenance repository storage

The maximum storage size and TTL of the provenance repository can be configured when it is used with RocksDB. If not set, it uses the available maximum RocksDB values.

# in minifi.properties
nifi.provenance.repository.max.storage.size=16 MB
nifi.provenance.repository.max.storage.time=30 days

### Provenance Reporter

Add Provenance Reporting to config.yml
@@ -391,6 +535,16 @@ The MQTTController Service can be configured for MQTT connectivity and provide t
Max Throughput: 1,024,1024
Max Payload: 1,024,1024

### Disk space watchdog #

Stops MiNiFi FlowController activity (excluding C2) when the available disk space on any of the repository volumes goes below stop.threshold, checked every interval, then restarts it when the available space on all repository volumes reaches at least restart.threshold.

# in minifi.properties
minifi.disk.space.watchdog.enable=true
minifi.disk.space.watchdog.interval=15 sec
minifi.disk.space.watchdog.stop.threshold=100 MB
minifi.disk.space.watchdog.restart.threshold=150 MB
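The stop and restart thresholds form a hysteresis band, which can be sketched as a single decision function evaluated once per check interval. This is a minimal sketch and not the watchdog's actual code: `WatchdogThresholds`, `nextRunningState` and the `MiB` constant (1024 * 1024 bytes) are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed unit for the example; how MiNiFi interprets "MB" is not shown here.
constexpr std::uint64_t MiB = 1024ULL * 1024ULL;

// Hypothetical type holding the two configured thresholds in bytes.
struct WatchdogThresholds {
  std::uint64_t stop_bytes;
  std::uint64_t restart_bytes;
};

// Hypothetical helper: given the current state and the available space on
// each repository volume, returns whether the flow should be running.
bool nextRunningState(bool currently_running,
                      const std::vector<std::uint64_t>& available_bytes,
                      const WatchdogThresholds& thresholds) {
  // While running, stop as soon as any volume drops below stop.threshold.
  // While stopped, restart only once every volume has at least restart.threshold.
  const std::uint64_t limit =
      currently_running ? thresholds.stop_bytes : thresholds.restart_bytes;
  for (const std::uint64_t available : available_bytes) {
    if (available < limit) {
      return false;
    }
  }
  return true;
}
```

With the values above (stop at 100 MB, restart at 150 MB), a stopped flow stays stopped while any volume has, say, 120 MB free, which avoids rapid toggling around a single threshold.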

### Extension configuration
To notify the agent which extensions it should load see [Loading extensions](Extensions.md#Loading extensions).

4 changes: 2 additions & 2 deletions extensions/http-curl/tests/C2DescribeManifestTest.cpp
@@ -64,10 +64,10 @@ int main(int argc, char **argv) {

harness.getConfiguration()->set(minifi::Configuration::nifi_rest_api_password, encrypted_value);
harness.getConfiguration()->set(std::string(minifi::Configuration::nifi_rest_api_password) + ".protected", utils::crypto::EncryptionType::name());
harness.getConfiguration()->set(minifi::Configuration::nifi_server_name, "server_name");
harness.getConfiguration()->set(minifi::Configuration::nifi_c2_agent_identifier_fallback, "c2_id_fallback");
harness.getConfiguration()->set(minifi::Configuration::nifi_framework_dir, "framework_path");
harness.getConfiguration()->set(minifi::Configuration::nifi_sensitive_props_additional_keys,
std::string(minifi::Configuration::nifi_framework_dir) + ", " + std::string(minifi::Configuration::nifi_server_name));
std::string(minifi::Configuration::nifi_framework_dir) + ", " + std::string(minifi::Configuration::nifi_c2_agent_identifier_fallback));
harness.getConfiguration()->set(minifi::Configuration::nifi_log_appender_rolling_directory, "/var/log/minifi");

harness.setUrl(args.url, &responder);
@@ -48,7 +48,7 @@ TEST_CASE("Configuration can merge lists of property names", "[mergeProperties]"
}

TEST_CASE("Configuration can validate values to be assigned to specific properties", "[validatePropertyValue]") {
REQUIRE(Configuration::validatePropertyValue(Configuration::nifi_server_name, "anything is valid"));
REQUIRE(Configuration::validatePropertyValue(Configuration::nifi_c2_agent_identifier_fallback, "anything is valid"));
REQUIRE_FALSE(Configuration::validatePropertyValue(Configuration::nifi_flow_configuration_encrypt, "invalid.value"));
REQUIRE(Configuration::validatePropertyValue(Configuration::nifi_flow_configuration_encrypt, "true"));
REQUIRE(Configuration::validatePropertyValue("random.property", "random_value"));