Configure the PI AF extractor

To configure the PI AF extractor, you must create a configuration file in YAML format. As a starting point, use one of the sample configuration files included with the installer:

config.default.yml - This file contains all configuration options and descriptions.
config.minimal.yml - This file contains a minimum configuration and no descriptions.

Naming the configuration file

You must name the configuration file config.yml.

Tip

If the configuration file contains errors, check the Windows Event Viewer or, if you've configured logging, the extractor logs. You can also check the configuration manually by starting the extractor from a Windows shell.

Pi

Include the pi section to configure the connection to the PI AF system. This is how the PI AF extractor selects a system:

If you configure system-name, the extractor selects the system by name from the preconfigured list of PI systems on the machine the extractor runs on.
If you configure host, the extractor selects a PI system running on the PI server's host.
If you don't configure either of these parameters, the extractor selects the default system on the machine the extractor runs on. If there is no default system, the extractor selects the first system from the preconfigured PI system list on the machine the extractor runs on.

Parameter	Description
`host`	Insert the base URL of the PI server's host. If you don't enter any value, you must configure a PI system in the installed SDK on the machine the extractor runs on. This is an optional setting.
`username`	Insert the Windows username on the PI server. This is a required value.
`password`	Insert the Windows password on the PI server. This is a required value.
`system-name`	Enter the name of the PI system you want to use. This is used instead of `host` to select a PI system. This is an optional setting.
`database-name`	Enter the name of the PI database you want to use. The default value is the default database configured on the machine the extractor runs on or the first database in the list if no default database is configured. This is an optional setting.
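
As an illustration of the parameters above, a `pi` section that selects the PI system by host might look like the sketch below. The host name, user name, password, and database name are placeholders, not values from this documentation.

```yaml
# Sketch only: every value below is a placeholder.
pi:
  # Optional. Leave out to use a PI system configured in the installed SDK.
  host: pi-server.example.com
  # Required. Windows credentials on the PI server.
  username: pi-extractor-user
  password: replace-with-password
  # Optional. Leave out to use the default (or first) database.
  database-name: MyAfDatabase
```

To select the system by name instead, replace `host` with `system-name`.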

Destination

Include the destination section to configure the destination for the extracted data. Currently, this is only the CDF staging area (RAW).

Parameter	Description
`database`	Insert the CDF RAW database to extract data to. If no database exists, the extractor creates a database.
`elements-table`	Enter the table name for the PI AF elements in the CDF RAW database. If no table exists, the extractor creates a table. The default name is elements.
`unit-of-measure-classes-table`	Enter the table name for unit-of-measure classes in the CDF RAW database. If no table exists, the extractor creates a table. The default name is unit-of-measure-classes.
`attributes-table`	Enter the table name for the attributes in the CDF RAW database to be used if `elements.flatten-attributes` is set to `true`. If no table exists, the extractor creates a table. The default name is attributes.
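
A `destination` section using the parameters above could be sketched as follows. The database name is a placeholder. The table names are the documented defaults, written out only to make them visible.

```yaml
# Sketch only: the database name is a placeholder.
destination:
  database: pi-af-staging
  # The three table names below are the documented defaults.
  elements-table: elements
  unit-of-measure-classes-table: unit-of-measure-classes
  # Only used when elements.flatten-attributes is set to true.
  attributes-table: attributes
```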

Extraction

Include the extraction section to configure how to extract data from PI AF.

Parameter	Description
`elements.chunk`	Insert the maximum number of PI AF elements to read per request to PI. These are immediately written to CDF RAW.
`elements.query`	Insert the string query. See the OSIsoft documentation.
`elements.limit`	Insert the total maximum number of PI AF elements to read. Use this to get a reasonable subset of the server for testing. Note that this doesn't work if `update-period` is configured.
`elements.flatten-attributes`	Set to `true` to create a row in a separate table for each attribute instead of creating an attribute hierarchy for each PI AF element in CDF RAW. The default value is `false`. See also `attributes-table`.
`update-period`	Enter the interval at which the extractor reads update events from the PI AF server. Updates partially refresh the PI AF elements to pick up newly created elements and changes to attribute values. The syntax is `N[unit]`, where the unit is `d` (days), `h` (hours), `m` (minutes), `s` (seconds), or `ms` (milliseconds). For instance, `2h` means the extractor reads updates every two hours, starting at extractor startup. The default unit is seconds. The extractor won't read updates if you set this parameter to `0` or a negative value. If you set both this parameter and `refresh-period` to `0` or a negative value, the extractor quits after reading all elements or after reaching the limit set in `elements.limit`.
`refresh-period`	Enter the interval at which the extractor performs a full refresh, reading all data from the PI AF server again. The syntax is `N[unit]`, where the unit is `d` (days), `h` (hours), `m` (minutes), `s` (seconds), or `ms` (milliseconds). For instance, `2h` means the extractor performs a full refresh every two hours, starting at extractor startup. The default unit is seconds. If you set `0` or a negative value, the extractor only reads PI AF elements at startup. If you set both this parameter and `update-period` to `0` or a negative value, the extractor quits after reading all elements or after reaching the limit set in `elements.limit`.
`keep-alive`	Enter the interval at which the extractor looks for updates in the PI system and database. This is a cheap operation that serves as a keep-alive. The syntax is `N[unit]`, where the unit is `d` (days), `h` (hours), `m` (minutes), `s` (seconds), or `ms` (milliseconds). For instance, `2h` means the extractor looks for updates every two hours, starting at extractor startup. The default unit is seconds. If you set `0` or a negative value, the extractor won't look for updates. The default value is 5 minutes.
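
The sketch below shows one way to combine these parameters for a one-off test run: a limited number of elements is read, and both periods are set to `0` so the extractor quits when it is done. It assumes that the dotted parameter names (such as `elements.chunk`) map to nested YAML keys. The chunk and limit values are arbitrary illustrations.

```yaml
# Sketch only: assumes dotted names map to nested keys; values are illustrative.
extraction:
  elements:
    chunk: 1000              # elements read per request to PI
    limit: 5000              # stop after this many elements
    flatten-attributes: false
  # With both periods at 0, the extractor quits after the initial read.
  update-period: "0"
  refresh-period: "0"
```

For continuous operation, the periods would instead carry values such as `"2h"` or `"30m"`, and `elements.limit` would be left out because it doesn't work together with `update-period`.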

Cognite

Include the cognite section to configure which CDF project the extractor will load data into and how to connect to the project. This section is mandatory and should always contain the project and authentication configuration.

Parameter	Description
`project`	Insert the CDF project name you want to ingest data into. This is a required value.
`api-key`	We've deprecated API-key authentication.
`host`	Insert the base URL of the CDF project. The default value is https://api.cognitedata.com.
`idp-authentication`	Insert the client credentials for authenticating to CDF using an external identity provider. You must enter either an API key or use IdP authentication. `token-url` - Insert the URL to fetch tokens from. You must enter either a token URL or an Azure tenant. `client-id` - Enter the client ID from the IdP. This is a required value. `tenant` - Enter the Azure tenant. This is required unless you set `token-url`. `secret` - Enter the client secret from the IdP. This is a required value. `scopes` - List the scopes. This is a required value. `min-ttl` - Enter the minimum time in seconds a token will be valid. The cached token is refreshed if it expires in less than `min-ttl` seconds. The default value is 30. This is an optional value. `authority` - Insert the base URL of the authority. The default value is https://login.microsoftonline.com/.
`cdf-retries`	Configure the automatic retry policy used for requests to CDF. `timeout` - Specify the timeout in milliseconds for each retry. The default value is 80000. `max-retries` - Enter the maximum number of retries. If you enter a negative value, the extractor keeps retrying. The default value is 5. `max-delay` - Enter the maximum delay in milliseconds between each try. The base delay is calculated as 125*2^retry ms. If you enter a negative value, there is no maximum delay. 0 means that there is never any delay. The default value is 5000. You don't need to change these values unless the connection to CDF is poor. Lowering the maximum number of retries also shortens the time until failure buffering starts, which may be necessary if there is a lot of data.
`cdf-chunking`	Configure the number of requests against CDF endpoints. This parameter is optional. If you don't enter any values, the extractor uses the default values based on CDF's current limits. `raw-rows` - Enter the maximum number of rows per request to CDF RAW. This is used with `raw state-store` and for RAW asset and time series metadata. The default value is 10000.
`cdf-throttling`	Configure how the extractor throttles requests to CDF. Each entry is the maximum allowed number of parallel requests to CDF. The only relevant field here is `raw`.
`sdk-logging`	Enable or disable log output from the .NET SDK. These log messages add debug information about requests, such as which requests failed and how long they took. `disable` - Set to `true` to disable logging from the SDK. The default value is `false`. `level` - Enter the minimum level of logging: `trace`, `debug`, `information`, `warning`, `error`, `critical`, or `none`. The default value is `debug`. `format` - Select the format of the log message.
`extraction-pipeline`	Configure the extraction pipeline in CDF that the extractor reports to. You should create the extraction pipeline before you configure this parameter. `pipeline-id` - Enter the external ID of the extraction pipeline in CDF. `frequency` - Enter the frequency in seconds to report Seen. If you enter `0` or a negative value, no reports are generated.
`certificates`	Configure this parameter for special handling of SSL certificates. This should never be considered a permanent solution to certificate problems. `accept-all` - Set to `true` to accept all certificates. This poses a severe security risk. `allow-list` - List the thumbprints of allowed certificates. This is a smaller risk compared to accepting all certificates.
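
The following sketch puts the most common `cognite` parameters together, using IdP authentication with an Azure tenant. The project name, tenant, client ID, secret, scope, pipeline ID, and reporting frequency are all placeholders or assumptions. The key spelling `client-id` is also an assumption based on the naming of the other keys. The retry and chunking values are the documented defaults.

```yaml
# Sketch only: identifiers and secrets are placeholders.
cognite:
  project: my-cdf-project
  host: https://api.cognitedata.com        # documented default
  idp-authentication:
    tenant: replace-with-azure-tenant-id
    client-id: replace-with-client-id      # key spelling assumed
    secret: replace-with-client-secret
    scopes:
      - https://api.cognitedata.com/.default   # assumed scope, adjust to your IdP
    min-ttl: 30                            # documented default
  cdf-retries:
    timeout: 80000                         # documented defaults
    max-retries: 5
    max-delay: 5000
  cdf-chunking:
    raw-rows: 10000                        # documented default
  extraction-pipeline:
    pipeline-id: pi-af-extractor           # hypothetical external ID
    frequency: 600                         # hypothetical, in seconds
```

With these retry defaults, the formula 125*2^retry gives 125 ms, 250 ms, 500 ms, 1000 ms, 2000 ms, and 4000 ms for retry values 0 through 5. The next value, 8000 ms, exceeds `max-delay` and would be capped at 5000 ms. The text above doesn't state whether the retry counter starts at 0 or 1, so treat the exact sequence as illustrative.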

Logger

Parameter	Description
`console/level`	Select the verbosity level for console logging. If this parameter isn't set or is invalid, logging to the console is disabled.
`file/level`	Select the verbosity level for file logging. If this parameter isn't set or is invalid, logging to a file is disabled.
`file/path`	Insert the path to the file logs. Logs are rotated according to `file/rolling-interval`.
`file/retention-limit`	Insert the maximum number of logs to keep in the log folder. The oldest logs will be deleted according to `file/rolling-interval`.
`file/rolling-interval`	Insert the rolling interval for log files as either `day` or `hour`. The default value is `day`.
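
A logging configuration based on the parameters above might be sketched as follows. It assumes that the section key is `logger`, that the `/` in the parameter names denotes nesting, and that `information` and `debug` are accepted level names. The file path and retention limit are placeholders.

```yaml
# Sketch only: section key, nesting, and level names are assumptions.
logger:
  console:
    level: information
  file:
    level: debug
    path: logs/log.txt        # placeholder path
    retention-limit: 31       # placeholder, number of log files to keep
    rolling-interval: day     # documented default
```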
