Configure the OPC UA extractor
To configure the OPC UA extractor, you must edit the configuration file. The file is in YAML format, and the sample configuration file contains all valid options with default values.
You can leave many fields empty to let the extractor use the default values. The configuration file separates the settings by component, and you can remove an entire component to disable it or use the default values.
Sample configuration files
In the extractor installation folder, the `/config` subfolder contains sample complete and minimal configuration files. The values wrapped in `${}` are replaced with environment variables with that name. For example, `${COGNITE_PROJECT}` will be replaced with the value of the environment variable called `COGNITE_PROJECT`.
The configuration file also contains the global parameter `version`, which holds the version of the configuration schema used in the configuration file. This document describes version 1 of the configuration schema.
You can set up extraction pipelines to use versioned extractor configuration files stored in the cloud.
Minimal YAML configuration file

```yaml
version: 1

source:
  # The URL of the OPC-UA server to connect to
  endpoint-url: 'opc.tcp://localhost:4840'

cognite:
  # The project to connect to in the API, uses the environment variable COGNITE_PROJECT.
  project: '${COGNITE_PROJECT}'
  # Cognite authentication
  # This is for Microsoft as IdP. To use a different provider,
  # set implementation: Basic, and use token-url instead of tenant.
  # See the example config for the full list of options.
  idp-authentication:
    # Directory tenant
    tenant: ${COGNITE_TENANT_ID}
    # Application Id
    client-id: ${COGNITE_CLIENT_ID}
    # Client secret
    secret: ${COGNITE_CLIENT_SECRET}
    # List of resource scopes, ex:
    # scopes:
    #   - scopeA
    #   - scopeB
    scopes:
      - ${COGNITE_SCOPE}

extraction:
  # Global prefix for externalId in destinations. Should be unique to prevent name conflicts.
  id-prefix: 'gp:'
  # Map OPC-UA namespaces to prefixes in CDF. If not mapped, the full namespace URI is used.
  # Saves space compared to using the full URL. Using the ns index is not safe as the order can change on the server.
  # It is recommended to set this before extracting the node hierarchy.
  # For example:
  # namespace-map:
  #   "urn:cognite:net:server": cns
  #   "urn:freeopcua:python:server": fps
  #   "http://examples.freeopcua.github.io": efg
```
ProtoNodeId
You can provide an OPC UA `nodeId` in several places in the configuration file with a YAML object with the following structure:

```yaml
node:
  # The identifier part of the node ID, for example i=123 or s=MyNode
  node-id: i=123
  # The full namespace URI of the node
  namespace-uri: opc.tcp://test.test/
```

To find the node IDs, we recommend using the UaExpert tool. Locate the data type/event type/node in the hierarchy, then find the node ID on the right side under Attribute > NodeId. Find the namespace URI by matching the NamespaceIndex on the right to the list on the left.
If either part is left empty, it's converted to a different node ID based on context. This happens automatically for events if you use the configuration tool released with version 1.1. If a mapping is specified in namespace-map, you can use the mapped value in place of namespace-uri.
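To illustrate, a ProtoNodeId set as the browse root could look like the following sketch. The values are placeholders: `i=85` is the well-known identifier of the Objects folder in the base OPC UA namespace, and your own nodes will use different identifiers and URIs.

```yaml
extraction:
  root-node:
    # Identifier part of the node ID
    node-id: i=85
    # Full namespace URI of the node. If extraction.namespace-map
    # maps this URI, the mapped value can be used here instead.
    namespace-uri: http://opcfoundation.org/UA/
```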
Timestamps and intervals
In most places where time intervals are required, you can use a CDF-like syntax of `[N][timeunit]`, for example, `10m` for 10 minutes or `1h` for 1 hour. `timeunit` is one of `d`, `h`, `m`, `s`, or `ms`. You can also use a cron expression where this makes sense.

For history start and end times, you can use a similar syntax: `[N][timeunit]` and `[N][timeunit]-ago`. `1d-ago` means 1 day in the past from the time history starts, and `1h` means 1 hour in the future. For instance, you can use this syntax to configure the extractor to read only recent history.
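For example, the interval syntax can be combined in the history and extraction sections described below. The values here are illustrative only:

```yaml
history:
  # Only read the last two weeks of history
  start-time: 14d-ago
  # Restart history reads every six hours
  restart-period: 6h
extraction:
  # Push to destinations once per second
  data-push-delay: 1s
```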
Source
This part of the configuration file concerns the extraction from the OPC UA server.
Parameter | Description |
---|---|
endpoint-url | The URL of the OPC UA server to connect to. In practice, this is the URL of the discovery server, where multiple levels of security may be provided. The OPC UA extractor attempts to use the highest security possible based on the configuration. Required. |
alt-endpoint-urls | List alternative endpoint URLs the extractor can attempt when connecting to the server. Use this for non-transparent redundancy. See the OPC UA standard part 4, section 6.6.2. We recommend setting force-restart to true . Otherwise, the extractor will reconnect to the same server each time. |
endpoint-details | Details to override default endpoint behavior. This is used to make the client connect directly to an OPC UA endpoint, for example if the server is behind NAT (Network Address Translation), circumventing server discovery. This parameter contains one field: override-endpoint-url, which overrides the URL of the selected endpoint. |
redundancy | Additional configuration options related to redundant servers. The OPC UA extractor supports Cold redundancy, as described in the OPC UA standard part 4, section 6.6.2. See the complete sample configuration file for the available sub-options. |
reverse-connect-url | The local URL used for reverse-connect. This is the URL the server should connect to. You should also specify an endpoint-url . The server is responsible for initiating connections, so it can be placed behind a firewall. Leave empty to use direct connections. |
auto-accept | Set to true to automatically accept connections from servers. If you set this to false and try to connect to a server with higher security than None , the connection fails. A certificate is placed in the rejected certificates folder (by default application_dir/pki/rejected/ ), but you can manually move it to the accepted certificates folder (application_dir/pki/accepted ). A simple solution is to set this to true once on the first connection, then change it to false . |
username/password | Used for server sign-in. Leave username empty to use no authentication. |
x509-certificate | Specifies the configuration for using a signed x509 certificate to connect to the server. See the complete sample configuration file for the available sub-options. |
secure | Try to connect to an endpoint with security above None . |
ignore-certificate-issues | Ignore all suppressible certificate errors on the server certificate. You can use this setting if you receive errors such as Certificate use not allowed. CAUTION: This is potentially a security risk. Bad certificates can open the extractor to man-in-the-middle attacks from the server or similar. If the server security is located elsewhere (it's running locally, over a secure VPN, or similar), it's most likely fairly safe. Some errors aren't suppressible and must be remedied on the server. |
publishing-interval | Sets the interval (n milliseconds) between publishing requests to the server. This limits the maximum frequency of points pushed to CDF but not the maximum frequency of points on the server. In most cases, this can be set to the same as Extraction.DataPushDelay . If you set it to 0 , the server chooses the interval according to the specification. |
force-restart | If true, the OPC UA extractor won't attempt to reconnect using the OPC UA reconnect protocol on a disconnect from the server, but will restart completely. Use this option for servers that don't support reconnecting. |
exit-on-failure | If true , the OPC UA extractor won't automatically restart after a crash, but defer to some external mechanism. |
restart-on-reconnect | If true , the OPC UA extractor will be restarted on reconnect. This may not be required if the server is expected to be static and if it handles reconnects well. Setting this to true lowers restart times. |
keep-alive-interval | Specifies the interval in milliseconds between each keep-alive request to the server. The connection times out if a keep-alive request fails twice (2 * interval + 100ms). This typically happens if the server hangs on a heavy operation and doesn't manage to respond to keep-alive requests or if the server goes down. In the first case, waiting can be a good option. In the second case, it's better to time out quickly. |
node-set-source | Read from NodeSet2 files instead of browsing the OPC UA node hierarchy. This is useful for smaller servers, where the full node hierarchy is defined. In general, it can be used to lower the load on the server if parts of the hierarchy are known beforehand. See the complete sample configuration file for the available sub-options. |
limit-to-server-config | The default value true uses the Server_ServerCapabilities object to limit chunk sizes. Set this to false only if you want to set the limits higher and are certain that the server is reporting the wrong limits. If the real server limits are exceeded, the extractor will typically crash. |
alt-source-background-browse | If true , browses the OPC UA node hierarchy in the background when reading nodes from NodeSet files or from CDF RAW. This setup doesn't reduce the load on the server but can speed up startup. |
browse-chunk | Sets the maximum number of desired results from each call of the Browse service to OPC UA. Most servers have some limits, but the default of 1000 is usually reasonable. The server should also usually limit this on its own. |
browse-nodes-chunk | Sets the maximum number of nodes to browse per browse service call. If set too high, the browse operation may fail. Most servers have an upper limit to the number of operations per service call, and this value may also affect speed. We don't recommend setting this to 1, but it may be necessary for some servers. |
attributes-chunk | Specifies the maximum number of attributes to fetch per operation. If the server fails with a TooManyOperations exception during attribute read, it may help to lower this value. 1000 should be fine for most servers and may even be set higher for higher-spec servers. For very large servers, 1000 will take a long time, and this should be set as high as possible, even if that requires increasing the keep-alive-interval. |
subscription-chunk | Sets the maximum number of new MonitoredItems to create per operation. If the server fails with TooManyOperations , try to lower this value. Unless there are a large number of nodes on the server, 1000 per chunk is generally fine. |
browse-throttling | Configuration object for throttling browses. See the complete sample configuration file for the available sub-options. |
certificate-expiry | Specifies the default certificate expiration in months. You can also replace the certificate with your own by modifying the .xml configuration file. Defaults to 5 years as of v2.5.3. |
retries | Specify the retry policy for requests to the OPC UA server. See the complete sample configuration file for the available sub-options. |
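A sketch of a source section combining some of the options above; the endpoint URL and the values are placeholders to adapt to your server:

```yaml
source:
  # Discovery URL of the server. The extractor picks the most secure endpoint.
  endpoint-url: opc.tcp://myserver:4840
  # Require an endpoint with security above None
  secure: true
  # Accept the server certificate automatically on first connect,
  # then consider setting this back to false.
  auto-accept: true
  # Publishing interval in milliseconds
  publishing-interval: 500
  # Keep-alive request every 5 seconds
  keep-alive-interval: 5000
```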
History
The OPC UA extractor supports reading from data and event history in OPC UA. For data, the Historizing attribute must be set on the nodes to be read. For events, you must specify explicitly the node IDs of the emitters in the configuration.
Parameter | Description |
---|---|
enabled | Set to false to disable history read. This overrides all other history configurations and disables these entirely for both events and data points. |
data | Set to false to disable history for data points. The default value is true . Use this to only enable history for events. |
backfill | Enable backfill, meaning that data is read both backward and forward through history. This lets the extractor start streaming live values without first completing the history read, which is useful if there is a lot of history. If set to false (the default), the behavior is as before version 1.1: data is read from the beginning of history to the end before any live streaming begins. |
require-historizing | Set to true to require Historizing to be set on time series to read history. |
restart-period | Time in seconds to wait between each restart of history. Setting this too low may impact performance. Leave at 0 to disable periodic restarts. The syntax is described in Timestamps and intervals; this option also allows cron expressions. |
data-chunk | Maximum number of results to request per HistoryRead call when reading variables. Generally, this is limited by the server, so it can safely be set to 0 . |
data-nodes-chunk | Maximum number of nodes to query per HistoryRead call when reading variables. If Granularity is set, this is applied afterward. |
event-chunk | Maximum number of results to request per HistoryRead call when reading events. Generally, this is limited by the server, so it can safely be set to 0 . |
event-nodes-chunk | Maximum number of nodes to query per HistoryRead call when reading events. |
granularity | Granularity in seconds for chunking history read operations. Variables with the latest timestamp within the same chunk have their history read together. Reading more variables per operation is more efficient, but if the granularity is set too high, then a large number of duplicates are fetched. This can be inefficient for very large granularities. The best choice for this value is a few times the expected update frequency of your variables. The syntax is described in Timestamps and intervals. |
start-time | Earliest timestamp to read from in milliseconds since January 1, 1970. The syntax is described in Timestamps and intervals, -ago can be added to make a timestamp in the past. |
end-time | Timestamp to be considered the end of forward history. Only relevant if max-read-length is set. In milliseconds since 1/1/1970. The default is the current time, if this is 0. The syntax is described in Timestamps and intervals, -ago can be added to make a timestamp in the past. |
ignore-continuation-points | Set to true to attempt to read history without using ContinuationPoints, instead using the Time of events and SourceTimestamp of data points to incrementally change the start time of the request until no points are returned. |
max-read-length | Maximum length of each read of history, in seconds. If this is set greater than zero, history is read in chunks of at most this size until the end. This can potentially take a very long time if end-time is much larger than start-time . The syntax is described in Timestamps and intervals. |
throttling | Configuration object for throttling history reads. See the complete sample configuration file for the available sub-options. |
log-bad-values | Log bad history data points: counts per read at debug level, and each individual data point at verbose level. The default value is true . |
error-threshold | The threshold in percent for a history run to be considered failed. For example, if this is set to 10.0 , the history read will be considered failed if more than 10% of nodes fail to read at some point. Retries still apply. This only applies to nodes that fail even after retries. This is safe in terms of data loss. A node that has failed during history will not receive state updates from streaming. |
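As an example, a history section enabling backfill of recent data might look like this. The values are illustrative, not recommendations:

```yaml
history:
  enabled: true
  # Read backward and forward through history simultaneously
  backfill: true
  # Let the server decide the number of results per read
  data-chunk: 0
  # Read up to 100 variables per HistoryRead call
  data-nodes-chunk: 100
  # Group variables with latest timestamps within 30 seconds of each other
  granularity: 30s
  # Only read the last week of history
  start-time: 7d-ago
```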
Dry run
The dry-run option is on the top level. If this is set to true, the extractor reads from OPC UA but doesn't push anything to CDF. This is useful for debugging the extractor setup.
Cognite - CDF API
Configuration for pushing directly to the CDF API.
Parameter | Description |
---|---|
project | The CDF project. Required. Can be left out if the OPC UA extractor is set to debug mode. |
host | The CDF service URL. |
read-extracted-ranges | Specifies whether to read start/end points on startup, where possible. At least one pusher should be able to do this. Otherwise, the back/frontfill will run for the entire history on every restart. The CDF pusher can't read start/end points for events, so if reading historical events is enabled, another pusher able to do this should be enabled. If the server has a lot of variables, this can be extremely slow, and we recommend using the state-store instead. |
data-set-id | The internal ID of the CDF data set to be used for all new time series, assets, and events. Already created items won't be affected. |
data-set-external-id | The data set to use for new objects, overridden by data-set-id . Requires the capability datasets:read for the given data set. |
nan-replacement | Replacement value for values that are non-finite, for instance NaN, +Infinity, and -Infinity. If this is left empty, these points are ignored. |
metadata-targets | Configuration for targets for metadata, meaning assets, time series metadata, and relationships. |
metadata-targets/clean | Configuration for enabling writing metadata to clean. See the complete sample configuration file for the available sub-options. |
metadata-targets/raw | Configuration for writing metadata to CDF RAW. See the complete sample configuration file for the available sub-options. |
raw-metadata | Configuration for using CDF RAW to store assets and time series metadata. This is deprecated in favor of cognite.metadata-targets. |
raw-node-buffer | Read from CDF instead of OPC UA when starting the extractor to speed up starting on slow servers. This requires extraction.expand-node-ids and extraction.append-internal-values to be set to true . Generally, this would be enabled along with skip-metadata or raw-metadata. Reading from CDF RAW into clean using this is generally not supported. If browse-on-empty is set to true and raw-metadata is configured with the same database and tables, the extractor will read from the server on first startup only, then use CDF RAW for all further reads. With this enabled, rebrowse/updates are generally pointless. |
metadata-mapping | Contains two string/string maps named assets and timeseries. It lets you define mappings between properties in OPC UA and CDF attributes. For example, it's quite common for variables in OPC UA to have an EngineeringUnits field, which ideally should be mapped to unit in CDF. This can be done with timeseries: EngineeringUnits: unit. Valid attributes are name , description , and parentId , and additionally unit for time series. parentId must be the parent external ID of the time series, and it must be an asset mapped by the OPC UA extractor. It may be a string ID or a node ID. |
skip-metadata | If true , assets won't be written to CDF, and only basic time series will be created. This is the same as when raw-metadata is enabled, except that nothing will be pushed to CDF RAW either. This is deprecated in favor of cognite.metadata-targets. |
idp-authentication | Configuration for authentication using a bearer access token. See OAuth 2.0 client credentials flow. Required fields are client-id , tenant , secret , and scopes . min-ttl is an optional minimum time-to-live in seconds for the token; the default value is 30 . The authentication flow is inferred from whether you enter tenant or token-url ; you can only set one. If you set tenant , MSAL is used for authentication. If you set token-url , basic authentication is used. authority is the identity provider endpoint; the default is https://login.microsoftonline.com/ . |
cdf-retries | Configure automatic retries on requests to CDF. See the complete sample configuration file for the available fields. Note that long retry periods can delay the point at which failure-buffering starts, which may be necessary if there is a lot of data. |
cdf-chunking | Configure chunking of data on requests to CDF. Note that some of these values reflect actual limits in the API, and increasing them may cause requests to fail. See https://docs.cognite.com/api/v1/ and the complete sample configuration file for the available fields. |
cdf-throttling | Configure how requests to CDF should be throttled. Each entry is the maximum allowed number of parallel requests to CDF. Fields: timeseries , assets , datapoints , raw , ranges (first/last data point), and events . |
sdk-logging | Configuration for logging using the .NET SDK. This provides additional debug information about requests, showing in detail which requests fail and how long they take. See the complete sample configuration file for the available fields. |
extraction-pipeline | Configure an extraction pipeline manager. The pipeline must be created beforehand. See the complete sample configuration file for the available fields. |
browse-callback | Call a Cognite Function with the number of assets, time series, and relationships created and updated after each browse and rebrowse operation. The function is called with a JSON object describing the result of the operation. Requires the capability functions:WRITE scoped to the function given by external ID or ID, and functions:READ if an external ID is used. This is a YAML object; see the complete sample configuration file for the available fields. |
delete-relationships | If this is set to true , relationships deleted from the source will be hard-deleted in CDF. |
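For instance, a cognite section using a data set together with the metadata-mapping option described above might look like this. The data set external ID is a placeholder:

```yaml
cognite:
  project: '${COGNITE_PROJECT}'
  # Data set applied to new time series, assets, and events
  data-set-external-id: my-data-set
  # Map the OPC UA EngineeringUnits property to the time series unit attribute
  metadata-mapping:
    timeseries:
      EngineeringUnits: unit
```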
Influx
Configuration for pushing to an InfluxDB database. Data points and events will be pushed, but no context or metadata.
Parameter | Description |
---|---|
host | The URL of the InfluxDB server. |
username | The username for connecting to the database. |
password | The password for connecting to the database. |
database | The database to connect to on the server. The database won't be created automatically. |
read-extracted-ranges | Whether to read start/endpoints on startup, where possible. |
read-extracted-event-ranges | Whether to read start/endpoints for events on startup, where possible. |
point-chunk-size | Maximum number of points per push. Try to increase if the pushing seems to be slow. |
non-finite-replacement | Replacement value for values that are non-finite, e.g. NaN, +Infinity and -Infinity. Leave empty to ignore these points. |
MQTT
The MQTT pusher pushes to CDF one-way over MQTT. It requires that the MQTTCDFBridge application is running somewhere with access to CDF.
Parameter | Description |
---|---|
host | The address of the TCP MQTT broker. This needs to be running for the pusher to function. |
port | The port on the TCP MQTT broker. |
username | The MQTT broker username. Leave empty to connect without authentication. |
password | The MQTT broker password. Leave empty to connect without authentication. |
client-id | The MQTT Client ID. This needs to be unique for each broker. |
data-set-id | The internal ID of the CDF data set to be used for all new time series, assets, and events. Already created items won't be affected. |
asset-topic | The topic to use for assets. Needs to match the configuration of MQTTCDFBridge (it does by default). |
ts-topic | The topic to use for time series. |
event-topic | The topic to use for events. |
datapoint-topic | The topic to use for data points. |
raw-topic | The topic to use for raw rows. |
local-state | Set to enable storing a list of created assets/time series in a local database. Requires the StateStorage.Location property to be set. The value of this option is the table name. The default value is empty. Using this with raw state-storage doesn't make sense. |
invalidate-before | Timestamp in ms since epoch to invalidate stored states. Any objects created before this will be replaced the next time the OPC UA extractor is restarted. |
non-finite-replacement | The replacement value for values that are non-finite e.g. NaN, +Infinity and -Infinity, or not between -10^100 and 10^100. If this is left empty, these points are ignored. |
raw-metadata | Configuration for using CDF RAW to store assets and time series metadata. |
raw-metadata/database | The CDF RAW database to store metadata in, required for this feature to be enabled. |
raw-metadata/assets-table | The CDF RAW table to store assets in. If this is set along with database, assets aren't pushed to the asset hierarchy but instead written to RAW. Time series won't be contextualized in this case, but if timeseries-table is set, the asset external ID will be stored there. The assets are pushed as full asset JSON objects with all the data available from extraction. |
raw-metadata/timeseries-table | The CDF RAW table to store time series in. If this is set along with database, time series are pushed with minimum information (isStep , isString , externalId ). Everything else is stored in CDF RAW as full time series JSON objects. |
metadata-mapping | Contains two string/string maps named assets and timeseries. It lets you define mappings between properties in OPC UA and CDF attributes. For example, it's quite common for variables in OPC UA to have an EngineeringUnits field, which ideally should be mapped to unit in CDF. This can be done with timeseries: EngineeringUnits: unit. Valid attributes are name , description , and parentId , and additionally unit for time series. parentId must be the parent externalId of the time series, and it must be an asset mapped by the OPC UA extractor. It may be a string ID directly or a node ID. |
skip-metadata | If true , assets won't be written to CDF, and only basic time series will be created. This is the same as when raw-metadata is enabled, except that nothing will be pushed to CDF RAW either. |
allow-untrusted-certificates | If true , allow untrusted certificates when connecting to the MQTT broker. This is a security risk. We recommend using custom-certificate-authority instead. |
custom-certificate-authority | Path to a custom certificate file for a certificate authority the broker SSL certificate will be verified against. |
Logger
Log entries are one of `Fatal`, `Error`, `Warning`, `Information`, `Debug`, or `Verbose`, in order of decreasing importance. Each level also includes all levels of higher importance.
Parameter | Description |
---|---|
console/level | The level of messages to write to console. If not present, or invalid, logging to console is disabled. One of fatal , error , warning , information , debug , or verbose . |
file/level | The level of messages to write to file. If not present, or invalid, logging to file is disabled. One of fatal , error , warning , information , debug , or verbose . |
file/path | The path to a log file, logs are rotated. |
file/retention-limit | The maximum number of logs to keep in log folder. The oldest are deleted. |
file/rolling-interval | A rolling interval for log files. Either day or hour . The default value is day . |
ua-trace-level | Capture OPC-UA tracing at this level or above. One of fatal , error , warning , information , debug , or verbose . This parameter is optional. |
ua-session-tracing | Log data sent to and received from the OPC UA server. |
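A sketch of a logger section using the options above; the file path is a placeholder:

```yaml
logger:
  console:
    level: information
  file:
    level: debug
    path: logs/opcua-extractor.log
    # Keep at most 31 rotated log files
    retention-limit: 31
    rolling-interval: day
```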
StateStorage
A local LiteDb database or a table in CDF RAW that stores various persistent information between runs. It can be used as a replacement for reading first/last data points from CDF, and also allows storing first/last times for events.
Parameter | Description |
---|---|
location | The path to the .db file used for storage, or the name of the CDF RAW database. |
interval | The time between each time the state store is updated. Use syntax described in Timestamps and intervals. Defaults to 10s . |
database | Which type of database to use. Valid options are None , Raw , LiteDb . |
variable-store | The name of the table or litedb collection to store information about extracted OPC UA variables. |
event-store | The name of the table or litedb collection to store information about extracted events. |
influx-variable-store | The name of the table or litedb collection to store information about variable ranges in influxdb failure buffer. |
influx-event-store | The name of the table or litedb collection to store information about event ranges in influxdb failure buffer. |
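For example, a state store backed by a local LiteDb file could be configured as follows. The table names are illustrative; see the complete sample configuration file for the exact layout:

```yaml
state-storage:
  # Path to the .db file (or the name of a CDF RAW database)
  location: state.db
  database: LiteDb
  # Update the store every 10 seconds
  interval: 10s
  variable-store: variable_states
  event-store: event_states
```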
FailureBuffer
If the connection to a destination goes down, the OPC UA extractor supports buffering data points and events in influxdb or a local file. This is helpful if the connection is unstable.
Parameter | Description |
---|---|
datapoint-path | The path to the binary file where data points are buffered. Leave empty to disable pushing data points to file. Buffering to file is very fast, and is generally hardware bound. |
enabled | Set to true to enable the FailureBuffer for all pushers. |
event-path | The path to the binary file where events are buffered. Leave empty to disable pushing events to file. |
influx | Set to true to enable buffering in influxdb. This requires influxdb to be running. This serves as an alternative to a local file, but should only be used if pushing to influxdb is required for other reasons. |
influx-state-store | Set to true to enable storing the state of the influxdb buffer to a local database. This makes the influxdb buffer persistent even if the OPC UA extractor stops before it's emptied. Requires the StateStorage.Location option to be set. |
max-buffer-size | Sets the maximum size in bytes for the buffer file. If the file exceeds this size, no new data points or events will be written to their respective buffer files, and any further ephemeral data is lost. Note that if both data point and event buffers are enabled, the potential disk usage is twice this number. |
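As an illustration, buffering data points and events to local files could be configured like this. The file paths are placeholders:

```yaml
failure-buffer:
  enabled: true
  # Binary files used while destinations are unavailable
  datapoint-path: buffer/datapoints.bin
  event-path: buffer/events.bin
  # Stop writing when a buffer file reaches roughly 50 MB
  max-buffer-size: 50000000
```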
Metrics
The OPC UA extractor can push some metrics about usage to a Prometheus pushgateway server.
Parameter | Description |
---|---|
server/host | The hostname for a locally hosted Prometheus server, used for scraping. |
server/port | The port used for a locally hosted Prometheus server. |
push-gateways | A list of pushgateway configurations. The OPC UA extractor will periodically push to each of these in turn. |
push-gateways/host | The pushgateway URL root. For example, the host my.prometheus.server and the job myjob give the final endpoint my.prometheus.server/metrics/jobs/myjob . |
push-gateways/job | The job to use in the destination. |
push-gateways/username | The username for the Prometheus target. |
push-gateways/password | The password for the Prometheus target. |
nodes | Use to treat certain OPC UA nodes as metrics. See the complete sample configuration file for the available fields. |
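A sketch of a metrics section pushing to a single pushgateway; the host, job, and credentials are placeholders:

```yaml
metrics:
  server:
    host: localhost
    port: 9000
  push-gateways:
    - host: https://my.prometheus.server
      job: opcua-extractor
      username: prom-user
      password: prom-pass
```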
Extraction
Contains configuration settings for most extraction options, such as mapping, datatypes, and filters.
External ID generation
IDs used in OPC UA are special `nodeId` objects with an identifier and a namespace that need to be converted to a string for destination systems. However, a direct conversion has several problems:
- It will use the namespaceIndex, which isn't necessarily preserved between server restarts.
- The namespace table may be modified, in which case all old nodeIds are invalidated.
- NodeIds aren't unique between OPC UA servers and frequently just count up from 1, which makes reading from multiple OPC UA servers impossible.
- Node identifiers can be duplicated on different namespaces.
The solution is a nodeId on the following form:

`IdPrefix + namespace + identifiertype (i, s, g, etc.) + = + identifier value as string (+ [index in array if applicable])`

For example, the node with nodeId (`SomeId`, `http://my.namespace.url`), using the ID prefix `gp:`, will be mapped to `gp:http://my.namespace.url:i=SomeId`. You can specify a namespace mapping in extraction/namespace-map to, for example, turn this into `gp:mnu:i=SomeId`. If it's an array, it turns into an object with the above ID, plus several time series with IDs like `gp:mnu:i=SomeId[1]`.
Alternatively, you can manually override each `nodeId`.
Parameter | Description |
---|---|
id-prefix | Prefix used when generating externalIds from NodeIds . |
ignore-name-prefix | DEPRECATED, use transformations. List of strings used to filter out prefixes on the DisplayName of nodes during browsing. This means that children of these nodes are also filtered out. |
ignore-name | DEPRECATED, use transformations. List of full DisplayNames to ignore instead of just a prefix. |
data-push-delay | Time between each push to destinations, in ms. The syntax is described in Timestamps and intervals. |
root-node | A single ProtoNodeId (as described above) used as the origin of the browse. An empty ProtoNodeId (no identifier or no namespace) is treated as the Objects folder. Combined with root-nodes, if specified. If neither root-node nor root-nodes is specified, this defaults to the Objects folder. |
root-nodes | A list of ProtoNodeIds to use as root nodes when browsing. These will generally be created as root assets in CDF. If a node set as root node is discovered as a descendant of another root node, it will be ignored, but it's best to avoid this situation entirely. |
node-map | Map from strings, representing externalIds, to ProtoNodeIds . This can be used to override the externalIds, for example to place the hierarchy as children of an asset in CDF. For example, if UaRoot is set to the same value as the RootNode , all the nodes in the tree will be placed as children of the node with externalId UaRoot. |
namespace-map | Used as described above to map namespaces to shortened identifiers. |
data-types | Sub-object containing configuration for how data types and arrays should be handled by the OPC UA extractor. |
data-types/custom-numeric-types | Used to manually set types in OPC UA to be numeric. This can be used to make custom types be treated as numbers, etc. The conversion is done with the C# Convert functionality. If no valid conversion exists, this will fail. |
data-types/ignore-data-types | List of ProtoNodeId (as described above), describing data types on variables to filter out. |
data-types/unknown-as-scalar | Assume variables with non-specific ValueRanks in OPC UA (ScalarOrOneDimensions and Any ) are scalar if they do not have ArrayDimensions set. If such a variable produces an array, only the first element will be mapped to CDF. To properly extract arrays to CDF, ArrayDimensions must be set. |
data-types/max-array-size | Maximum length of arrays to be mapped to destinations. If this is set to 0 , only scalar values are mapped. Each array-type variable in the source system is converted to an object in the destination system, then each entry in the array is added as a child variable of that object. (In CDF this will mean that you get an asset with the externalId corresponding to the original variable, with time series for each entry in the array.) This requires the ArrayDimensions property to be set and be of length 1. |
data-types/allow-string-variables | Set to true to map variables of non-numeric types to strings in destination systems. |
data-types/auto-identify-types | Map out the data type hierarchy before starting. This is useful if there are custom or enum types, and is necessary for enum metadata and for enums-as-strings to work. If set to false , any custom numeric types must be added manually. This causes some extra work on startup. |
data-types/enums-as-strings | If set to false and auto-identify-types is set to true (or there are manually added enums in custom-numeric-types), enums will be mapped to numeric time series, and labels are added as metadata fields. If set to true , enums will be mapped to string time series with values equal to the mapped labels, and labels aren't added to metadata. |
data-types/data-type-metadata | Add a metadata property dataType which contains the name or ID of the OPC UA data type. Built-in types can always be mapped to a name; custom types require auto-identify-types to be set to true . |
data-types/null-as-numeric | Treat null data types as numeric. This can be useful on servers without string variables and faulty data types. |
data-types/expand-node-ids | Add attributes such as NodeId , ParentNodeId , and TypeDefinitionId to nodes in CDF RAW , as full NodeIds encoded reversibly. |
data-types/append-internal-values | Add internal attributes like ValueRank , ArrayDimensions , AccessLevel , and Historizing to nodes in CDF RAW . |
data-types/estimate-array-sizes | If max-array-size is set, the extractor looks for the MaxArraySize property on each node with a one-dimensional ValueRank. If this isn't found, it also tries to read the value and use its current size. ArrayDimensions is still the preferred way to identify array sizes; this option isn't guaranteed to generate reasonable or useful values. |
auto-rebrowse-period | Time in minutes between each automatic re-browse of the node hierarchy. Since only new nodes are pushed to destinations, this is usually quite fast. The syntax is described in Timestamps and intervals; this option also accepts cron expressions. |
enable-audit-discovery | The OPC UA extractor listens to AuditAddNodes and AuditAddReferences events on the server node, then uses the information in these to browse the hierarchy. This is more efficient than browsing periodically, but requires server support for auditing. |
map-variable-children | By default, children of variables are treated as properties. If this is set to true , they can be treated as objects or variables instead. This will cause some variables to be mapped to both time series and assets, to allow time series to have time series children. |
update | Update data in destinations on re-browse or restart. Set auto-rebrowse-period to some value to do this periodically. Consists of two objects, objects and variables, controlling updates of assets and time series, respectively. For each, name, description, context, and metadata can be configured separately. context refers to the structure of the node graph in OPC UA (assetId and parentId in CDF). metadata refers to any information obtained from OPC UA properties (metadata in CDF). Enabling any of these will increase the startup and re-browse time of the OPC UA extractor; enabling metadata will increase it the most. |
relationships | Map OPC UA non-hierarchical references to relationships in CDF. The generated relationships will have external IDs of the form [prefix][reference type name (or inverse-name)];[namespace source][id source];[namespace target][id target] . Only relationships between mapped nodes will be added. This may be relevant if the server contains functional relationships, like connected components, a non-hierarchical reference based system for location, etc. |
relationships/enabled | Enable mapping non-hierarchical relationships to CDF. This is also required for any kind of relationship mapping to occur at all. |
relationships/hierarchical | Map hierarchical references to relationships in CDF. |
relationships/inverse-hierarchical | Create inverse relationships for each hierarchical reference. For efficiency these are inferred, not read. |
node-types | Config related to mapping object- and variable-types to destinations. |
node-types/metadata | Add the TypeDefinition as a metadata field to all nodes. |
node-types/as-nodes | Allow discovered types to be treated as nodes and mapped to CDF assets. Requires these to be inside the browsed hierarchy; one solution is to specify the Types folder as a root node. |
transformations | A list of transformations to be applied to the source nodes before pushing. The possible transformations are:
|
rebrowse-triggers | Configure the extractor to trigger a rebrowse of the server when there are changes to specific namespace metadata nodes. Options:
|
deletes | Configuration for soft deletes. When this is enabled, all read nodes are written to a state store after browsing. Nodes that are missing on subsequent browses are marked as deleted in CDF with a configurable marker. A notable exception is relationships in CDF, which have no metadata to hold the marker; these are hard-deleted if cognite.delete-relationships is enabled. Options:
|
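As a sketch of how several of the options above combine, the extraction section below shows one plausible configuration. The node ID, namespace URI, and other values are illustrative placeholders, not defaults; adjust them to your server.

```yaml
extraction:
  # Global externalId prefix
  id-prefix: 'gp:'
  # Placeholder root node; namespace-uri and node-id are example values
  root-nodes:
    - namespace-uri: 'urn:example:namespace'
      node-id: 'i=85'
  data-types:
    # Map non-numeric variables to string time series
    allow-string-variables: true
    # Map arrays with up to 4 elements; requires ArrayDimensions on the server
    max-array-size: 4
    # Map the data type hierarchy on startup, for custom and enum types
    auto-identify-types: true
  # Re-browse the node hierarchy every 10 minutes
  auto-rebrowse-period: 10m
  relationships:
    # Map non-hierarchical references to CDF relationships
    enabled: true
    hierarchical: false
```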
Subscriptions
A few options for subscriptions to events and data points. Subscriptions in OPC UA consist of Subscription objects on the server, which contain a list of MonitoredItems. By default, the extractor produces a maximum of four subscriptions:
- DataChangeListener - handles data point subscriptions.
- EventListener - handles event subscriptions.
- AuditListener - handles audit events.
- NodeMetrics - handles subscriptions used as metrics.
Each of these can contain a number of MonitoredItems.
Parameter | Description |
---|---|
data-points | The default value is true . Enables subscriptions on data points. |
events | The default value is true . Enables subscriptions for events. |
data-change-filter | Modify the DataChangeFilter used for data point subscriptions. See OPC UA reference part 4 7.17.2 for details. These are passed to the server in DataChangeListener.
|
ignore-access-level | Ignore the AccessLevel attribute and subscribe to all Variables, reading history from all nodes with Historizing set to true . This is the pre-2.3 behavior. |
log-bad-values | Log bad subscription data points. |
sampling-interval | Sets the sample rate of subscriptions on the server. The server usually defines a set of permitted sample-rates and picks the closest to what you specify here. Many servers don't support more than a single sample rate. Set the interval to 0 to use the server default. This setting generally sets the maximum rate of points from the server (in milliseconds). On many servers, sampling is an internal operation, but on some, this may access external systems. Setting this very low can increase the load on the server significantly. It typically limits the density of the points from the server, but not always. |
queue-length | Specifies the length of the internal server queue for points and events. Normally, this can be set to the same as publishing-interval/sampling-interval. Higher numbers increase the strain on the server. Many servers have a limited maximum queue size or ignore this parameter entirely and use a fixed size for everything. |
keep-alive-count | The number of publish requests without a response before the server should send a keep-alive message. Default 10. |
lifetime-count | The number of publish requests without a response before the server should close the subscription. Must be at least 3 * keep-alive-count . Default 1000. |
alternative-configs | List of alternative subscription configurations. The first entry with a matching filter will be used for each node. Contains data-change-filter, sampling-interval, and queue-length, as well as filter, which contains the following fields:
|
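A minimal subscription configuration might look like the sketch below; the values are illustrative, not recommendations, and the server may clamp or ignore some of them as described above.

```yaml
subscriptions:
  data-points: true
  events: true
  # Request samples every 100 ms; 0 uses the server default
  sampling-interval: 100
  # Server-side queue length per monitored item
  queue-length: 10
  keep-alive-count: 10
  # Must be at least 3 * keep-alive-count
  lifetime-count: 1000
```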
Events
Events in OPC UA are usually custom types, and servers that support events often have a large number of them active. In OPC UA, any node may set the EventNotifier
attribute, which indicates that it emits events and optionally stores historical events.
By default, all events will be read. If all-events is set to false, only events that do not belong to the base namespace will be read.
The attributes of each event are automatically mapped out, and a few general properties are filtered out. Others may be used as metadata in CDF or other destination systems, or in some cases be mapped directly to event properties.
If the event has a SourceNode
that refers to a node in the mapped hierarchy, it will be used to set the assetId
property on the event in CDF.
The old options event-ids, emitter-ids, and historizing-emitter-ids are deprecated, but will still work and may be used as a workaround for servers that aren't fully compliant with the OPC UA standard.
Parameter | Description |
---|---|
enabled | True to enable reading events from the server. If this is false , no events will be read. |
history | True to enable reading historical events. |
all-events | True to read all events, not just custom events. The default value is true . |
read-server | True to also check the server node when looking for event emitters. The default value is true . |
exclude-event-filter | Regex filter on event type DisplayName ; matching event types won't be extracted. |
exclude-properties | List of BrowseNames for properties of events to be excluded from metadata or other consideration. By default, only Time and Severity are used from the BaseEventType , all properties of subtypes are included. |
destination-name-map | Map source browse names to other values in the destination. For CDF, internal properties may be overwritten; by default, Message is mapped to description, SourceNode is used for context, and EventType is used for type. These may also be excluded or replaced by overrides in DestinationNameMap . If multiple properties are mapped to the same value, the first non-null is used. If StartTime , EndTime , or SubType are specified, either directly or through the map, these are used as event properties instead of metadata. StartTime and EndTime should be either DateTime , or a number corresponding to the number of milliseconds since January 1, 1970. If neither StartTime nor EndTime is specified, both are set to the Time property of BaseEventType . Type may be overridden case-by-case using NodeMap in the Extraction configuration, or in a dynamic way here. If no Type is specified, it's generated from the event NodeId in the same way externalIds are generated for normal nodes. |
event-ids (deprecated) | List of ProtoNodeIds (as described above) to be mapped to destinations. Events must be ObjectTypes and subtypes of BaseEventType in the OPC UA hierarchy. An empty ProtoNodeId defaults to the BaseEventType . This serves as an allowlist. If not specified, all events will be extracted. |
emitter-ids (deprecated) | List of ProtoNodeIds used as emitters. An empty ProtoNodeId defaults to the server node. This allows specifying additional event emitters. This is used to add extra emitters that aren't in the extracted node hierarchy, or that don't correctly specify the EventNotifier attribute. |
historizing-emitter-ids (deprecated) | List of ProtoNodeIds that must be a subset of the EmitterIds . These emitters will have their event history read. The server must support this. The events.history option must be set for this to work. This is used to supplement the EventNotifier property, so that events that do not have the EventNotifier property set may still have their events read. Note that attempting to read historical events from non-historizing emitters may cause issues. |
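As an illustrative sketch of the non-deprecated options above, an events section restricted to custom events might look like this; the regex is a placeholder, not a recommended filter.

```yaml
events:
  enabled: true
  # Don't read historical events
  history: false
  # Read only custom events, skipping the base namespace
  all-events: false
  # Example regex; event types with a matching DisplayName are skipped
  exclude-event-filter: 'Diagnostics'
```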
Pub-Sub
This is an experimental feature that allows the extractor to subscribe to OPC UA PubSub for data points, instead of using OPC UA subscriptions. It requires the OPC UA server to be available and to expose the full PubSub configuration, as described in Part 14 of the OPC UA standard. It currently only supports MQTT.
Note that this doesn't disable subscriptions; you may want to consider setting subscriptions: data-points: false to avoid getting duplicate data points. Time series aren't created from the OPC UA PubSub configuration, but must be discovered in the OPC UA node hierarchy.
Parameter | Description |
---|---|
enabled | The default value is false . Enables pub-sub discovery. |
prefer-uadp | The default value is true . If set to true , the extractor prefers the UADP binary format when the same datasets are exposed through multiple DataSetWriters; if set to false , JSON is preferred. |
file-name | Save or read configuration from a file. If the file doesn't exist, it will be created from server configuration. If this is pre-created manually, the server doesn't need to expose pubsub configuration. |
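Putting this together, a pub-sub section might be sketched as below. The file path is an example, and as noted above you may want to disable regular data point subscriptions alongside it to avoid duplicates.

```yaml
pub-sub:
  enabled: true
  # Prefer the UADP binary format over JSON
  prefer-uadp: true
  # Example path; created from server configuration if it doesn't exist
  file-name: 'config/pubsub-config.json'
# Avoid duplicate data points from regular subscriptions
subscriptions:
  data-points: false
```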
High availability
The extractor can run with a rudimentary form of redundancy. Multiple extractors on different machines are on standby, with one actively extracting from the OPC UA server. Each extractor must have a unique index.
Parameter | Description |
---|---|
index | A unique index for this extractor. Indices must be unique, or high availability will not work correctly. |
raw | Use the CDF staging area as a shared store for the extractor. This configuration must be the same for each redundant extractor.
|
redis | Use a Redis store as shared state for the extractor. This configuration must be the same for each redundant extractor.
|
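A minimal sketch of a high-availability setup using the CDF staging area is shown below. The section key and the RAW field names are assumptions based on the sample configuration file, and the database and table names are placeholders; each redundant extractor would use the same raw block but a different index.

```yaml
high-availability:
  # Unique per extractor instance
  index: 1
  raw:
    # Example database/table names in CDF RAW; shared by all instances
    database-name: 'extractor-ha'
    table-name: 'state'
```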