readFiles
Source processor that consumes files over different protocols.
Keep in mind that flow message processing is concurrent. Also, messages that fail with transient errors will be redelivered later, out of order, alongside other concurrent messages. Using sorting as part of the eagerMaxMessagesPerPoll and/or sortBy options will only provide a fuzzy, best-effort ordering of the message processing.
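To illustrate why sorting gives only best-effort ordering, here is a minimal simulation of redelivery. It is a sketch under stated assumptions, not the actual scheduler: the function `simulate_processing` and the rule that a failed message is re-queued behind the messages already waiting are hypothetical, and real concurrency is not modeled at all.

```python
from collections import deque


def simulate_processing(files, transient_failures):
    """Return the order in which files finish processing.

    files: names in the order a sorted poll delivered them.
    transient_failures: names that fail once and are redelivered later.
    """
    queue = deque(files)
    pending_failures = set(transient_failures)
    completed = []
    while queue:
        name = queue.popleft()
        if name in pending_failures:
            # Assumption: the failed message is redelivered after the
            # messages that are already queued.
            pending_failures.discard(name)
            queue.append(name)
            continue
        completed.append(name)
    return completed


polled = sorted(["b.csv", "a.csv", "c.csv"])
print(simulate_processing(polled, {"a.csv"}))
```

Even though the poll delivered the files in sorted order, the file that failed once finishes last, so downstream logic should not depend on strict ordering.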
Properties
Name | Summary |
---|---|
|
The FileTransferProtocol to use. Can be one of the following:
|
|
Host name or IP. Not relevant for |
|
Port to override the default. Not relevant for |
|
Secret key that the server uses to look up the credentials needed to perform authentication. Not relevant for |
|
Location from where files are consumed. The last element of the path must be a directory. |
|
Time between the polls in seconds. Some polling strategies might require files to be present in two poll listings to ensure they’re not being modified, so the minimum time for reading a file will be greater than this value. Do not use this parameter for scheduling purposes. |
|
The maximum number of files to fetch per poll. Must be a positive integer. If not set, all available files will be polled. When polling with the |
|
Whether the poller will only list The typical use case for switching off eagerness is when you need to process files in |
|
Whether to poll again immediately when the last poll returned files instead of waiting for |
|
Location to move files to after successful consumption. Must be a directory. If not set, then the files will be deleted. |
|
Regex that limits file consumption to files whose names match. |
|
Regex that excludes files from consumption whose names match. Takes precedence over |
|
Optional FileSortingStrategy to use when consuming files. This might incur a performance penalty if the number of files is high. Can be one of the following:
|
|
Whether files in subdirectories below the given |
|
Time in milliseconds between checks of whether the file is still being written. For polling with the |
|
Whether to stepwise into directories while traversing file structures when downloading files. Defaults to Enabling stepwise incurs a performance penalty, since a |
|
Whether to check for the existence of (and update to) a file by listing the file itself or the parent folder containing the file. Defaults to Even though listing only the file is faster than listing all files in the parent folder, it may not be supported by all SFTP servers. |
|
Location to move files to when they could not be processed successfully. Only applies to flows with an exchange pattern of |
|
Optional configuration that splits the consumed files by line. Only applies to line-based text files (e.g., CSV files). If the consumed file is larger than the configured size, it will be split into fragments with the configured size. The splitting is line based, meaning the nearest previous line before the size of the fragment file crosses the threshold will be the last line in the fragment file. If an error occurs while the file consumer is processing the source file (during splitting or delivering fragments), then the source file will be placed in the Note that fragments will be delivered immediately during the splitting, so a redelivery of the source file can result in the duplicate delivery of fragments. |
|
Whether the header row from the original file should be copied to the fragments. Optional and only relevant when |
|
The file encoding to use when the file is to be split. Optional and only relevant when |
|
Whether the file endpoint should treat errors encountered while reading files as regular message processing errors. File reading can fail, for example, if the file exceeds the allowed maximum size. When this feature is active (the default), these kinds of errors will be treated just like any other message processing error happening in the flow pipeline. That is, they will be subject to regular error handling and reported as failed messages in the monitoring tools. If the feature is deactivated, the files will be rejected without any further processing. This property is optional, and |
|
Optional, descriptive name for the processor. |
|
Required identifier of the processor, unique across all processors within the flow. Must be between 3 and 30 characters long; contain only lower and uppercase alphabetical characters (a-z and A-Z), numbers, dashes ("-"), and underscores ("_"); and start with an alphabetical character. In other words, it adheres to the regex pattern |
|
Optional set of custom properties in a simple jdk-format that are added to the message exchange properties before processing the incoming payload. Any existing properties with the same name will be replaced by properties defined here. |
|
Whether the incoming payload is available for error processing on failure. Defaults to |
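The line-based splitting described in the table above can be sketched as follows. This is an illustrative approximation only: the function name `split_by_lines`, its parameters, and the choice to count a copied header row toward the fragment size are assumptions, not the processor's documented behavior.

```python
from __future__ import annotations


def split_by_lines(text: str, max_bytes: int, copy_header: bool = False,
                   encoding: str = "utf-8") -> list[str]:
    """Split line-based text into fragments of at most max_bytes each.

    A fragment always ends on a line boundary. A single line longer than
    max_bytes still becomes its own (oversized) fragment.
    """
    lines = text.splitlines(keepends=True)
    header = lines[0] if (copy_header and lines) else ""
    body = lines[1:] if header else lines
    if not body:
        return [text] if text else []

    # Assumption: the copied header counts toward each fragment's size.
    header_size = len(header.encode(encoding))
    fragments = []
    current, current_size = [], header_size
    for line in body:
        line_size = len(line.encode(encoding))
        # Close the fragment at the previous line if this line would
        # push it over the threshold.
        if current and current_size + line_size > max_bytes:
            fragments.append(header + "".join(current))
            current, current_size = [], header_size
        current.append(line)
        current_size += line_size
    fragments.append(header + "".join(current))
    return fragments


csv_text = "id,name\n1,a\n2,b\n3,c\n"
for fragment in split_by_lines(csv_text, max_bytes=16, copy_header=True):
    print(repr(fragment))
```

Because fragments are produced one at a time, a consumer that delivers each fragment as soon as it is built will have already sent the earlier fragments if a later step fails, which is why a redelivery of the source file can produce duplicates.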
Sub-builders
Name | Summary |
---|---|
Strategy for describing the external system integration. Optional. |
|
Strategy for describing how a processor’s message is logged on the server. |
|
Strategy for archiving payloads. |
|
Strategy that customizes the conversion of an incoming payload by a processor (e.g., string to object). Should be used when the processor’s default conversion logic cannot be used. |
Details
Authentication
The authenticationConfigKey property supports secrets of type UserNameAndPassword and SshPrivateKey.
See the Secret Types documentation for formatting details.