Config file reference

This reference lists every option that can be set in an Astra config file, along with recommended values.

nodeRoles

Defines the mode to operate a node in. Astra ships as a single binary to simplify deployments, and this configuration is used to select which operating mode to run in.

nodeRoles: [QUERY,INDEX,CACHE,MANAGER,RECOVERY,PREPROCESSOR]

QUERY

A query node provides an HTTP API for querying the cluster and aggregates results from indexer and cache nodes.

INDEX

An indexer node reads from Kafka to create a Lucene index on disk, which is then uploaded to S3.

CACHE

A cache node receives chunk assignments from the manager node and downloads the assigned chunks from S3.

MANAGER

The manager node is responsible for cluster orchestration.

RECOVERY

A recovery node indexes data that is skipped by an indexer, publishing snapshots.

PREPROCESSOR

A preprocessor node serves a bulk ingest HTTP API. Ingested data is transformed, rate limited, and sent to Kafka.

indexerConfig

Configuration options for the indexer node.

maxMessagesPerChunk

Maximum number of messages that are created per chunk before closing and uploading to S3. This should be roughly equivalent to the maxBytesPerChunk, such that a rollover is triggered at roughly the same time regardless of messages or bytes.

indexerConfig:
  maxMessagesPerChunk: 100000

maxBytesPerChunk

Maximum bytes that are created per chunk before closing and uploading to S3. This should be roughly equivalent to the maxMessagesPerChunk, such that a rollover is triggered at roughly the same time regardless of messages or bytes.

indexerConfig:
  maxBytesPerChunk: 1000000
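
One way to keep the two limits aligned is to derive maxBytesPerChunk from maxMessagesPerChunk and an average per-message size. The sketch below is only an illustration: the helper name and the avg_bytes_per_message input are assumptions, and the average should be measured from real chunks. The example values in this section imply roughly 10 bytes per message, which is illustrative only.

```python
def bytes_per_chunk(max_messages_per_chunk: int, avg_bytes_per_message: int) -> int:
    """Return a maxBytesPerChunk that is reached at about the same time as the message limit."""
    if max_messages_per_chunk <= 0 or avg_bytes_per_message <= 0:
        raise ValueError("both inputs must be positive")
    return max_messages_per_chunk * avg_bytes_per_message


# 100000 messages at an assumed average of 10 bytes each.
print(bytes_per_chunk(100_000, 10))  # prints 1000000
```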

maxTimePerChunkSeconds

Maximum time that a chunk can be open before closing and uploading to S3. Defaults to 90 minutes. This configuration is useful for ensuring that chunks are uploaded to S3 within a set time frame, for example during non-peak hours when neither maxMessagesPerChunk nor maxBytesPerChunk is reached for several hours.

indexerConfig:
  maxTimePerChunkSeconds: 5400
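
The three chunk limits can be read as "whichever is reached first closes the chunk". The following is a minimal sketch of that reading, not Astra's implementation. The ChunkLimits class and should_rollover function are hypothetical names, and treating each limit as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ChunkLimits:
    # Defaults mirror the example values shown in this section.
    max_messages_per_chunk: int = 100_000
    max_bytes_per_chunk: int = 1_000_000
    max_time_per_chunk_seconds: int = 5400  # 90 minutes


def should_rollover(limits: ChunkLimits, messages: int, size_bytes: int, open_seconds: int) -> bool:
    """Return True as soon as any one of the three limits is reached."""
    return (
        messages >= limits.max_messages_per_chunk
        or size_bytes >= limits.max_bytes_per_chunk
        or open_seconds >= limits.max_time_per_chunk_seconds
    )


# A quiet chunk that has been open for 90 minutes still rolls over.
print(should_rollover(ChunkLimits(), messages=1_200, size_bytes=15_000, open_seconds=5400))  # prints True
```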

luceneConfig

indexerConfig:
  luceneConfig:
    commitDurationSecs: 30
    refreshDurationSecs: 10
    enableFullTextSearch: false

commitDurationSecs

How often Lucene commits to disk. This value will impact the accuracy of calculating the chunk size on disk for rollovers, so should be set conservatively to keep chunk sizes consistent.

refreshDurationSecs

How often Lucene refreshes the index, or makes results visible to search.

enableFullTextSearch

Indexes the contents of each message to the _all field, which is used as the default query field when the user does not specify one. This enables queries such as value, instead of specifying the field name explicitly as in field:value.

staleDurationSecs

indexerConfig:
  staleDurationSecs: 7200

How long a stale chunk (a chunk no longer being written to) can remain on an indexer before being deleted. If the indexerConfig.maxChunksOnDisk limit is reached before this duration elapses, the chunk will be removed.

dataDirectory

indexerConfig:
  dataDirectory: /mnt/localdisk

Path of data directory to use. Generally recommended to be instance storage backed by NVMe disks, or a memory mapped storage like tmpfs for best performance.

maxOffsetDelayMessages

indexerConfig:
  maxOffsetDelayMessages: 300000

Maximum number of messages that the indexer can lag behind on startup before creating an async recovery task. If the current message lag exceeds this value, the indexer will immediately start indexing at the current time, and create a task for a recovery node to index the range from the last persisted offset to where the indexer started.
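
The startup decision described above can be sketched as follows. This is a hedged illustration rather than Astra's code: plan_startup and its return shape are hypothetical, and the exact offset boundaries (inclusive or exclusive) are assumptions.

```python
def plan_startup(last_persisted_offset: int, latest_offset: int, max_offset_delay_messages: int) -> dict:
    """Decide where the indexer starts and whether a recovery task is needed."""
    lag = latest_offset - last_persisted_offset
    if lag > max_offset_delay_messages:
        # Too far behind: start at the head and hand the gap to a recovery node.
        return {
            "start_offset": latest_offset,
            "recovery_task": (last_persisted_offset, latest_offset),
        }
    # Within the allowed lag: resume from the last persisted offset.
    return {"start_offset": last_persisted_offset, "recovery_task": None}


print(plan_startup(1_000, 500_000, 300_000))
# prints {'start_offset': 500000, 'recovery_task': (1000, 500000)}
```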

defaultQueryTimeoutMs

indexerConfig:
  defaultQueryTimeoutMs: 6500

Timeout for searching an individual chunk. Should be set to some value below the serverConfig.requestTimeoutMs to ensure that post-processing can occur before reaching the overall request timeout.

readFromLocationOnStart

indexerConfig:
  readFromLocationOnStart: LATEST

Defines where to read from Kafka when initializing a new cluster.

EARLIEST

Use the oldest Kafka offset when initializing the cluster, so that all messages currently on Kafka are included.

LATEST

Use the latest Kafka offset when initializing the cluster. Indexing starts with new messages from the cluster initialization time onwards. See indexerConfig.createRecoveryTasksOnStart for an additional config parameter related to using LATEST.

createRecoveryTasksOnStart

indexerConfig:
  createRecoveryTasksOnStart: true

Defines if recovery tasks should be created when initializing a new cluster.

maxChunksOnDisk

indexerConfig:
  maxChunksOnDisk: 3

How many stale chunks (chunks no longer being written to) can remain on an indexer before being deleted. If the indexerConfig.staleDurationSecs limit is reached before this count is reached, the chunk will be removed.
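
Since staleDurationSecs and maxChunksOnDisk both bound stale chunks, a chunk is removed as soon as either limit applies. The sketch below illustrates that interaction under stated assumptions: chunks_to_delete is a hypothetical helper, chunks are represented only by their age in seconds, and removing the oldest chunks first when the count limit is exceeded is an assumption, not documented behavior.

```python
def chunks_to_delete(stale_ages_secs, stale_duration_secs, max_chunks_on_disk):
    """Return the ages of stale chunks that exceed either the age limit or the count limit."""
    youngest_first = sorted(stale_ages_secs)
    return [
        age
        for position, age in enumerate(youngest_first)
        # Beyond the allowed count, or older than the allowed duration.
        if position >= max_chunks_on_disk or age > stale_duration_secs
    ]


print(chunks_to_delete([600, 3000, 5000, 6500, 8000], 7200, 3))  # prints [6500, 8000]
print(chunks_to_delete([600, 9000], 7200, 3))  # prints [9000]
```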

serverConfig

indexerConfig:
  serverConfig:
    serverPort: 8081
    serverAddress: 10.0.100.1
    requestTimeoutMs: 7000

serverPort

Port used for application HTTP traffic.

serverAddress

Address at which this instance is accessible by other Astra components. Used for inter-node communication and is registered to Zookeeper.

requestTimeoutMs

Request timeout for all HTTP traffic after which the request is cancelled.

kafkaConfig

indexerConfig:
  kafkaConfig:
    kafkaTopic: test-topic
    kafkaTopicPartition: 0
    kafkaBootStrapServers: localhost:9092
    kafkaClientGroup: astra-test
    enableKafkaAutoCommit: true
    kafkaAutoCommitInterval: 5000
    kafkaSessionTimeout: 30000
    additionalProps: "{isolation.level: read_committed}"

kafkaTopic

Kafka topic to consume messages from.

kafkaBootStrapServers

Address of the Kafka server, or a comma-separated list of servers.

additionalProps

String value of JSON-encoded properties to set on the Kafka consumer. Any valid Kafka property can be used. See the Confluent Kafka Producer Configuration Reference.

s3Config

S3 configuration options common to indexer, recovery, cache, and manager nodes.

s3Config:
  s3AccessKey: access
  s3SecretKey: key
  s3Region: us-east-1
  s3EndPoint: localhost:9090
  s3Bucket: test-s3-bucket
  s3TargetThroughputGbps: 25

s3AccessKey

AWS access key. If both the access key and secret key are empty, the AWS default credentials provider is used.

s3SecretKey

AWS secret key. If both the access key and secret key are empty, the AWS default credentials provider is used.

s3Region

AWS region, e.g. us-east-1 or us-west-2.

s3EndPoint

S3 endpoint to use. If this setting is null or empty, the endpoint is not overridden and the default provided by the AWS client is used.

s3Bucket

AWS S3 bucket name.

s3TargetThroughputGbps

Throughput target in gigabits per second. This configuration controls how many concurrent connections will be established in the AWS CRT client. Recommended to be set to match the maximum bandwidth of the underlying host.

tracingConfig

tracingConfig:
  zipkinEndpoint: http://localhost:9411/api/v2/spans
  commonTags:
    clusterName: astra-local
    env: localhost
  samplingRate: 0.01

zipkinEndpoint

Full path to the Zipkin POST spans endpoint. Spans are submitted as a JSON array of span data.

commonTags

Optional common tags to annotate on all submitted Zipkin traces. Can be overwritten by spans at runtime, if keys collide.

samplingRate

Rate at which to sample Astra's traces. A value of 1.0 will send all traces, 0.01 will send 1% of traces, etc.
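
To make the meaning of the rate concrete, here is a minimal probabilistic sampling sketch. It only illustrates how a rate between 0.0 and 1.0 maps to a fraction of traces. The should_sample helper is hypothetical, and whether Astra samples this way internally is an assumption.

```python
import random


def should_sample(sampling_rate: float, rng: random.Random) -> bool:
    """Return True for roughly sampling_rate of all calls."""
    # rng.random() is in [0.0, 1.0), so 1.0 always samples and 0.0 never does.
    return rng.random() < sampling_rate


rng = random.Random(42)
sampled = sum(should_sample(0.01, rng) for _ in range(100_000))
print(sampled)  # close to 1000; the exact count depends on the seed
```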

queryConfig

Configuration options for the query node.

serverConfig

queryConfig:
  serverConfig:
    serverPort: 8081
    serverAddress: 10.0.100.2
    requestTimeoutMs: 60000

serverPort

Port used for application HTTP traffic.

serverAddress

Address at which this instance is accessible by other Astra components. Used for inter-node communication and is registered to Zookeeper.

requestTimeoutMs

Request timeout for all HTTP traffic after which the request is cancelled.

defaultQueryTimeout

queryConfig:
  defaultQueryTimeout: 55000

Query timeout for individual indexer and cache nodes when performing a query. This value should be set lower than queryConfig.serverConfig.requestTimeoutMs, and equal to or greater than both indexerConfig.serverConfig.requestTimeoutMs and cacheConfig.serverConfig.requestTimeoutMs.
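
The timeout ordering rules in this reference can be checked mechanically before deploying. The sketch below assumes the config has already been loaded into a nested dict keyed like the YAML. The timeout_problems helper is hypothetical, and the example dict reuses the example values shown throughout this reference.

```python
def timeout_problems(cfg: dict) -> list:
    """Return human-readable violations of the documented timeout ordering."""
    problems = []
    query = cfg["queryConfig"]
    query_timeout = query["defaultQueryTimeout"]
    if query_timeout >= query["serverConfig"]["requestTimeoutMs"]:
        problems.append(
            "queryConfig.defaultQueryTimeout must be lower than "
            "queryConfig.serverConfig.requestTimeoutMs"
        )
    for node in ("indexerConfig", "cacheConfig"):
        request_timeout = cfg[node]["serverConfig"]["requestTimeoutMs"]
        if query_timeout < request_timeout:
            problems.append(
                f"queryConfig.defaultQueryTimeout must be >= {node}.serverConfig.requestTimeoutMs"
            )
        if cfg[node]["defaultQueryTimeoutMs"] >= request_timeout:
            problems.append(
                f"{node}.defaultQueryTimeoutMs must be below {node}.serverConfig.requestTimeoutMs"
            )
    return problems


example = {
    "queryConfig": {"defaultQueryTimeout": 55000, "serverConfig": {"requestTimeoutMs": 60000}},
    "indexerConfig": {"defaultQueryTimeoutMs": 6500, "serverConfig": {"requestTimeoutMs": 7000}},
    "cacheConfig": {"defaultQueryTimeoutMs": 50000, "serverConfig": {"requestTimeoutMs": 55000}},
}
print(timeout_problems(example))  # prints []
```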

managerConnectString

queryConfig:
  managerConnectString: 10.0.100.2:8085

Host address for manager node, used for on-demand recovery requests.

metadataStoreConfig

metadataStoreConfig:
  zookeeperConfig:
    zkConnectString: localhost:2181
    zkPathPrefix: astra
    zkSessionTimeoutMs: 5000
    zkConnectionTimeoutMs: 500
    sleepBetweenRetriesMs: 100

zookeeperConfig

zkConnectString

Zookeeper connection string: a list of servers to connect to, or a common service discovery endpoint (e.g. a Consul endpoint).

zkPathPrefix

Common prefix to use for Astra data. Useful when using a common Zookeeper installation.

zkSessionTimeoutMs

Zookeeper session timeout in milliseconds.

zkConnectionTimeoutMs

Zookeeper connection timeout in milliseconds.

sleepBetweenRetriesMs

How long to wait between retries when attempting to reconnect a Zookeeper session. Retries continue for up to zkSessionTimeoutMs.

cacheConfig

Configuration options for the cache node.

slotsPerInstance

cacheConfig:
  slotsPerInstance: 200

Defines how many cache slots are registered per cache node. This should be set so that slotsPerInstance multiplied by indexerConfig.maxBytesPerChunk, plus a small buffer, is less than the total available space at the cacheConfig.dataDirectory path.
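
The sizing rule above is simple arithmetic, sketched here as a quick check. The slots_fit helper is hypothetical, and the 5% default buffer is an assumption chosen only for illustration; the reference does not specify how large the buffer should be.

```python
def slots_fit(slots_per_instance: int, max_bytes_per_chunk: int,
              available_bytes: int, buffer_percent: int = 5) -> bool:
    """Return True if all slots plus a buffer fit in the data directory."""
    required = slots_per_instance * max_bytes_per_chunk
    buffer = required * buffer_percent // 100
    return required + buffer < available_bytes


# 200 slots of 1000000 bytes need 200000000 bytes, plus a 10000000 byte buffer.
print(slots_fit(200, 1_000_000, 250_000_000))  # prints True
print(slots_fit(200, 1_000_000, 205_000_000))  # prints False
```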

replicaSet

cacheConfig:
  replicaSet: rep1

Unique identifier for this deployment of cache nodes. This setting, in combination with managerConfig.replicaCreationServiceConfig.replicaSets, managerConfig.replicaAssignmentServiceConfig.replicaSets, and managerConfig.replicaRestoreServiceConfig.replicaSets, allows running multiple deployments of cache nodes in a high availability deployment.

dataDirectory

cacheConfig:
  dataDirectory: /mnt/localdisk

Path of data directory to use. Generally recommended to be instance storage backed by NVMe disks, or a memory mapped storage like tmpfs for best performance.

defaultQueryTimeoutMs

cacheConfig:
  defaultQueryTimeoutMs: 50000

Timeout for searching an individual chunk. Should be set to some value below the serverConfig.requestTimeoutMs to ensure that post-processing can occur before reaching the overall request timeout.

serverConfig

cacheConfig:
  serverConfig:
    serverPort: 8081
    serverAddress: 10.0.100.3
    requestTimeoutMs: 55000

serverPort

Port used for application HTTP traffic.

serverAddress

Address at which this instance is accessible by other Astra components. Used for inter-node communication and is registered to Zookeeper.

requestTimeoutMs

Request timeout for all HTTP traffic after which the request is cancelled.

managerConfig

Configuration options for the manager node.

eventAggregationSecs

managerConfig:
  eventAggregationSecs: 10

Configures how long change events are batched before triggering an event executor. This helps improve performance in clusters with a large number of change events (chunk rollovers, pod turnover).

scheduleInitialDelayMins

managerConfig:
  scheduleInitialDelayMins: 1

How long to wait after manager startup before scheduled services start executing.

serverConfig

managerConfig:
  serverConfig:
    serverPort: 8081
    serverAddress: 10.0.100.10
    requestTimeoutMs: 30000

serverPort

Port used for application HTTP traffic.

serverAddress

Address at which this instance is accessible by other Astra components. Used for inter-node communication and is registered to Zookeeper.

requestTimeoutMs

Request timeout for all HTTP traffic after which the request is cancelled.

replicaCreationServiceConfig

managerConfig:
  replicaCreationServiceConfig:
    schedulePeriodMins: 15
    replicaLifespanMins: 1440
    replicaSets: [rep1]

Configuration options controlling replica creation after a chunk is uploaded from an indexer.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

replicaLifespanMins

How long a replica associated with a chunk will exist before expiring.

replicaSets

Array of cache replica set identifiers (cacheConfig.replicaSet) for this task to operate on.

replicaAssignmentServiceConfig

managerConfig:
  replicaAssignmentServiceConfig:
    schedulePeriodMins: 15
    replicaSets: [rep1]
    maxConcurrentPerNode: 2

Configuration options controlling replica assignments to available cache nodes.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

replicaSets

Array of cache replica set identifiers (cacheConfig.replicaSet) for this task to operate on.

maxConcurrentPerNode

Controls how many assignments will execute concurrently per node. Setting this to a low value allows replicas to become available more quickly, as they do not compete for download bandwidth, and allows newly created replicas of higher priority to be downloaded before a long list of lower priority replicas.

replicaEvictionServiceConfig

managerConfig:
  replicaEvictionServiceConfig:
    schedulePeriodMins: 15

Configuration options controlling replica evictions from cache nodes due to expiration.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

replicaDeletionServiceConfig

managerConfig:
  replicaDeletionServiceConfig:
    schedulePeriodMins: 15

Configuration options controlling replica deletion once expired.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

recoveryTaskAssignmentServiceConfig

managerConfig:
  recoveryTaskAssignmentServiceConfig:
    schedulePeriodMins: 15

Configuration options controlling recovery task assignments to recovery nodes.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

snapshotDeletionServiceConfig

managerConfig:
  snapshotDeletionServiceConfig:
    schedulePeriodMins: 15
    snapshotLifespanMins: 10080

Configuration options controlling snapshot deletion once expired.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

snapshotLifespanMins

Configures how long a snapshot can exist before being deleted from S3. This must be set to a value larger than managerConfig.replicaCreationServiceConfig.replicaLifespanMins. When this is larger than replicaLifespanMins, it enables restoring replicas from cold storage (see managerConfig.replicaRestoreServiceConfig).
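
As a worked example of the relationship between the two lifespans, the sketch below computes how long a snapshot remains in S3 after its replica has expired. It assumes, for illustration only, that both lifespans are measured from roughly the same starting point. The cold_storage_window_mins helper is hypothetical.

```python
def cold_storage_window_mins(snapshot_lifespan_mins: int, replica_lifespan_mins: int) -> int:
    """Return how many minutes a snapshot outlives its replica."""
    if snapshot_lifespan_mins <= replica_lifespan_mins:
        raise ValueError("snapshotLifespanMins must be larger than replicaLifespanMins")
    return snapshot_lifespan_mins - replica_lifespan_mins


# 10080 minutes (7 days) of snapshots with 1440 minutes (1 day) of replicas.
print(cold_storage_window_mins(10080, 1440))  # prints 8640
```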

replicaRestoreServiceConfig

managerConfig:
  replicaRestoreServiceConfig:
    schedulePeriodMins: 15
    maxReplicasPerRequest: 200
    replicaLifespanMins: 60
    replicaSets: [rep1]

Configuration options controlling on-demand restores for existing snapshots that do not have corresponding replicas.

schedulePeriodMins

How frequently this task is scheduled to execute. If the time to complete the scheduled task exceeds the period, the previous invocation will be cancelled and restarted at the scheduled time.

maxReplicasPerRequest

Maximum allowable replicas to be restored in a single request. When a request exceeds this value, an error will be returned to the user.

replicaLifespanMins

How long the restored replica will exist before expiring.

See managerConfig.replicaCreationServiceConfig.replicaLifespanMins.

replicaSets

Array of cache replica set identifiers (cacheConfig.replicaSet) for this task to operate on.

clusterConfig

Cluster configuration options common to all node types.

clusterConfig:
  clusterName: astra_local
  env: local

clusterName

Unique name assigned to this cluster. Should be identical for all node types in the cluster, and is used for metrics instrumentation.

env

Environment string for this cluster. Should be identical for all node types deployed to a single environment, and is used for metrics instrumentation.

recoveryConfig

Configuration options for the recovery indexer node.

serverConfig

recoveryConfig:
  serverConfig:
    serverPort: 8081
    serverAddress: 10.0.100.4
    requestTimeoutMs: 10000

serverPort

Port used for application HTTP traffic.

serverAddress

Address at which this instance is accessible by other Astra components. Used for inter-node communication and is registered to Zookeeper.

requestTimeoutMs

Request timeout for all HTTP traffic after which the request is cancelled.

kafkaConfig

kafkaConfig:
  kafkaTopic: test-topic
  kafkaTopicPartition: 0
  kafkaBootStrapServers: localhost:9092
  kafkaClientGroup: astra-test
  enableKafkaAutoCommit: true
  kafkaAutoCommitInterval: 5000
  kafkaSessionTimeout: 30000
  additionalProps: "{isolation.level: read_committed}"

kafkaTopic

Kafka topic to consume messages from.

kafkaBootStrapServers

Address of the Kafka server, or a comma-separated list of servers.

additionalProps

String value of JSON-encoded properties to set on the Kafka consumer. Any valid Kafka property can be used. See the Confluent Kafka Producer Configuration Reference.

preprocessorConfig

Configuration options for the preprocessor node.

kafkaConfig

preprocessorConfig:
  kafkaConfig:
    kafkaTopic: test-topic
    kafkaBootStrapServers: localhost:9092
    additionalProps: "{max.block.ms: 28500, linger.ms: 10, batch.size: 512000, buffer.memory: 1024000000, compression.type: snappy}"

kafkaTopic

Kafka topic to produce messages to.

kafkaBootStrapServers

Address of the Kafka server, or a comma-separated list of servers.

additionalProps

String value of JSON-encoded properties to set on the Kafka producer. Any valid Kafka property can be used. See the Confluent Kafka Producer Configuration Reference.

schemaFile

preprocessorConfig:
  schemaFile: schema.yaml

For valid formatting options refer to Schema documentation.

serverConfig

preprocessorConfig:
  serverConfig:
    serverPort: 8081
    serverAddress: 10.0.100.5
    requestTimeoutMs: 55000

serverPort

Port used for application HTTP traffic.

serverAddress

Address at which this instance is accessible by other Astra components. Used for inter-node communication and is registered to Zookeeper.

requestTimeoutMs

Request timeout for all HTTP traffic after which the request is cancelled.

preprocessorInstanceCount

preprocessorConfig:
  preprocessorInstanceCount: 2

Indicates how many instances of the preprocessor are currently deployed. Used to scale the rate limiters so that each preprocessor instance allows the total rate limit divided by the preprocessor instance count through before rate limiting is applied.
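
The per-instance share is a plain division, sketched below. The per_instance_rate_limit helper is hypothetical, and the total of 1000000 is an arbitrary illustrative figure since this reference does not state the unit of the rate limit.

```python
def per_instance_rate_limit(total_rate_limit: float, preprocessor_instance_count: int) -> float:
    """Return the share of the total rate limit each preprocessor instance lets through."""
    if preprocessor_instance_count < 1:
        raise ValueError("preprocessorInstanceCount must be at least 1")
    return total_rate_limit / preprocessor_instance_count


# Two deployed instances each allow half of the total.
print(per_instance_rate_limit(1_000_000, 2))  # prints 500000.0
```

If the configured count is lower than the number of instances actually running, the cluster as a whole would admit more than the total limit, so this value should be kept in step with the deployment.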

rateLimiterMaxBurstSeconds

preprocessorConfig:
  rateLimiterMaxBurstSeconds: 1

Defines how many seconds' worth of unused rate limiter permits can accumulate before the stored total stops increasing.
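
The effect of the burst setting can be illustrated with a simplified permit model: while a limiter sits idle it banks unused permits, but only up to the configured number of seconds. This is a sketch under that assumed model, not a description of Astra's rate limiter internals. The accumulated_permits helper and its parameters are hypothetical.

```python
def accumulated_permits(rate_per_second: float, idle_seconds: float, max_burst_seconds: float) -> float:
    """Return the unused permits banked after an idle period, capped by the burst window."""
    return rate_per_second * min(idle_seconds, max_burst_seconds)


# After 10 idle seconds with a 1 second burst window, only 1 second of permits is banked.
print(accumulated_permits(500_000, 10, 1))  # prints 500000
```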

rateLimitExceededErrorCode

preprocessorConfig:
  rateLimitExceededErrorCode: 400

Error code to return when the rate limit of the preprocessor is exceeded. If using OpenSearch Data Prepper, a return code of 400 or 404 would mark the request as unable to be retried, and it would be sent to the dead letter queue.

Last modified: 03 September 2024