ArangoDB v3.13 is under development and not released yet. This documentation is not final and potentially incomplete.

arangorestore Options

The startup options of the arangorestore executable

Usage: arangorestore [<options>]
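
As a rough illustration of the usage line above, the following Python sketch assembles an argument list from options documented on this page. The helper name `build_restore_command`, the example path, and the choice of options are assumptions made for illustration only.

```python
def build_restore_command(input_directory, database,
                          endpoint="tcp://127.0.0.1:8529",
                          threads=None, create_database=False):
    """Assemble an arangorestore argument list (illustrative helper)."""
    argv = [
        "arangorestore",
        "--server.endpoint", endpoint,
        "--server.database", database,
        "--input-directory", input_directory,
    ]
    if create_database:
        # Boolean options also accept an explicit value.
        argv += ["--create-database", "true"]
    if threads is not None:
        argv += ["--threads", str(threads)]
    return argv


# "/backups/dump" and "shop" are hypothetical example values.
command = build_restore_command("/backups/dump", "shop", create_database=True)
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a machine where arangorestore is installed.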

General

`--all-databases`

Type: boolean

Restore the data of all databases.

This option can be specified without a value to enable it.

`--batch-size`

Type: uint64

The maximum size for individual data batches (in bytes).

Default: 8388608

`--check-configuration`

Type: boolean

Check the configuration and exit.

This is a command, no value needs to be specified. The process terminates after executing the command.

`--cleanup-duplicate-attributes`

Type: boolean

Clean up duplicate attributes (use first specified value) in input documents instead of making the restore operation fail.

This option can be specified without a value to enable it.

`--collection`

Type: string…

Restrict the restore to this collection name (can be specified multiple times).

`--compress-request-threshold`

Introduced in: v3.12.0

Type: uint64

The HTTP request body size from which on requests are transparently compressed when sending them to the server.

Automatically compress outgoing HTTP requests with the deflate compression format. Compression only happens for HTTP/1.1 and HTTP/2 connections, and only if the uncompressed request body is larger than the threshold set by this startup option and the compressed body is smaller than the original. A value of 0 disables automatic request compression.
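
The decision rule described above can be sketched in Python. This is only an illustration under assumptions: the function name `maybe_compress`, the `http_version` strings, and the use of `zlib` to stand in for the deflate format are not taken from arangorestore's implementation.

```python
import zlib


def maybe_compress(body: bytes, threshold: int, http_version: str = "1.1"):
    """Return (payload, was_compressed) following the documented rule."""
    # A threshold of 0 disables automatic request compression.
    if threshold == 0:
        return body, False
    # Compression only applies to HTTP/1.1 and HTTP/2 connections.
    if http_version not in ("1.1", "2"):
        return body, False
    # Only bodies larger than the threshold are candidates.
    if len(body) <= threshold:
        return body, False
    compressed = zlib.compress(body)
    # Keep the compressed form only if it is actually smaller.
    if len(compressed) < len(body):
        return compressed, True
    return body, False
```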

`--compress-transfer`

Introduced in: v3.12.0

Type: boolean

Compress data for transport between arangorestore and server.

This option can be specified without a value to enable it.

This option enables transport compression for data received by an ArangoDB server.

`--config`

Type: string

The configuration file or “none”.

`--configuration`

Type: string

The configuration file or “none”.

`--continue`

Type: boolean

Continue the restore operation.

This option can be specified without a value to enable it.

`--create-collection`

Type: boolean

Create collection structure.

This option can be specified without a value to enable it.

Default: true

`--create-database`

Type: boolean

Create the target database if it does not exist.

This option can be specified without a value to enable it.

`--default-number-of-shards`

Deprecated in: v3.3.22, v3.4.2

Type: uint64

The default numberOfShards value if not specified in the dump.

Default: 1

`--default-replication-factor`

Deprecated in: v3.3.22, v3.4.2

Type: uint64

The default replicationFactor value if not specified in the dump.

Default: 1

`--define`

Type: string…

Define a value for a @key@ entry in the configuration file using the syntax "key=value".

`--descriptors-minimum`

Introduced in: v3.12.0

Type: uint64

The minimum number of file descriptors needed to start (0 = no minimum).

Default: 8192

`--dump-dependencies`

Type: boolean

Dump the dependency graph of the feature phases (internal) and exit.

This is a command, no value needs to be specified. The process terminates after executing the command.

`--dump-options`

Type: boolean

Dump all available startup options in JSON format and exit.

This is a command, no value needs to be specified. The process terminates after executing the command.

`--enable-revision-trees`

Introduced in: v3.8.7

Type: boolean

Enable revision trees for new collections if the collection attributes syncByRevision and usesRevisionsAsDocumentIds are missing.

This option can be specified without a value to enable it.

Default: true

`--force`

Type: boolean

Continue the restore even in the face of some server-side errors.

This option can be specified without a value to enable it.

`--force-same-database`

Type: boolean

Force the same database name as in the source dump.json file.

This option can be specified without a value to enable it.

`--honor-nsswitch`

Type: boolean

Allow hostname lookup configuration via /etc/nsswitch.conf if on Linux/glibc.

This option can be specified without a value to enable it.

`--ignore-distribute-shards-like-errors`

Type: boolean

Continue the restore even if the sharding prototype collection is missing.

This option can be specified without a value to enable it.

`--import-data`

Type: boolean

Import data into collection.

This option can be specified without a value to enable it.

Default: true

`--include-system-collections`

Type: boolean

Include system collections.

This option can be specified without a value to enable it.

`--initial-connect-retries`

Introduced in: v3.7.13, v3.8.1

Type: uint32

The number of connect retries for the initial connection.

Default: 3

`--input-directory`

Type: string

The input directory.

Default: /dump

`--log`

Deprecated in: v3.5.0

Type: string…

Set the topic-specific log level, using --log level for the general topic or --log topic=level for the specified topic (can be specified multiple times). Available log levels: fatal, error, warning, info, debug, trace.

Default: info

`--max-unused-buffers-capacity`

Introduced in: v3.12.0

Type: uint64

Maximum cumulated size of spare in-memory buffers to keep.

Default: 536870912

Maximum cumulated size of in-memory buffers to keep around for sending batches. A value > 0 will increase the memory usage of arangorestore, but can help in avoiding repeated memory allocations for building new in-memory buffers.

`--number-of-shards`

Type: string…

Override the numberOfShards value (can be specified multiple times, e.g. --number-of-shards 2 --number-of-shards myCollection=3).
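
The value syntax shown in the example (a bare number, or `collection=number`) can be read as follows. This Python sketch is an assumption-laden illustration, not arangorestore's parser: the helper names are hypothetical, and it assumes that a collection-specific value takes precedence over the bare value, which in turn takes precedence over the value stored in the dump.

```python
def parse_overrides(values):
    """Split repeated option values into a default and per-collection overrides."""
    default = None
    per_collection = {}
    for value in values:
        name, sep, number = value.rpartition("=")
        if sep:
            per_collection[name] = int(number)   # e.g. "myCollection=3"
        else:
            default = int(value)                 # e.g. "2"
    return default, per_collection


def effective_value(collection, default, per_collection, from_dump):
    """Pick the value for one collection (assumed precedence, see above)."""
    if collection in per_collection:
        return per_collection[collection]
    return default if default is not None else from_dump
```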

`--overwrite`

Type: boolean

Overwrite collections if they exist.

This option can be specified without a value to enable it.

Default: true

`--progress`

Type: boolean

Show the progress.

This option can be specified without a value to enable it.

Default: true

`--replication-factor`

Type: string…

Override the replicationFactor value (can be specified multiple times, e.g. --replication-factor 2 --replication-factor myCollection=3).

`--threads`

Type: uint32

The maximum number of collections to process in parallel.

Default: dynamic (e.g. 8)

`--use-splice-syscall`

Introduced in: v3.9.4

Type: boolean

Use the splice() syscall for file copying (may not be supported on all filesystems).

This option can be specified without a value to enable it.

Default: true

While the syscall is generally available since Linux 2.6.x, it is also required that the underlying filesystem supports the splice operation. This is not true for some encrypted filesystems (e.g. ecryptfs), on which splice() calls can fail.

You can set the --use-splice-syscall startup option to false to use a less efficient, but more portable file copying method instead, which should work on all filesystems.

`--version`

Type: boolean

Print the version and other related information, then exit.

This is a command, no value needs to be specified. The process terminates after executing the command.

`--version-json`

Introduced in: v3.9.0

Type: boolean

Print the version and other related information in JSON format, then exit.

This is a command, no value needs to be specified. The process terminates after executing the command.

`--view`

Type: string…

Restrict the restore to this view name (can be specified multiple times).

`--write-concern`

Introduced in: v3.12.0

Type: string…

Override the writeConcern value (can be specified multiple times, e.g. --write-concern 2 --write-concern myCollection=3).

encryption

`--encryption.key-generator`

Type: string

A program providing the encryption key on stdout. If set, encryption at rest is enabled.

The program must output 32 bytes of data on the standard output and exit.
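
A key generator can be any executable that meets this contract. The following Python sketch shows one possible shape, assuming the key is already stored in a file whose path is passed to `main`; the key source and the function names are hypothetical, and a real deployment would more likely fetch the key from a secret store.

```python
import sys

KEY_LENGTH = 32  # key size in bytes, as stated above


def validated_key(key: bytes) -> bytes:
    """Return the key unchanged, or raise if it is not exactly 32 bytes."""
    if len(key) != KEY_LENGTH:
        raise ValueError(f"expected {KEY_LENGTH} bytes, got {len(key)}")
    return key


def main(key_path: str) -> None:
    # Hypothetical key source: a file that already holds the raw key.
    with open(key_path, "rb") as handle:
        key = validated_key(handle.read())
    # Write raw bytes; sys.stdout.buffer avoids text encoding and newlines.
    sys.stdout.buffer.write(key)
    sys.stdout.buffer.flush()
```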

`--encryption.keyfile`

Type: string

The path to the file that contains the encryption key. Must contain 32 bytes of data. If set, encryption at rest is enabled.

You must secure the encryption key file so that only arangodump, arangorestore, and arangod can access it. You should also ensure that the file is not readable if someone steals your hardware, for example, by encrypting /mytmpfs or creating an in-memory file-system under /mytmpfs.

log

`--log.color`

Type: boolean

Use colors for TTY logging.

This option can be specified without a value to enable it.

Default: dynamic (e.g. true)

`--log.escape-control-chars`

Introduced in: v3.9.0

Type: boolean

Escape control characters in log messages.

This option can be specified without a value to enable it.

Default: true

This option applies to control characters that have hex codes below \x20, and also to the DEL character with hex code \x7f.

If you set this option to false, control characters are retained when they have a visible representation, and replaced with a space character if they do not. For example, the control character \n is visible, so a \n is displayed in the log. In contrast, the control character BEL is not visible, so a space is displayed instead.

If you set this option to true, the hex code for the character is displayed, for example, the BEL character is displayed as \x07.

The default value for this option is true to ensure compatibility with previous versions.

A side effect of turning off the escaping is that it reduces the CPU overhead for the logging. However, this is only noticeable if logging is set to a very verbose level (e.g. debug or trace).

`--log.escape-unicode-chars`

Introduced in: v3.9.0

Type: boolean

Escape Unicode characters in log messages.

This option can be specified without a value to enable it.

If you set this option to false, Unicode characters are retained and written to the log as-is. For example, 犬 is logged as 犬.

If you set this option to true, any Unicode characters are escaped, and the hex codes for all Unicode characters are logged instead. For example, 犬 is logged as \u72AC.

The default value for this option is set to false for compatibility with previous versions.

A side effect of turning off the escaping is that it reduces the CPU overhead for the logging. However, this is only noticeable if logging is set to a very verbose level (e.g. debug or trace).
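
The escaped form can be illustrated with a short Python sketch. It reproduces the \u72AC example for characters in the Basic Multilingual Plane; the handling of characters above U+FFFF as UTF-16 surrogate pairs is an assumption, as is the function name.

```python
def escape_unicode(message: str) -> str:
    """Replace non-ASCII characters with \\uXXXX escapes (illustrative)."""
    parts = []
    for ch in message:
        code = ord(ch)
        if code < 0x80:
            parts.append(ch)                      # ASCII stays as-is
        elif code <= 0xFFFF:
            parts.append(f"\\u{code:04X}")        # e.g. 犬 -> \u72AC
        else:
            # Assumption: characters outside the BMP become a surrogate pair.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            parts.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(parts)
```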

`--log.file`

Type: string

Shortcut for --log.output file://<filename>

Default: -

`--log.file-group`

Type: string

The group to use for a new log file. The user must be a member of this group.

`--log.file-mode`

Type: string

The mode to use for a new log file. The umask is applied as well.

`--log.force-direct`

Type: boolean

Do not start a separate thread for logging.

This option can be specified without a value to enable it.

You can use this option to disable logging in an extra logging thread. If set to true, any log messages are immediately printed in the thread that triggered the log message. This is non-optimal for performance but can aid debugging. If set to false, log messages are handed off to an extra logging thread, which asynchronously writes the log messages.

`--log.foreground-tty`

Type: boolean

Also log to TTY if backgrounded.

This option can be specified without a value to enable it.

`--log.hostname`

Introduced in: v3.8.0

Type: string

The hostname to use in log messages. Leave empty for none, use “auto” to automatically determine a hostname.

You can specify a hostname to be logged at the beginning of each log message (for regular logging) or inside the hostname attribute (for JSON-based logging).

The default value is an empty string, meaning no hostname is logged. If you set this option to auto, the hostname is automatically determined.

`--log.ids`

Type: boolean

Log unique message IDs.

This option can be specified without a value to enable it.

Default: true

Each log invocation in the ArangoDB source code contains a unique log ID, which can be used to quickly find the location in the source code that produced a specific log message.

Log IDs are printed as 5-digit hexadecimal identifiers in square brackets between the log level and the log topic:

2020-06-22T21:16:48Z [39028] INFO [144fe] {general} using storage engine 'rocksdb' (where 144fe is the log ID).
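
To pull the log ID out of such a line, a regular expression is usually enough. The Python sketch below assumes the exact layout of the sample line above (timestamp, PID, level, ID, topic, message); other logging options change the layout, so the pattern and the function name are illustrative only.

```python
import re

# Matches lines shaped like the sample above; other layouts are not covered.
LOG_LINE = re.compile(
    r"^(?P<time>\S+) \[(?P<pid>\d+)\] (?P<level>[A-Z]+) "
    r"\[(?P<id>[0-9a-f]{5})\] \{(?P<topic>[^}]+)\} (?P<message>.*)$"
)


def extract_log_id(line: str):
    """Return the 5-digit hexadecimal log ID, or None if the line does not match."""
    match = LOG_LINE.match(line)
    return match.group("id") if match else None


sample = ("2020-06-22T21:16:48Z [39028] INFO [144fe] {general} "
          "using storage engine 'rocksdb'")
```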

`--log.level`

Type: string…

Set the topic-specific log level, using --log.level level for the general topic or --log.level topic=level for the specified topic (can be specified multiple times).

Available log levels: fatal, error, warning, info, debug, trace.

Available log topics: all, agency, agencycomm, agencystore, aql, audit-authentication, audit-authorization, audit-collection, audit-database, audit-document, audit-hotbackup, audit-service, audit-view, authentication, authorization, backup, bench, cache, cluster, communication, config, crash, deprecation, development, dump, engines, flush, general, graphs, heartbeat, httpclient, license, maintenance, memory, queries, rep-state, rep-wal, replication, replication2, requests, restore, rocksdb, security, ssl, startup, statistics, supervision, syscall, threads, trx, ttl, v8, validation, views.

Default: info

ArangoDB’s log output is grouped by topics. --log.level can be specified multiple times at startup, for as many topics as needed. The log verbosity and output files can be adjusted per log topic.

arangod --log.level all=warning --log.level queries=trace --log.level startup=trace

This sets a global log level of warning and two topic-specific levels (trace for both queries and startup). Note that --log.level warning does not set a log level globally for all existing topics, but only for the general topic. Use the pseudo-topic all to set a global log level.

The same in a configuration file:

[log]
level = all=warning
level = queries=trace
level = startup=trace

The available log levels are:

fatal: Only log fatal errors.
error: Only log errors.
warning: Only log warnings and errors.
info: Log information messages, warnings, and errors.
debug: Log debug and information messages, warnings, and errors.
trace: Log trace, debug, and information messages, warnings, and errors.

Note that the debug and trace levels are very verbose.
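
The interplay of the all pseudo-topic, bare levels, and per-topic levels can be sketched in Python. This is a simplified model, not the server's implementation: the function names are hypothetical, values are assumed to be applied in the order given, and every topic is assumed to start at the default level info.

```python
# Levels ordered from least to most verbose, as in the list above.
LEVELS = ["fatal", "error", "warning", "info", "debug", "trace"]


def resolve_levels(specs, topics):
    """Apply --log.level values in order and return the level per topic."""
    levels = {topic: "info" for topic in topics}
    for spec in specs:
        topic, sep, level = spec.partition("=")
        if not sep:
            # A bare level only affects the general topic.
            topic, level = "general", spec
        if level not in LEVELS:
            raise ValueError(f"unknown log level: {level}")
        if topic == "all":
            for name in levels:
                levels[name] = level
        else:
            levels[topic] = level
    return levels


def is_logged(message_level, topic_level):
    """A message is emitted if it is at most as verbose as the topic level."""
    return LEVELS.index(message_level) <= LEVELS.index(topic_level)
```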

Some relevant log topics available in ArangoDB 3 are:

agency: Information about the cluster Agency.
performance: Performance-related messages.
queries: Executed AQL queries, slow queries.
replication: Replication-related information.
requests: HTTP requests.
startup: Information about server startup and shutdown.
threads: Information about threads.

You can adjust the log levels at runtime via the PUT /_admin/log/level HTTP API endpoint.

Audit logging (Enterprise Edition): The server logs all audit events by default. Low priority events, such as statistics operations, are logged with the debug log level. To keep such events from cluttering the log, set the appropriate log topics to the info log level.

`--log.line-number`

Type: boolean

Include the function name, file name, and line number of the source code that issues the log message. Format: [func@FileName.cpp:123]

This option can be specified without a value to enable it.

`--log.max-entry-length`

Type: uint32

The maximum length of a log entry (in bytes).

Default: 134217728

Note: This option does not include audit log messages. See --audit.max-entry-length instead.

Any log messages longer than the specified value are truncated and the suffix ... is added to them.

The purpose of this option is to shorten long log messages in case there is not a lot of space for log files, and to keep rogue log messages from overusing resources.

The default value is 128 MB, which is very high and effectively preserves backward compatibility with previous arangod versions, which did not restrict the maximum size of log messages.
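
The truncation behavior can be pictured with a small Python sketch. It assumes that the limit applies to the UTF-8 encoded message before the ... suffix is appended, and that a multi-byte character cut in half is dropped; neither detail is confirmed by this page, and the function name is hypothetical.

```python
def truncate_entry(message: str, max_length: int) -> str:
    """Cut a message to max_length bytes and append '...' if it was cut."""
    data = message.encode("utf-8")
    if len(data) <= max_length:
        return message
    # errors="ignore" drops a trailing, partially cut multi-byte character.
    kept = data[:max_length].decode("utf-8", errors="ignore")
    return kept + "..."
```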

`--log.max-queued-entries`

Introduced in: v3.10.12, v3.11.5, v3.12.0

Type: uint32

Upper limit of log entries that are queued in a background thread.

Default: 16384

Log entries are pushed on a queue for asynchronous writing unless you enable the --log.force-direct startup option. If you use a slow log output (e.g. syslog), the queue might grow and eventually overflow.

You can configure the upper bound of the queue with this option. If the queue is full, log entries are written synchronously until the queue has space again.
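
The queueing behavior can be modeled with a bounded queue that falls back to a synchronous write when it is full. The Python sketch below is a single-threaded model under assumptions: the class name is hypothetical, and `drain` stands in for the background logging thread.

```python
import queue


class BoundedLogger:
    """Queue log entries up to a limit; write synchronously when full."""

    def __init__(self, max_queued_entries, sink):
        self._queue = queue.Queue(maxsize=max_queued_entries)
        self._sink = sink  # callable that writes one entry to the output

    def log(self, entry):
        try:
            self._queue.put_nowait(entry)
            return "queued"
        except queue.Full:
            # Queue is full: write directly in the calling thread.
            self._sink(entry)
            return "synchronous"

    def drain(self):
        # Stand-in for the background thread emptying the queue.
        while True:
            try:
                entry = self._queue.get_nowait()
            except queue.Empty:
                break
            self._sink(entry)
```

In this simplified model, a synchronously written entry can reach the output before entries that are still queued; whether the real implementation preserves ordering is not stated here.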

`--log.output`

Type: string…

Log destination(s), e.g. file:///path/to/file (any occurrence of $PID is replaced with the process ID).

This option allows you to direct the global or per-topic log messages to different outputs. The output definition can be one of the following:

- for stdout
+ for stderr
syslog://<syslog-facility>
syslog://<syslog-facility>/<application-name>
file://<relative-or-absolute-path>

To set up a per-topic output configuration, use --log.output <topic>=<definition>:

--log.output queries=file://queries.log

The above example logs query-related messages to the file queries.log.

You can specify the option multiple times in order to configure the output for different log topics:

--log.level queries=trace --log.output queries=file:///queries.log --log.level requests=info --log.output requests=file:///requests.log

The above example logs all query-related messages to the file queries.log and HTTP requests with a level of info or higher to the file requests.log.

Any occurrence of $PID in the log output value is replaced at runtime with the actual process ID. This enables logging to process-specific files:

--log.output 'file://arangod.log.$PID'

Note that the dollar sign may need extra escaping when specified on a command line, for example in Bash.

If you specify --log.file-mode <octalvalue>, then any newly created log file uses octalvalue as file mode. Please note that the umask value is applied as well.

If you specify --log.file-group <name>, then any newly created log file tries to use <name> as the group name. Note that you have to be a member of that group. Otherwise, the group ownership is not changed.

The old --log.file option is still available for convenience. It is a shortcut for the more general option --log.output file://filename.

The old --log.requests-file option is still available. It is a shortcut for the more general option --log.output requests=file://....

To change the log levels for the specified output you can add a comma separated list of topics with their respective level after the output definition, separated by a semicolon: --log.output file:///path/to/file;queries=trace,requests=info --log.output -;all=error
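
The pieces of an output definition (optional topic, destination, optional per-output levels, and the $PID placeholder) can be taken apart as in the following Python sketch. It is an illustrative reading of the syntax described in this section, not the actual parser; the function name is hypothetical, and destinations containing = or ; are not handled.

```python
import os


def parse_log_output(value, pid=None):
    """Split a --log.output value into (topic, destination, levels)."""
    pid = os.getpid() if pid is None else pid
    # Per-output levels follow the definition after a semicolon.
    head, _, tail = value.partition(";")
    topic = None
    if "=" in head:
        # "<topic>=<definition>" selects a per-topic output.
        topic, head = head.split("=", 1)
    destination = head.replace("$PID", str(pid))
    levels = {}
    if tail:
        for item in tail.split(","):
            name, _, level = item.partition("=")
            levels[name] = level
    return topic, destination, levels
```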

`--log.performance`

Deprecated in: v3.5.0

Type: boolean

Shortcut for --log.level performance=trace.

This option can be specified without a value to enable it.

`--log.prefix`

Type: string

Prefix log message with this string.

Example: arangod ... --log.prefix "-->"

2020-07-23T09:46:03Z --> [17493] INFO ...

`--log.process`

Introduced in: v3.8.0

Type: boolean

Show the process identifier (PID) in log messages.

This option can be specified without a value to enable it.

Default: true

`--log.request-parameters`

Type: boolean

Include full URLs and HTTP request parameters in trace logs.

This option can be specified without a value to enable it.

Default: true

`--log.role`

Type: boolean

Log the server role.

This option can be specified without a value to enable it.

If you set this option to true, log messages contain a single character with the server’s role. The roles are:

U: Undefined / unclear (used at startup)
S: Single server
C: Coordinator
P: Primary / DB-Server
A: Agent

`--log.shorten-filenames`

Type: boolean

Shorten filenames in log output (use with --log.line-number).

This option can be specified without a value to enable it.

Default: true

`--log.structured-param`

Introduced in: v3.10.0

Type: string…

Toggle the usage of the log category parameter in structured log messages.

Some log messages can be displayed together with additional information in a structured form. The following parameters are available:

database: The name of the database.
username: The name of the user.
queryid: The ID of the AQL query (on DB-Servers only).
url: The endpoint path.

The format to enable or disable a parameter is <parameter>=<bool>, or <parameter> to enable it. You can specify the option multiple times to configure multiple parameters:

arangod --log.structured-param database=true --log.structured-param url --log.structured-param username=false

You can adjust the parameter settings at runtime using the /_admin/log/structured HTTP API.

`--log.thread`

Type: boolean

Show the thread identifier in log messages.

This option can be specified without a value to enable it.

Default: true

`--log.thread-name`

Type: boolean

Show thread name in log messages.

This option can be specified without a value to enable it.

`--log.time-format`

Type: string

The time format to use in logs.

Default: utc-datestring-micros

Possible values: “local-datestring”, “timestamp”, “timestamp-micros”, “timestamp-millis”, “uptime”, “uptime-micros”, “uptime-millis”, “utc-datestring”, “utc-datestring-micros”, “utc-datestring-millis”

Overview of the different formats:

Format	Example	Description
`timestamp`	1553766923	Unix timestamps, in seconds
`timestamp-millis`	1553766923.123	Unix timestamps, in seconds, with millisecond precision
`timestamp-micros`	1553766923.123456	Unix timestamps, in seconds, with microsecond precision
`uptime`	987654	seconds since server start
`uptime-millis`	987654.123	seconds since server start, with millisecond precision
`uptime-micros`	987654.123456	seconds since server start, with microsecond precision
`utc-datestring`	2019-03-28T09:55:23Z	UTC-based date and time in format YYYY-MM-DDTHH:MM:SSZ
`utc-datestring-millis`	2019-03-28T09:55:23.123Z	like `utc-datestring`, but with millisecond precision
`local-datestring`	2019-03-28T10:55:23	local date and time in format YYYY-MM-DDTHH:MM:SS
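
How the timestamp and UTC formats relate to one another can be shown with a short Python sketch that renders a single moment in several of them. The function name is hypothetical, the uptime and local formats are left out because they depend on the server start time and the local time zone, and the shape of utc-datestring-micros is an assumption made by analogy with the millisecond variant.

```python
import calendar
from datetime import datetime, timezone


def format_log_time(moment: datetime, fmt: str) -> str:
    """Render a timezone-aware datetime in one of the UTC-based formats."""
    moment = moment.astimezone(timezone.utc)
    seconds = calendar.timegm(moment.utctimetuple())  # whole Unix seconds
    micros = moment.microsecond
    date = moment.strftime("%Y-%m-%dT%H:%M:%S")
    if fmt == "timestamp":
        return str(seconds)
    if fmt == "timestamp-millis":
        return f"{seconds}.{micros // 1000:03d}"
    if fmt == "timestamp-micros":
        return f"{seconds}.{micros:06d}"
    if fmt == "utc-datestring":
        return date + "Z"
    if fmt == "utc-datestring-millis":
        return f"{date}.{micros // 1000:03d}Z"
    if fmt == "utc-datestring-micros":
        # Assumed shape, by analogy with utc-datestring-millis.
        return f"{date}.{micros:06d}Z"
    raise ValueError(f"format not covered by this sketch: {fmt}")


moment = datetime(2019, 3, 28, 9, 55, 23, 123456, tzinfo=timezone.utc)
```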

`--log.use-json-format`

Introduced in: v3.8.0

Type: boolean

Use JSON as output format for logging.

This option can be specified without a value to enable it.

You can use this option to switch the log output to the JSON format. Each log message then produces a separate line with JSON-encoded log data, which can be consumed by other applications.

The object attributes produced for each log message are:

Key	Value
`time`	date/time of log message, in format specified by `--log.time-format`
`prefix`	only emitted if `--log.prefix` is set
`pid`	process id, only emitted if `--log.process` is set
`tid`	thread id, only emitted if `--log.thread` is set
`thread`	thread name, only emitted if `--log.thread-name` is set
`role`	server role (1 character), only emitted if `--log.role` is set
`level`	log level (e.g. `"WARN"`, `"INFO"`)
`file`	source file name of log message, only emitted if `--log.line-number` is set
`line`	source file line of log message, only emitted if `--log.line-number` is set
`function`	source file function name, only emitted if `--log.line-number` is set
`topic`	log topic name
`id`	log id (5 digit hexadecimal string), only emitted if `--log.ids` is set
`hostname`	hostname if `--log.hostname` is set
`message`	the actual log message payload
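
Because each log message is one JSON object per line, consuming the output from another program is straightforward. The Python sketch below parses a single line and uses only keys from the table above; the sample line itself is hypothetical (including its value types), and the function name is an assumption.

```python
import json


def summarize_json_log(line: str) -> str:
    """Turn one JSON log line into a short human-readable summary."""
    entry = json.loads(line)
    # "id" is only present if --log.ids is set, so treat it as optional.
    log_id = f"[{entry['id']}] " if "id" in entry else ""
    return f"{entry['level']} {log_id}{{{entry['topic']}}} {entry['message']}"


# Hypothetical sample line, built from the keys listed in the table.
sample = json.dumps({
    "time": "2019-03-28T09:55:23Z",
    "pid": 39028,
    "level": "INFO",
    "topic": "general",
    "id": "144fe",
    "message": "using storage engine 'rocksdb'",
})
```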

`--log.use-local-time`

Deprecated in: v3.5.0

Type: boolean

Use the local timezone instead of UTC.

This option can be specified without a value to enable it.

This option is deprecated. Use --log.time-format local-datestring instead.

`--log.use-microtime`

Deprecated in: v3.5.0

Type: boolean

Use Unix timestamps in seconds with microsecond precision.

This option can be specified without a value to enable it.

This option is deprecated. Use --log.time-format timestamp-micros instead.

random

`--random.generator`

Type: uint32

The random number generator to use (1 = MERSENNE, 2 = RANDOM, 3 = URANDOM, 4 = COMBINED). The options 2, 3, and 4 are deprecated and will be removed in a future version.

Default: 1

Possible values: 1, 2, 3, 4

1: a pseudo-random number generator using an implementation of the Mersenne Twister MT19937 algorithm
2: use a blocking random (or pseudo-random) number generator
3: use the non-blocking random (or pseudo-random) number generator supplied by the operating system
4: a combination of the blocking random number generator and the Mersenne Twister

server

`--server.ask-jwt-secret`

Type: boolean

If enabled, you are prompted for a JWT secret. This option is not compatible with --server.username and --server.password. If specified, it is used for all connections - even if a new connection to another server is created.

This option can be specified without a value to enable it.

`--server.authentication`

Type: boolean

Require authentication credentials when connecting (does not affect the server-side authentication settings).

This option can be specified without a value to enable it.

Default: true

`--server.connection-timeout`

Type: double

The connection timeout (in seconds).

Default: 5

`--server.database`

Type: string

The database name to use when connecting.

Default: _system

`--server.endpoint`

Type: string…

The endpoint to connect to. Use ’none’ to start without a server. Use http+ssl:// as the scheme to connect to an SSL-secured server endpoint, otherwise http+tcp:// or unix://.

Default: tcp://127.0.0.1:8529

`--server.jwt-secret-keyfile`

Type: string

If enabled, the JWT secret is loaded from the given file. This option is not compatible with --server.ask-jwt-secret, --server.username and --server.password. If specified, it is used for all connections - even if a new connection to another server is created.

`--server.max-packet-size`

Type: uint64

The maximum packet size (in bytes) for client/server communication.

Default: 1073741824

`--server.password`

Type: string

The password to use when connecting. If not specified and authentication is required, you are prompted for a password. In startup options, you can wrap the names of environment variables in at signs to use their value, like @ARANGO_PASSWORD@. This helps avoid exposing the password, for example in the process list. A literal @ needs to be escaped as @@.
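
The at-sign syntax can be illustrated with a small Python sketch that expands @NAME@ placeholders from a mapping and turns @@ into a literal @. This is not ArangoDB's implementation: the function name is hypothetical, and raising KeyError for an unset variable is an assumption, since the behavior in that case is not described here.

```python
import re

# Either an @NAME@ placeholder or an escaped at sign (@@).
_PLACEHOLDER = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)@|@@")


def expand_at_signs(value: str, env) -> str:
    """Expand @NAME@ from env and unescape @@ (illustrative only)."""
    def replace(match):
        if match.group(0) == "@@":
            return "@"
        # Assumption: an unset variable is an error (KeyError).
        return env[match.group(1)]
    return _PLACEHOLDER.sub(replace, value)
```

In practice, `env` would typically be `os.environ`.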

`--server.request-timeout`

Type: double

The request timeout (in seconds).

Default: 1200

`--server.username`

Type: string

The username to use when connecting.

Default: root

ssl

`--ssl.protocol`

Type: uint64

The SSL protocol (1 = SSLv2 (unsupported), 2 = SSLv2 or SSLv3 (negotiated), 3 = SSLv3, 4 = TLSv1, 5 = TLSv1.2, 6 = TLSv1.3, 9 = generic TLS (negotiated))

Default: 5

Possible values: 1, 2, 3, 4, 5, 6, 9

temp

`--temp.path`

Type: string

The path for temporary files.

ArangoDB uses the path for storing temporary files, for extracting data from uploaded zip files (e.g. for Foxx services), and other things.

Ideally, the temporary path is set to an instance-specific subdirectory of the operating system’s temporary directory. To avoid data loss, the temporary path should not overlap with any directories that contain important data, for example, the instance’s database directory.

If you set the temporary path to the same directory as the instance’s database directory, a startup error is logged and the startup is aborted.