The configuration options used by the Cassandra PV Archiver server are
        controlled through a configuration file in the
        YAML format.
        The configuration file is located in the conf
        directory of the binary distribution or in the
        /etc/cassandra-pv-archiver directory when using the
        Debian package.
        In either case, the configuration file is called
        cassandra-pv-archiver.yaml.
        It is not an error if the configuration file does not exist at the
        expected location.
        In this case the server starts using default values for all
        configuration options.
      
        The path to the configuration file can be overridden by specifying the
        --config-file command line option to the
        cassandra-pv-archiver-server script.
        When this command line option is specified, the default location is
        not used.
        Unlike the configuration file in the default location, a configuration
        file specified with the --config-file option must exist
        and the server does not start if it is missing. 
      
The configuration options are organized in a hierarchy. For the rest of this document, the first level of this hierarchy is called the section. The hierarchical path to a configuration option can either be specified inline or through indentation. For example, specifying
level1a:
  option1: value1
  level2:
    option1: value2
level1b:
  option1: value3
is equivalent to specifying
level1a.option1: value1
level1a.level2.option1: value2
level1b.option1: value3
The default values specified in this document are the default values that are used when a configuration option is not specified at all, not the value of the option that is specified in the configuration file distributed as part of the binary distribution or Debian package.
This section only describes the part of the configuration that is stored in the per-server configuration file, not the configuration that is stored in the database. For the latter, please refer to Section 4, “Administrative user interface”.
          The cassandra section configures the server’s
          connection to the Cassandra cluster.
        
            The cassandra.hosts option specifies the list
            of hosts which are used for initially establishing the connection
            with the Cassandra cluster.
            This list does not have to contain all Cassandra hosts because all
            hosts in the cluster are detected automatically once the
            connection to at least one host has been established.
            However, it is still a good idea to specify more than one host here
            because this will ensure that the connection can be established
            even if one of the hosts is down when the Cassandra PV Archiver
            server is started.
          
            By default, the list only contains localhost.
            The list of hosts has to be specified as a YAML list, using the
            regular or the inline list syntax. For example, a list specifying
            three hosts might look like this:
          
cassandra:
  hosts:
    - server1.example.com
    - server2.example.com
    - server3.example.com
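The same list can also be written with the inline list syntax mentioned above. A minimal sketch, reusing the example host names from the preceding listing:

```yaml
cassandra:
  # Inline (flow) list syntax; equivalent to the block list shown above.
  hosts: [server1.example.com, server2.example.com, server3.example.com]
```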
            The cassandra.port option specifies the port
            number on which the Cassandra hosts are listening for incoming
            connections (for Cassandra’s native protocol).
            The default value is 9042, which is also the default value used by
            Cassandra.
          
            The cassandra.keyspace option specifies the name
            of the keyspace in which the Cassandra PV Archiver stores its data.
            The default value is pv_archive.
            While strictly speaking mixed-case names are allowed, the use of
            such names is discouraged because many tools have problems with them
            and they typically require quoting.
            For this reason, the keyspace name should be all lower-case when
            possible. 
          
            The cassandra.username option specifies the
            username that is specified when authenticating with the Cassandra
            cluster.
            When empty, the connection to the Cassandra cluster is established
            without trying to authenticate the client.
            The default value is the empty string (no authentication). 
          
            The cassandra.password option specifies the
            password that is specified when authenticating with the Cassandra
            cluster.
            The password is only used when the username is not empty.
            The default value is the empty string. 
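As an illustration, the connection options described so far might be combined as in the following sketch. The port and keyspace are the documented defaults; the user name and password are hypothetical placeholders, not values defined by the Cassandra PV Archiver.

```yaml
cassandra:
  port: 9042             # documented default (Cassandra's native protocol port)
  keyspace: pv_archive   # documented default; all lower-case as recommended
  username: archiver     # hypothetical user name; leave empty to disable authentication
  password: change-me    # hypothetical password; only used when username is not empty
```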
          
            The cassandra.useLocalConsistencyLevel option
            specifies whether the data-center-local variants of the
            consistency levels are used for all database operations.
            The default value is false.
            This option only has an effect when the Cassandra cluster is
            distributed across multiple data centers.
            By setting this option to true, the
            LOCAL_QUORUM consistency level is used where
            usually the QUORUM consistency level would be
            used.
            In the same way, the LOCAL_SERIAL consistency
            level is used instead of the SERIAL consistency
            level.
          
This option must only be enabled if only a single data center makes modifications to the data and all other data centers only use the database for read access. In this case, enabling this option can reduce the latency of operations because the client only has to wait for nodes local to the data center. The most likely scenario is a situation where all nodes running the Cassandra PV Archiver servers are in a single data center, but there is a second data center to which all data is replicated for disaster recovery.
| Important |
|---|
| Never enable this option when there is more than one data center that is used for write access to the database. In this case, enabling this option will lead to data corruption because operations that are expected to result in a consistent state might actually leave inconsistencies. This option merely provides a performance optimization, so in case of doubt, leave it at its default value of false. |
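For the disaster-recovery scenario described above (all archiver servers in one data center, with a second data center used only as a replication target), the setting might look like the following sketch. It assumes that no process in the second data center ever writes to the database.

```yaml
cassandra:
  # Only safe when a single data center performs all writes.
  # Uses LOCAL_QUORUM / LOCAL_SERIAL instead of QUORUM / SERIAL.
  useLocalConsistencyLevel: true
```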
          The server section configures the archiving server
          (for example the ID assigned to each server instance and on which
          address and ports the archiving server listens).
          While the address and port settings can usually be left at their
          defaults, the server’s ID has to be set.
        
            Each server in the cluster is identified by a unique ID (UUID).
            As this UUID has to be unique for each server, there is no
            reasonable default value, so it has to be specified explicitly.
            The server’s UUID can be specified using the
            server.uuid option.
            Alternatively, it can be specified by passing the
            --server-uuid parameter to the server’s start
            script.
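A minimal sketch of setting the UUID in the configuration file follows. The UUID shown is a made-up placeholder; each server needs its own freshly generated value.

```yaml
server:
  # Placeholder value for illustration only; generate a distinct UUID per server.
  uuid: 2e8d3f0a-5c4b-4f1e-9a7d-0123456789ab
```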
          
| Important |
|---|
| Starting two server instances with the same UUID results in data corruption, regardless of whether these instances are started on the same host or different hosts. For this reason, care should be taken to ensure that each UUID is only used for exactly one process. |
            As an alternative to specifying the server’s UUID in the
            configuration file or on the command line, it is possible to have a
            separate file that specifies the UUID.
            The path to this file can be specified with the
            server.uuidFile option.
            If this file exists, it is expected to contain a single line with
            the UUID that is then used as the server’s UUID.
            If this file does not exist, the server tries to create it on
            startup, using a randomly generated UUID.
            By default this option is not set so that the server expects an
            explicitly specified UUID.
            This option is particularly useful in an environment where servers
            are deployed automatically and should thus automatically generate a
            UUID the first time they are started.
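For an automated deployment, the UUID file might be configured as in the following sketch. The path is a hypothetical example; any location that the server process can read and write should work under the behavior described above.

```yaml
server:
  # Hypothetical path. If the file is missing, the server tries to create it
  # on startup with a randomly generated UUID.
  uuidFile: /var/lib/cassandra-pv-archiver/server-uuid.txt
```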
          
            The server.listenAddress option specifies the IP
            address (or the hostname resolving to the IP address) on which the
            server listens for incoming connections.
            If it is empty (the default), the server listens on the first
            non-loopback address that is found.
            This means that typically, this option only has to be set for
            servers that have more than one (non-loopback) interface.
          
The specified address is used for the administrative user-interface, the archive-access interface, and the inter-node communication interface. In addition to the specified address, the administrative user-interface and the archive-access interface are also made available on the loopback address.
            This option should never be set to localhost,
            127.0.0.1, ::1, or any other
            loopback address because other servers will try to contact the
            server on the specified address and obviously this will lead to
            unexpected results when the address is a loopback address.
          
            The server.adminPort option specifies the TCP
            port number on which the administrative user-interface is made
            available.
            The default is port 4812.
          
            The server.archiveAccessPort option specifies the
            TCP port number on which the archive-access interface is made
            available.
            The default is port 9812.
            The archive-access interface is the web-interface through which
            clients access the data stored in the archive.
          
            The server.interNodeCommunicationPort option
            specifies the TCP port number on which the inter-node communication
            interface is made available.
            The default is port 9813.
            As the name suggests, the inter-node communication interface is
            used for internal communication between Cassandra PV Archiver
            servers that is needed in order to coordinate the cluster operation
            (for example in case of configuration changes).
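The address and port options of this section might be combined as in the following sketch. The port numbers are the documented defaults; the listen address is a hypothetical example taken from the documentation address range and has to be replaced with a real non-loopback address of the host.

```yaml
server:
  listenAddress: 192.0.2.10          # hypothetical; never use a loopback address
  adminPort: 4812                    # administrative user-interface (default)
  archiveAccessPort: 9812            # archive-access interface (default)
  interNodeCommunicationPort: 9813   # inter-node communication (default)
```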
          
          The throttling section contains options for
          throttling database operations.
          The Cassandra PV Archiver server tries to run database operations in
          parallel in order to reduce the effective latency of complex
          operations (e.g. operations involving many channels).
          However, depending on the exact configuration of the Cassandra cluster
          (for example the size of the cluster, network bandwidth and latency,
          hardware used for the cluster, load caused by other applications), the
          number of operations that can safely be run in parallel might differ. 
        
Running too many operations in parallel results in some of the operations timing out. This can be avoided by reducing the number of operations allowed to run in parallel. Conversely, when operations never time out, one might try to increase the limits in order to improve performance.
          The limits can be controlled separately for read and write operations
          and for operations touching the channels’ meta-data (for example the
          configuration and information about sample buckets) and the actual
          samples.
          Operations modifying channel meta-data are typically carried out using
          the SERIAL consistency level, so in this case write
          operations typically are more expensive than read operations.
          Thus the limit for write operations should be lower than the limit for
          read operations.
          In the case of operations dealing with actual samples, read operations
          typically are more expensive than write operations (due to how
          Cassandra works internally), so the limit for read operations should be
          lower than the limit for write operations.
        
| Note |
|---|
| When trying to optimize the throttling settings, it can be helpful to connect to the Cassandra PV Archiver server via JMX (for example using JConsole from the JDK). The current number of operations that are running and waiting is exposed via MBeans, so that it is possible to monitor how changing the throttling parameters affects the operation. |
            The
            throttling.maxConcurrentChannelMetaDataReadStatements
            configuration option controls how many read operations for channel
            meta-data should be allowed to run in parallel.
            Usually, these are statements reading from the
            channels, channels_by_server,
            and pending_channel_operations_by_server tables.
            Typically, this limit should be greater than the limit set by the
            throttling.maxConcurrentChannelMetaDataWriteStatements
            option.
            The default value is 64.
          
            The
            throttling.maxConcurrentChannelMetaDataWriteStatements
            configuration option controls how many write operations for channel
            meta-data should be allowed to run in parallel.
            Usually, these are statements writing to the
            channels, channels_by_server,
            and pending_channel_operations_by_server tables.
            Typically, such operations are light-weight transactions and thus 
            this limit should be less than the limit set by the
            throttling.maxConcurrentChannelMetaDataReadStatements
            option.
            The default value is 16.
          
            The
            throttling.maxConcurrentControlSystemSupportReadStatements
            configuration option controls how many read operations the
            control-system supports (all of them combined) are allowed to run in
            parallel.
            Usually, these are statements that read actual samples and thus read
            from the tables used by the control-system support(s).
            Typically, this limit should be less than the limit set by the
            throttling.maxConcurrentControlSystemSupportWriteStatements
            option, but significantly greater than the limit set by the
            throttling.maxConcurrentChannelMetaDataReadStatements
            option.
            The default value is 128.
          
            The
            throttling.maxConcurrentControlSystemSupportWriteStatements
            configuration option controls how many write operations the
            control-system supports (all of them combined) are allowed to run in
            parallel.
            Usually, these are statements that write actual samples (for each
            sample that is written, an INSERT statement is
            triggered) and that thus write to the tables used by the
            control-system support(s).
            Typically, this limit should be greater than the limit set by the
            throttling.maxConcurrentControlSystemSupportReadStatements
            option and significantly greater than the limits set by the
            throttling.maxConcurrentChannelMetaDataReadStatements
            and
            throttling.maxConcurrentChannelMetaDataWriteStatements
            options.
            The default value is 512.
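To make the relationship between the four limits easier to see, the following sketch spells out the documented default values explicitly. It should behave the same as omitting the section entirely and is meant as a starting point for tuning.

```yaml
throttling:
  # Channel meta-data: writes are more expensive, so their limit is lower.
  maxConcurrentChannelMetaDataReadStatements: 64
  maxConcurrentChannelMetaDataWriteStatements: 16
  # Samples: reads are more expensive, so their limit is lower.
  maxConcurrentControlSystemSupportReadStatements: 128
  maxConcurrentControlSystemSupportWriteStatements: 512
```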
          
          The controlSystemSupport section contains the
          configuration options for the various control-system supports.
          For each available control-system support, this section has a
          corresponding sub-section.
          The configuration options in these sub-sections are not handled by
          the Cassandra PV Archiver server itself but passed as-is to the
          respective control-system support.
          For this reason, the names of the available options entirely depend
          on the respective control-system support.
          Please refer to the documentation of the respective control-system
          support for details.
          For example, the documentation for the Channel Access control-system
          support is available in Appendix C, Channel Access control-system support.
        
The Cassandra PV Archiver server is based on the Spring Boot framework. For this reason, the options supported for configuring logging are actually the same ones that are supported by Spring Boot. These options are documented in the Spring Boot Reference Guide. The Cassandra PV Archiver server uses Logback as its logging backend, so the specifics of how to configure Logback for Spring Boot might also be interesting.
In order to get started more easily, this section contains a few pointers on how the logging configuration can be modified.
The log level can be set both globally and for specific subtrees of the class hierarchy. When specifying different log levels for different parts of the hierarchy, more specific definitions (the ones covering a smaller sub-tree of the hierarchy) take precedence over more general definitions.
            The available log levels are ERROR,
            WARN, INFO,
            DEBUG, and TRACE.
            Each log level contains the preceding log levels (for example
            the log level INFO also contains
            ERROR and WARN).
          
            The log level for the root of the hierarchy (that is used for all
            loggers that do not have a more specific definition) is set through
            the logging.level.root option.
            By default, this log level is set to INFO.
            This results in a lot of diagnostic messages being logged, so you
            might want to consider reducing it to WARN.
          
            The log level for individual parts of the hierarchy can be set by
            using a configuration option containing the path to the respective
            hierarchy level.
            For example, in order to enable DEBUG messages for all classes in
            the com.aquenos.cassandra.pvarchiver package (and
            its sub-packages), one could set
            logging.level.com.aquenos.cassandra.pvarchiver to
            DEBUG.
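The two settings discussed here might be written as in the following sketch. It assumes the Spring Boot naming scheme, in which log levels are placed under logging.level followed by the logger name (with root denoting the root logger).

```yaml
logging:
  level:
    # Reduce the general verbosity from the default INFO to WARN.
    root: WARN
    # More specific definitions take precedence over the root level.
    com.aquenos.cassandra.pvarchiver: DEBUG
```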
          
            The path to the log file can be specified using the
            logging.file option.
            If no log file is specified (the default),
            log messages are only written to the standard output.
            In order to log to more than one log file (for example depending
            on the log level or the class writing the log message) or in order
            to disable logging to the standard output, one has to specify a
            custom Logback configuration file (see the next section).
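A minimal sketch of enabling a log file follows. The path is a hypothetical example; the directory has to be writable by the user running the server.

```yaml
logging:
  # Hypothetical path; log messages are written to this file.
  file: /var/log/cassandra-pv-archiver/server.log
```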
          
            When the configuration options directly available through the
            Cassandra PV Archiver server configuration-file are not sufficient,
            one can specify a custom Logback configuration file.
            The path to this file is specified using the
            logging.config option.
            The information available in the Spring Boot Reference Guide
            might be useful when using this option.
          
          In addition to the configuration options that can be specified in the
          server’s configuration file, there are two environment variables that
          can be passed to the server’s startup script.
          When using the Debian package, these environment variables should be
          set in the file
          /etc/default/cassandra-pv-archiver-server.
        
          The first environment variable is JAVA_HOME.
          It specifies the path to the JRE.
          When starting the Java process, the server’s startup script uses the
          $JAVA_HOME/bin/java executable
          (%JAVA_HOME%/bin/java.exe on Windows).
          When JAVA_HOME is not set, the startup script uses the
          java executable that is in the search
          PATH of the shell executing the startup script.
        
          The second environment variable is JAVA_OPTS.
          When set, the value of this environment variable is added to the
          parameters passed to the java executable.
          It can be used to configure JVM options like the maximum heap size.