From b201d73f8d3c5a08844ad3fa5dfce0540d79cdb7 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Fri, 10 Oct 2025 16:57:54 +0200 Subject: [PATCH 1/6] Chore: Convert `install/**` into MyST format rst2myst convert -R ... --- docs/install/_control-linux.md | 26 + docs/install/_control-linux.rst | 23 - docs/install/_post-install.md | 34 ++ docs/install/_post-install.rst | 33 -- docs/install/cloud/aws/aws-terraform-setup.md | 180 +++++++ .../install/cloud/aws/aws-terraform-setup.rst | 201 ------- docs/install/cloud/aws/ec2-setup.md | 204 +++++++ docs/install/cloud/aws/ec2-setup.rst | 206 ------- .../install/cloud/aws/{index.rst => index.md} | 18 +- docs/install/cloud/aws/s3-setup.md | 102 ++++ docs/install/cloud/aws/s3-setup.rst | 102 ---- .../cloud/azure/{index.rst => index.md} | 19 +- .../azure/{terraform.rst => terraform.md} | 200 ++++--- docs/install/cloud/azure/vm.md | 158 ++++++ docs/install/cloud/azure/vm.rst | 180 ------- docs/install/cloud/index.md | 17 + docs/install/cloud/index.rst | 18 - docs/install/configure.md | 58 ++ docs/install/configure.rst | 63 --- docs/install/container/docker.md | 495 +++++++++++++++++ docs/install/container/docker.rst | 507 ------------------ docs/install/container/index.md | 59 ++ docs/install/container/index.rst | 61 --- docs/install/container/kubernetes/index.md | 42 ++ docs/install/container/kubernetes/index.rst | 45 -- .../kubernetes/kubernetes-operator.md | 131 +++++ .../kubernetes/kubernetes-operator.rst | 138 ----- .../container/kubernetes/kubernetes.md | 369 +++++++++++++ .../container/kubernetes/kubernetes.rst | 387 ------------- docs/install/debian-ubuntu.md | 90 ++++ docs/install/debian-ubuntu.rst | 77 --- docs/install/{index.rst => index.md} | 100 ++-- docs/install/redhat.md | 102 ++++ docs/install/redhat.rst | 97 ---- docs/install/tarball.md | 74 +++ docs/install/tarball.rst | 66 --- docs/install/windows.md | 84 +++ docs/install/windows.rst | 74 --- 38 files changed, 2392 insertions(+), 2448 deletions(-) create mode 
100644 docs/install/_control-linux.md delete mode 100644 docs/install/_control-linux.rst create mode 100644 docs/install/_post-install.md delete mode 100644 docs/install/_post-install.rst create mode 100644 docs/install/cloud/aws/aws-terraform-setup.md delete mode 100644 docs/install/cloud/aws/aws-terraform-setup.rst create mode 100644 docs/install/cloud/aws/ec2-setup.md delete mode 100644 docs/install/cloud/aws/ec2-setup.rst rename docs/install/cloud/aws/{index.rst => index.md} (51%) create mode 100644 docs/install/cloud/aws/s3-setup.md delete mode 100644 docs/install/cloud/aws/s3-setup.rst rename docs/install/cloud/azure/{index.rst => index.md} (53%) rename docs/install/cloud/azure/{terraform.rst => terraform.md} (52%) create mode 100644 docs/install/cloud/azure/vm.md delete mode 100644 docs/install/cloud/azure/vm.rst create mode 100644 docs/install/cloud/index.md delete mode 100644 docs/install/cloud/index.rst create mode 100644 docs/install/configure.md delete mode 100644 docs/install/configure.rst create mode 100644 docs/install/container/docker.md delete mode 100644 docs/install/container/docker.rst create mode 100644 docs/install/container/index.md delete mode 100644 docs/install/container/index.rst create mode 100644 docs/install/container/kubernetes/index.md delete mode 100644 docs/install/container/kubernetes/index.rst create mode 100644 docs/install/container/kubernetes/kubernetes-operator.md delete mode 100644 docs/install/container/kubernetes/kubernetes-operator.rst create mode 100644 docs/install/container/kubernetes/kubernetes.md delete mode 100644 docs/install/container/kubernetes/kubernetes.rst create mode 100644 docs/install/debian-ubuntu.md delete mode 100644 docs/install/debian-ubuntu.rst rename docs/install/{index.rst => index.md} (62%) create mode 100644 docs/install/redhat.md delete mode 100644 docs/install/redhat.rst create mode 100644 docs/install/tarball.md delete mode 100644 docs/install/tarball.rst create mode 100644 
docs/install/windows.md delete mode 100644 docs/install/windows.rst diff --git a/docs/install/_control-linux.md b/docs/install/_control-linux.md new file mode 100644 index 00000000..0df0c3a9 --- /dev/null +++ b/docs/install/_control-linux.md @@ -0,0 +1,26 @@ +# Control CrateDB on Linux + +You can control the `crate` service with the `systemctl` utility program: + +``` +sudo systemctl COMMAND crate +``` + +Replace `COMMAND` with `start`, `stop`, `restart`, `status` and +so on. + +# Notes + +After the installation is finished, the `crate` service should be installed, +but may not be configured to start automatically. Use the following command to +start CrateDB: + +``` +sudo systemctl start crate +``` + +In order to make the service reboot-safe, invoke: + +``` +sudo systemctl enable crate +``` diff --git a/docs/install/_control-linux.rst b/docs/install/_control-linux.rst deleted file mode 100644 index a58faf52..00000000 --- a/docs/install/_control-linux.rst +++ /dev/null @@ -1,23 +0,0 @@ -Control CrateDB on Linux -======================== - -You can control the ``crate`` service with the ``systemctl`` utility program:: - - sudo systemctl COMMAND crate - -Replace ``COMMAND`` with ``start``, ``stop``, ``restart``, ``status`` and -so on. - -Notes -===== - -After the installation is finished, the ``crate`` service should be installed, -but may not be configured to start automatically. 
Use the following command to -start CrateDB:: - - sudo systemctl start crate - -In order to make the service reboot-safe, invoke:: - - sudo systemctl enable crate - diff --git a/docs/install/_post-install.md b/docs/install/_post-install.md new file mode 100644 index 00000000..12ec3e94 --- /dev/null +++ b/docs/install/_post-install.md @@ -0,0 +1,34 @@ +# Post-install notes + +After successfully installing CrateDB, for example on your workstation, the web-based +Admin UI can be visited at: + +``` +http://localhost:4200/ +``` + +:::{SEEALSO} +If you are new to CrateDB, you may want to follow up by {ref}`taking the guided tour `. +::: + +Also, let us outline those information entrypoints as suggestions to explore next: + +- Read more details about the {ref}`crate-reference:config`. +- The background about {ref}`bootstrap-checks`. +- Multi-node configuration within the section about {ref}`clustering` + and {ref}`going-into-production`. +- When operating a CrateDB cluster in production, please also take + {ref}`performance tuning ` into consideration. + +:::{NOTE} +This kind of installation flavor will let you quickly set up and start a +**single-node cluster**. When adding additional CrateDB nodes, in order to +make it form a **multi-node cluster**, you will need to reset (remove) the +cluster state after changing the configuration. +::: + +:::{CAUTION} +Please make sure to read the {ref}`upgrade-planning`, and the guidelines about {ref}`rolling +upgrades ` and {ref}`full restart upgrades `, +before upgrading a running cluster. +::: diff --git a/docs/install/_post-install.rst b/docs/install/_post-install.rst deleted file mode 100644 index a43071f1..00000000 --- a/docs/install/_post-install.rst +++ /dev/null @@ -1,33 +0,0 @@ -Post-install notes -================== - -After successfully installing CrateDB, for example on your workstation, the web-based -Admin UI can be visited at:: - - http://localhost:4200/ - -.. 
SEEALSO:: - - If you are new to CrateDB, you may want to follow up by :ref:`taking the guided tour `. - -Also, let us outline those information entrypoints as suggestions to explore next: - -* Read more details about the :ref:`crate-reference:config`. -* The background about :ref:`bootstrap-checks`. -* Multi-node configuration within the section about :ref:`clustering` - and :ref:`going-into-production`. -* When operating a CrateDB cluster in production, please also take - :ref:`performance tuning ` into consideration. - -.. NOTE:: - - This kind of installation flavor will let you quickly set up and start a - **single-node cluster**. When adding additional CrateDB nodes, in order to - make it form a **multi-node cluster**, you will need to reset (remove) the - cluster state after changing the configuration. - -.. CAUTION:: - - Please make sure to read the :ref:`upgrade-planning`, and the guidelines about :ref:`rolling - upgrades ` and :ref:`full restart upgrades `, - before upgrading a running cluster. diff --git a/docs/install/cloud/aws/aws-terraform-setup.md b/docs/install/cloud/aws/aws-terraform-setup.md new file mode 100644 index 00000000..4b4496b2 --- /dev/null +++ b/docs/install/cloud/aws/aws-terraform-setup.md @@ -0,0 +1,180 @@ +(aws-terraform-setup)= + +# Running CrateDB via Terraform + +In {ref}`ec2_setup`, we elaborated on how to leverage EC2's functionality to set +up a CrateDB cluster. Here, we will explore how to automate this kind of setup. + +[Terraform] is an infrastructure as code tool, often used as an abstraction +layer on top of a cloud's management APIs. Instead of creating cloud resources +manually, the target state is specified via configuration files which can also +be managed in a version control system. 
This brings some advantages, such as but +not limited to: + +- Reproducibility of deployments, e.g., across different accounts or in case of + disaster recovery +- Enables common development workflows like code reviews, automated testing, and + so on +- Better prediction and tracing of infrastructure changes + +The [crate-terraform] repository provides a predefined configuration template +of various AWS resources to form a CrateDB cluster on AWS (such as EC2 +instances, load balancer, etc). This eliminates the need to manually compose all +required resources and their interactions. + +:::{SEEALSO} +Engage with us in the [community post] on Terraform deployments for any +questions or feedback! +::: + +:::{CAUTION} +The provided configuration is meant to be used for development or testing +purposes and does not aim to fulfil all needs of a production environment. +::: + +## Prerequisites + +Before creating the configuration to launch your CrateDB cluster, the following +prerequisites should be fulfilled: + +1. The Terraform CLI is installed as per + [Terraform's installation guide] +2. The git CLI is installed as per [git's installation guide] +3. AWS credentials are configured for Terraform. If you already have a + configured AWS CLI setup, Terraform will reuse this configuration. If not, + see the [AWS provider] documentation on authentication. + +## Deployment configuration + +The CrateDB Terraform configuration consists of a set of variables to customize +your deployment. 
Create a new file `main.tf` with the following content and +adjust variable values as needed: + +``` +module "cratedb-cluster" { + source = "github.com/crate/crate-terraform.git/aws" + + # Global configuration items for naming/tagging resources + config = { + project_name = "example-project" + environment = "test" + owner = "Crate.IO" + team = "Customer Engineering" + } + + # CrateDB-specific configuration + crate = { + # Java Heap size in GB available to CrateDB + heap_size_gb = 2 + + cluster_name = "crate-cluster" + + # The number of nodes the cluster will consist of + cluster_size = 2 + + # Enables a self-signed SSL certificate + ssl_enable = true + } + + # The disk size in GB to use for CrateDB's data directory + disk_size_gb = 512 + + # The AWS region + region = "eu-central-1" + + # The VPC to deploy to + vpc_id = "vpc-1234567" + + # Applicable subnets of the VPC + subnet_ids = ["subnet-123456", "subnet-123457"] + + # The corresponding availability zones of above subnets + availability_zones = ["eu-central-1b", "eu-central-1a"] + + # The SSH key pair for EC2 instances + ssh_keypair = "cratedb-cluster" + + # Enable SSH access to EC2 instances + ssh_access = true +} + +output "cratedb" { + value = module.cratedb-cluster + sensitive = true +} +``` + +The AWS-specific variables need to be adjusted according to your environment: + +| Variable | Explanation | How to obtain | +| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------- | +| `region` | The geographic region in which to create the AWS resources | [List of available AWS regions] | +| `vpc_id` | The ID of the Virtual Private Cloud (VPC) in which the EC2 instances will be launched | [How to view VPC properties] | +| `subnet_ids` | Each VPC consists of 
multiple subnets, typically distributed across availability zones. Choose the ones you want to launch EC2 instances in. | [How to view subnet properties] | +| `availability_zones` | The availability zones of the above subnets. The positions in the `availability_zones` array must match with the corresponding element in `subnet_ids`. In the example above, `subnet-123456` is in `eu-central-1b`, and `subnet-123457` in `eu-central-1a`. | [How to view subnet properties] | +| `ssh_keypair` | The EC2 key pair used for SSH access. This must be an already existing key pair name. | [How to create EC2 key pairs] | + +## Execution + +Once all variables are configured properly, Terraform needs to be initialized: + +```bash +terraform init +``` + +To proceed with executing the creation of resources, apply the configuration. +There will be a final confirmation prompt before any changes are applied to your +AWS account: + +```bash +terraform apply +``` + +Once the execution succeeded, a message similar to the one below is shown: + +```bash +Apply complete! Resources: 22 added, 0 changed, 0 destroyed. + +Outputs: + +cratedb = +``` + +Terraform internally tracks the state of each resource it manages, including +certain outputs with details on the created Cluster. As those details include +credentials, they are marked as sensitive and not shown in the output above. +To view the output, run: + +```bash +terraform output cratedb +``` + +The output variable `cratedb_application_url` points to the load balancer with +the port of CrateDB's Admin UI. Opening that URL in your browser should show a +password prompt on which you can authenticate using `cratedb_username` and +`cratedb_password`. + +## Deprovisioning + +If the CrateDB cluster is not needed anymore, you can easily instruct Terraform +to destroy all associated resources: + +```bash +terraform destroy +``` + +:::{CAUTION} +Destroying the cluster will permanently delete all data stored on it. 
Use +{ref}`snapshots ` to create a backup on S3 if needed. +::: + +[aws provider]: https://registry.terraform.io/providers/hashicorp/aws/latest/docs +[community post]: https://community.cratedb.com/t/deploying-cratedb-to-the-cloud-via-terraform/849 +[crate-terraform]: https://github.com/crate/crate-terraform +[git's installation guide]: https://git-scm.com/downloads +[how to create ec2 key pairs]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/create-key-pairs.html +[how to view subnet properties]: https://docs.aws.amazon.com/vpc/latest/userguide/configure-subnets.html +[how to view vpc properties]: https://docs.aws.amazon.com/vpc/latest/userguide/work-with-default-vpc.html#view-default-vpc +[list of available aws regions]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions +[terraform]: https://www.terraform.io +[terraform's installation guide]: https://www.terraform.io/downloads.html diff --git a/docs/install/cloud/aws/aws-terraform-setup.rst b/docs/install/cloud/aws/aws-terraform-setup.rst deleted file mode 100644 index 7d30010b..00000000 --- a/docs/install/cloud/aws/aws-terraform-setup.rst +++ /dev/null @@ -1,201 +0,0 @@ -.. _aws_terraform_setup: - -============================= -Running CrateDB via Terraform -============================= - -In :ref:`ec2_setup`, we elaborated on how to leverage EC2's functionality to set -up a CrateDB cluster. Here, we will explore how to automate this kind of setup. - -`Terraform`_ is an infrastructure as code tool, often used as an abstraction -layer on top of a cloud's management APIs. Instead of creating cloud resources -manually, the target state is specified via configuration files which can also -be managed in a version control system. 
This brings some advantages, such as but -not limited to: - -- Reproducibility of deployments, e.g., across different accounts or in case of - disaster recovery -- Enables common development workflows like code reviews, automated testing, and - so on -- Better prediction and tracing of infrastructure changes - -The `crate-terraform`_ repository provides a predefined configuration template -of various AWS resources to form a CrateDB cluster on AWS (such as EC2 -instances, load balancer, etc). This eliminates the need to manually compose all -required resources and their interactions. - -.. SEEALSO:: - - Engage with us in the `community post`_ on Terraform deployments for any - questions or feedback! - -.. CAUTION:: - - The provided configuration is meant to be used for development or testing - purposes and does not aim to fulfil all needs of a production environment. - -Prerequisites -============= - -Before creating the configuration to launch your CrateDB cluster, the following -prerequisites should be fulfilled: - -1. The Terraform CLI is installed as per - `Terraform's installation guide`_ -2. The git CLI is installed as per `git's installation guide`_ -3. AWS credentials are configured for Terraform. If you already have a - configured AWS CLI setup, Terraform will reuse this configuration. If not, - see the `AWS provider`_ documentation on authentication. - -Deployment configuration -======================== - -The CrateDB Terraform configuration consists of a set of variables to customize -your deployment. Create a new file ``main.tf`` with the following content and -adjust variable values as needed: - -.. 
code-block:: - - module "cratedb-cluster" { - source = "github.com/crate/crate-terraform.git/aws" - - # Global configuration items for naming/tagging resources - config = { - project_name = "example-project" - environment = "test" - owner = "Crate.IO" - team = "Customer Engineering" - } - - # CrateDB-specific configuration - crate = { - # Java Heap size in GB available to CrateDB - heap_size_gb = 2 - - cluster_name = "crate-cluster" - - # The number of nodes the cluster will consist of - cluster_size = 2 - - # Enables a self-signed SSL certificate - ssl_enable = true - } - - # The disk size in GB to use for CrateDB's data directory - disk_size_gb = 512 - - # The AWS region - region = "eu-central-1" - - # The VPC to deploy to - vpc_id = "vpc-1234567" - - # Applicable subnets of the VPC - subnet_ids = ["subnet-123456", "subnet-123457"] - - # The corresponding availability zones of above subnets - availability_zones = ["eu-central-1b", "eu-central-1a"] - - # The SSH key pair for EC2 instances - ssh_keypair = "cratedb-cluster" - - # Enable SSH access to EC2 instances - ssh_access = true - } - - output "cratedb" { - value = module.cratedb-cluster - sensitive = true - } - -The AWS-specific variables need to be adjusted according to your environment: - -+------------------------+--------------------------------------------------------------+----------------------------------+ -| Variable | Explanation | How to obtain | -+========================+==============================================================+==================================+ -| ``region`` | The geographic region in which to create the AWS resources | `List of available AWS regions`_ | -+------------------------+--------------------------------------------------------------+----------------------------------+ -| ``vpc_id`` | The ID of the Virtual Private Cloud (VPC) in which the EC2 | `How to view VPC properties`_ | -| | instances will be launched | | 
-+------------------------+--------------------------------------------------------------+----------------------------------+ -| ``subnet_ids`` | Each VPC consists of multiple subnets, typically distributed | `How to view subnet properties`_ | -| | across availability zones. Choose the ones you want to | | -| | launch EC2 instances in. | | -+------------------------+--------------------------------------------------------------+----------------------------------+ -| ``availability_zones`` | The availability zones of the above subnets. | `How to view subnet properties`_ | -| | The positions in the ``availability_zones`` array must match | | -| | with the corresponding element in ``subnet_ids``. | | -| | In the example above, ``subnet-123456`` is in | | -| | ``eu-central-1b``, and ``subnet-123457`` in | | -| | ``eu-central-1a``. | | -+------------------------+--------------------------------------------------------------+----------------------------------+ -| ``ssh_keypair`` | The EC2 key pair used for SSH access. This must be an | `How to create EC2 key pairs`_ | -| | already existing key pair name. | | -+------------------------+--------------------------------------------------------------+----------------------------------+ - -Execution -========= - -Once all variables are configured properly, Terraform needs to be initialized: - -.. code-block:: bash - - terraform init - -To proceed with executing the creation of resources, apply the configuration. -There will be a final confirmation prompt before any changes are applied to your -AWS account: - -.. code-block:: bash - - terraform apply - -Once the execution succeeded, a message similar to the one below is shown: - -.. code-block:: bash - - Apply complete! Resources: 22 added, 0 changed, 0 destroyed. - - Outputs: - - cratedb = - -Terraform internally tracks the state of each resource it manages, including -certain outputs with details on the created Cluster. 
As those details include -credentials, they are marked as sensitive and not shown in the output above. -To view the output, run: - -.. code-block:: bash - - terraform output cratedb - -The output variable ``cratedb_application_url`` points to the load balancer with -the port of CrateDB's Admin UI. Opening that URL in your browser should show a -password prompt on which you can authenticate using ``cratedb_username`` and -``cratedb_password``. - -Deprovisioning -============== - -If the CrateDB cluster is not needed anymore, you can easily instruct Terraform -to destroy all associated resources: - -.. code-block:: bash - - terraform destroy - -.. CAUTION:: - - Destroying the cluster will permanently delete all data stored on it. Use - :ref:`snapshots ` to create a backup on S3 if needed. - -.. _Terraform: https://www.terraform.io -.. _crate-terraform: https://github.com/crate/crate-terraform -.. _Terraform's installation guide: https://www.terraform.io/downloads.html -.. _git's installation guide: https://git-scm.com/downloads -.. _AWS provider: https://registry.terraform.io/providers/hashicorp/aws/latest/docs -.. _List of available AWS regions: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions -.. _How to view VPC properties: https://docs.aws.amazon.com/vpc/latest/userguide/work-with-default-vpc.html#view-default-vpc -.. _How to view subnet properties: https://docs.aws.amazon.com/vpc/latest/userguide/configure-subnets.html -.. _How to create EC2 key pairs: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/create-key-pairs.html -.. 
_community post: https://community.cratedb.com/t/deploying-cratedb-to-the-cloud-via-terraform/849 diff --git a/docs/install/cloud/aws/ec2-setup.md b/docs/install/cloud/aws/ec2-setup.md new file mode 100644 index 00000000..d9bb0feb --- /dev/null +++ b/docs/install/cloud/aws/ec2-setup.md @@ -0,0 +1,204 @@ +```{highlight} yaml +``` + +(ec2-setup)= + +# Running CrateDB on Amazon EC2 + +## Introduction + +When running CrateDB in a cloud environment such as [Amazon EC2] (Elastic +Compute Cloud) you usually face the problem that CrateDB's default discovery +mechanism does not work out of the box. + +Luckily, CrateDB has several built-in mechanisms for unicast host discovery, +also one for EC2. EC2 discovery uses the [EC2 API] to look up other EC2 hosts +that are then used as unicast hosts for node discovery (see +{ref}`Unicast Host Discovery `). + +:::{NOTE} +Note that this best practice only describes how to use the EC2 discovery and +its settings, and not how to set up a cluster on EC2 securely. +::: + +## Basic Configuration + +The most important step for EC2 discovery is that you have to launch your EC2 +instances within the same security group. The rules of that security group must +at least allow traffic on CrateDB's transport port (default `4300`). This +will allow CrateDB to accept and respond to pings from other CrateDB instances +with the same cluster name and form a cluster. + +Once you have your instances running and CrateDB installed, you can enable EC2 +discovery: + +| CrateDB Version | Reference | Example | +| --------------- | --------- | ------- | +| >=4.x | [latest] | ``` +discovery.seed_providers: ec2 ``` | +| \<=3.x | [3.3] | ``` +discovery.zen.hosts_provider: ec2 ``` | + +To be able to use the EC2 API, CrateDB must [sign the requests] by using +AWS credentials consisting of an access key and a secret key. Therefore +AWS provides [IAM roles] to avoid any distribution of your AWS credentials +to the instances. 
+ +CrateDB binds to the loopback interface by default. To get EC2 discovery +working, you need to update the {ref}`host ` +setting, in order to bind to and publish the site-local address: + +``` +network.host: _site_ +``` + +(ec2-authentication)= + +### Authentication + +For that, it is recommended to create a separate user that has only the +necessary permissions to describe instances. First, you need to create an IAM +role in order to assign the instances later on. This [AWS guide] gives you a +short description of how you can create a policy via the CLI or AWS management +console. An example policy file is attached below and should at least contain +these API permissions/actions: + +```json +{ + "Statement": [ + { + "Action": [ + "ec2:DescribeInstances" + ], + "Effect": "Allow", + "Resource": [ + "*" + ] + } + ], + "Version": "2012-10-17" +} +``` + +This policy allows the instances to find each other if they have been assigned +to this role on startup. + +:::{NOTE} +The same environment variables are used when performing `COPY FROM` and +`COPY TO`. This means that if you want to use these statements you'll have +to extend the permissions of that EC2 user. +::: + +You could also provide them as system properties or as settings in the +`crate.yml`, but the advantage of env variables is that also +`COPY FROM/TO` statements use the same environment variables. + +:::{NOTE} +Note that the env variables need to be provided for the user that runs the +CrateDB process, which is usually the user `crate` in production. +::: + +Now you are ready to start your CrateDB instances and they will discover each +other automatically. Use the AWS CLI or the AWS Console to run instances +and [assign them with an IAM role]. Note that all CrateDB instances of the same +region will join the cluster as long as their cluster name is equal and they are +able to communicate to each other over the transport port. 
+ +## Production Setup + +For a production setup, the best way to filter instances for discovery is via +a security group. This requires that you create a separate security group for +each cluster and allow TCP traffic on transport port `4300` (or other, if set +to a different port) only from within the group. + +```{image} /_assets/img/install/cloud/ec2-discovery-security-groups.png +:alt: Assign security group on instance launch +:width: 100% +``` + +Since the instances that belong to the same CrateDB cluster have the same +security group then, you can easily filter instances by that group. + +For example, when you launch your instances with the security group +`sg-crate-demo`, your CrateDB setting would be: + +``` +discovery.ec2.groups: sg-crate-demo +``` + +The combination with the unique cluster name makes the production setup very +simple yet secure. + +See also {ref}`crate-reference:discovery.ec2.groups`. + +## Optional Filters + +Sometimes, however, you will want to have a more flexible setup. In this case, +there are a few other configuration settings that can be adjusted. + +(filter-by-tags)= + +### Filter by Tags + +The EC2 discovery mechanism can additionally filter machines by instance tags. +Tags are key-value pairs that can be assigned to an instance as metadata when +it is launched. + +A good example usage of tags is to assign environment and usage type +information. + +Let's assume you have a pool of several instances tagged with `env` and +`type`, where `env` is either `dev` or `production` and `type` is +either `app` or `database`. + +```{image} /_assets/img/install/cloud/ec2-discovery-tags.png +:alt: Adding tags on instance launch +:width: 100% +``` + +Setting `discovery.ec2.tag.env` to `production` will filter machines with +the tag key `env` set to `production` excluding machines that have set the +same key set to `dev` (and vice versa). 
+ +To further more exclude "`app` instances" from discovery you can add the +setting `discovery.ec2.tag.type: database`. + +This way, any number of tags can be used for filtering, using the +`discovery.ec2.tag.` prefix for the setting name. + +Filtering by tags can help when you want to launch several CrateDB clusters +within the same security group, e.g: + +``` +discovery.ec2: + groups: sg-crate-demo + tag.env: production + tag.type: database +``` + +See also {ref}`crate-reference:discovery.ec2.tag.name`. + +### Filter by Availability Zones + +A third possible way to filter instances is via availability zones. Let's say +you have several clusters for the same tenant in different availability zones +(e.g. `us-west-1` and `us-west-2`), you can launch the instance with the +same security group (e.g. `sg-crate-demo`) and filter the instances used for +discovery by availability zone: + +``` +discovery.ec2: + groups: sg-crate-demo + availability_zones: us-west-1 +``` + +See also {ref}`crate-reference:discovery.ec2.availability_zones`. + +[3.3]: https://github.com/crate/crate/blob/3.3/blackbox/docs/config/cluster.rst#discovery +[amazon ec2]: https://aws.amazon.com/ec2/ +[assign them with an iam role]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/attach-iam-role.html +[aws guide]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html +[ec2 api]: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Welcome.html +[iam roles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html +[latest]: https://cratedb.com/docs/crate/reference/en/latest/config/cluster.html#discovery +[sign the requests]: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_sigv.html diff --git a/docs/install/cloud/aws/ec2-setup.rst b/docs/install/cloud/aws/ec2-setup.rst deleted file mode 100644 index 34b59a0d..00000000 --- a/docs/install/cloud/aws/ec2-setup.rst +++ /dev/null @@ -1,206 +0,0 @@ -.. highlight:: yaml -.. 
_ec2_setup: - -============================= -Running CrateDB on Amazon EC2 -============================= - -Introduction -============ - -When running CrateDB in a cloud environment such as `Amazon EC2`_ (Elastic -Compute Cloud) you usually face the problem that CrateDB's default discovery -mechanism does not work out of the box. - -Luckily, CrateDB has several built-in mechanisms for unicast host discovery, -also one for EC2. EC2 discovery uses the `EC2 API`_ to look up other EC2 hosts -that are then used as unicast hosts for node discovery (see -:ref:`Unicast Host Discovery `). - -.. NOTE:: - - Note that this best practice only describes how to use the EC2 discovery and - its settings, and not how to set up a cluster on EC2 securely. - -Basic Configuration -=================== - -The most important step for EC2 discovery is that you have to launch your EC2 -instances within the same security group. The rules of that security group must -at least allow traffic on CrateDB's transport port (default ``4300``). This -will allow CrateDB to accept and respond to pings from other CrateDB instances -with the same cluster name and form a cluster. - -Once you have your instances running and CrateDB installed, you can enable EC2 -discovery: - -+-----------------+-------------------+---------------------------------------+ -| CrateDB Version | Reference | Example | -+=================+===================+=======================================+ -| >=4.x | `latest`_ | :: | -| | | | -| | | discovery.seed_providers: ec2 | -+-----------------+-------------------+---------------------------------------+ -| <=3.x | `3.3`_ | :: | -| | | | -| | | discovery.zen.hosts_provider: ec2 | -+-----------------+-------------------+---------------------------------------+ - -To be able to use the EC2 API, CrateDB must `sign the requests`_ by using -AWS credentials consisting of an access key and a secret key. 
Therefore -AWS provides `IAM roles`_ to avoid any distribution of your AWS credentials -to the instances. - -CrateDB binds to the loopback interface by default. To get EC2 discovery -working, you need to update the :ref:`host ` -setting, in order to bind to and publish the site-local address:: - - network.host: _site_ - -.. _ec2_authentication: - -Authentication --------------- - -For that, it is recommended to create a separate user that has only the -necessary permissions to describe instances. First, you need to create an IAM -role in order to assign the instances later on. This `AWS guide`_ gives you a -short description of how you can create a policy via the CLI or AWS management -console. An example policy file is attached below and should at least contain -these API permissions/actions: - -.. code-block:: json - - { - "Statement": [ - { - "Action": [ - "ec2:DescribeInstances" - ], - "Effect": "Allow", - "Resource": [ - "*" - ] - } - ], - "Version": "2012-10-17" - } - -This policy allows the instances to find each other if they have been assigned -to this role on startup. - -.. NOTE:: - - The same environment variables are used when performing ``COPY FROM`` and - ``COPY TO``. This means that if you want to use these statements you'll have - to extend the permissions of that EC2 user. - -You could also provide them as system properties or as settings in the -``crate.yml``, but the advantage of env variables is that also -``COPY FROM/TO`` statements use the same environment variables. - -.. NOTE:: - - Note that the env variables need to be provided for the user that runs the - CrateDB process, which is usually the user ``crate`` in production. - -Now you are ready to start your CrateDB instances and they will discover each -other automatically. Use the AWS CLI or the AWS Console to run instances -and `assign them with an IAM role`_. 
Note that all CrateDB instances of the same -region will join the cluster as long as their cluster name is equal and they are -able to communicate to each other over the transport port. - -Production Setup -================ - -For a production setup, the best way to filter instances for discovery is via -a security group. This requires that you create a separate security group for -each cluster and allow TCP traffic on transport port ``4300`` (or other, if set -to a different port) only from within the group. - -.. image:: /_assets/img/install/cloud/ec2-discovery-security-groups.png - :alt: Assign security group on instance launch - :width: 100% - -Since the instances that belong to the same CrateDB cluster have the same -security group then, you can easily filter instances by that group. - -For example, when you launch your instances with the security group -``sg-crate-demo``, your CrateDB setting would be:: - - discovery.ec2.groups: sg-crate-demo - -The combination with the unique cluster name makes the production setup very -simple yet secure. - -See also :ref:`crate-reference:discovery.ec2.groups`. - -Optional Filters -================ - -Sometimes, however, you will want to have a more flexible setup. In this case, -there are a few other configuration settings that can be adjusted. - -.. _filter-by-tags: - -Filter by Tags --------------- - -The EC2 discovery mechanism can additionally filter machines by instance tags. -Tags are key-value pairs that can be assigned to an instance as metadata when -it is launched. - -A good example usage of tags is to assign environment and usage type -information. - -Let's assume you have a pool of several instances tagged with ``env`` and -``type``, where ``env`` is either ``dev`` or ``production`` and ``type`` is -either ``app`` or ``database``. - -.. 
image:: /_assets/img/install/cloud/ec2-discovery-tags.png - :alt: Adding tags on instance launch - :width: 100% - -Setting ``discovery.ec2.tag.env`` to ``production`` will filter machines with -the tag key ``env`` set to ``production`` excluding machines that have set the -same key set to ``dev`` (and vice versa). - -To further more exclude "``app`` instances" from discovery you can add the -setting ``discovery.ec2.tag.type: database``. - -This way, any number of tags can be used for filtering, using the -``discovery.ec2.tag.`` prefix for the setting name. - -Filtering by tags can help when you want to launch several CrateDB clusters -within the same security group, e.g:: - - discovery.ec2: - groups: sg-crate-demo - tag.env: production - tag.type: database - -See also :ref:`crate-reference:discovery.ec2.tag.name`. - -Filter by Availability Zones ----------------------------- - -A third possible way to filter instances is via availability zones. Let's say -you have several clusters for the same tenant in different availability zones -(e.g. ``us-west-1`` and ``us-west-2``), you can launch the instance with the -same security group (e.g. ``sg-crate-demo``) and filter the instances used for -discovery by availability zone:: - - discovery.ec2: - groups: sg-crate-demo - availability_zones: us-west-1 - -See also :ref:`crate-reference:discovery.ec2.availability_zones`. - -.. _3.3: https://github.com/crate/crate/blob/3.3/blackbox/docs/config/cluster.rst#discovery -.. _Amazon EC2: https://aws.amazon.com/ec2/ -.. _assign them with an IAM role: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/attach-iam-role.html -.. _AWS guide: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html -.. _EC2 API: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/Welcome.html -.. _IAM roles: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html -.. _latest: https://cratedb.com/docs/crate/reference/en/latest/config/cluster.html#discovery -.. 
_sign the requests: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_sigv.html diff --git a/docs/install/cloud/aws/index.rst b/docs/install/cloud/aws/index.md similarity index 51% rename from docs/install/cloud/aws/index.rst rename to docs/install/cloud/aws/index.md index b619cc48..e0627201 100644 --- a/docs/install/cloud/aws/index.rst +++ b/docs/install/cloud/aws/index.md @@ -1,6 +1,4 @@ -======================================== -Run CrateDB on Amazon Web Services (AWS) -======================================== +# Run CrateDB on Amazon Web Services (AWS) Amazon Web Services (AWS) offers a wide range of cloud services, allowing to easily run and scale applications such as CrateDB. @@ -8,11 +6,13 @@ easily run and scale applications such as CrateDB. This section explains particularities in setting up CrateDB on AWS, to make the best use of its capabilities. -.. rubric:: Table of contents +```{rubric} Table of contents +``` -.. toctree:: - :maxdepth: 1 +```{toctree} +:maxdepth: 1 - ec2-setup - aws-terraform-setup - s3-setup +ec2-setup +aws-terraform-setup +s3-setup +``` diff --git a/docs/install/cloud/aws/s3-setup.md b/docs/install/cloud/aws/s3-setup.md new file mode 100644 index 00000000..48d58a29 --- /dev/null +++ b/docs/install/cloud/aws/s3-setup.md @@ -0,0 +1,102 @@ +```{highlight} yaml +``` + +(s3-setup)= + +# Using Amazon S3 as a snapshot repository + +CrateDB supports using the [Amazon S3] (Amazon Simple Storage Service) as a +snapshot repository. For this, you need to register the AWS plugin with +CrateDB. + +## Basic configuration + +Support for *Snapshot* and *Restore* to the [Amazon S3] service is enabled by +default in CrateDB. If you need to explicitly turn it off, disable the cloud +setting in the `crate.yml` file: + +``` +cloud.enabled: false +``` + +To be able to use the S3 API, CrateDB must [sign the requests] by using AWS +credentials consisting of an access key and a secret key. 
To this end, AWS +provides [IAM roles], which avoid distributing your AWS credentials to the +instances. + +(s3-authentication)= + +### Authentication + +It is recommended to restrict CrateDB's permissions on S3 to the +required extent. First, an IAM role is required. This [AWS guide] gives a +short description of how to create a policy using the CLI or the AWS +management console. Further, access to the S3 bucket holding the snapshots needs to +be restricted. An example policy file granting anybody access to a bucket +called `snaps.example.com` is shown below: + +```json +{ + "Statement": [ + { + "Action": [ + "s3:ListBucket", + "s3:GetBucketLocation", + "s3:ListBucketMultipartUploads", + "s3:ListBucketVersions" + ], + "Effect": "Allow", + "Principal": "*", + "Resource": [ + "arn:aws:s3:::snaps.example.com" + ] + }, + { + "Action": [ + "s3:GetObject", + "s3:PutObject", + "s3:DeleteObject", + "s3:AbortMultipartUpload", + "s3:ListMultipartUploadParts" + ], + "Effect": "Allow", + "Principal": "*", + "Resource": [ + "arn:aws:s3:::snaps.example.com/*" + ] + } + ], + "Version": "2012-10-17" +} +``` + +Access permissions can be further restricted to a specific AWS Principal by +changing the `Statement.Principal` setting. Please refer to [AWS principals] +for more information. + +For further AWS policy examples and detailed information, please refer to +[AWS policy examples] and the links therein. + +Note that the bucket needs to exist before registering a +repository for snapshots within CrateDB. Alternatively, CrateDB can be allowed to create +the bucket. 
However this requires the following permissions to be contained +within the policy: + +```json +{ + "Action": [ + "s3:CreateBucket" + ], + "Effect": "Allow", + "Resource": [ + "arn:aws:s3:::snaps.example.com" + ] +} +``` + +[amazon s3]: https://aws.amazon.com/s3/ +[aws guide]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html +[aws policy examples]: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html +[aws principals]: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html +[iam roles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html +[sign the requests]: https://docs.aws.amazon.com/general/latest/gr/signing_aws_api_requests.html diff --git a/docs/install/cloud/aws/s3-setup.rst b/docs/install/cloud/aws/s3-setup.rst deleted file mode 100644 index ae0e4996..00000000 --- a/docs/install/cloud/aws/s3-setup.rst +++ /dev/null @@ -1,102 +0,0 @@ -.. highlight:: yaml -.. _s3_setup: - -======================================== -Using Amazon S3 as a snapshot repository -======================================== - -CrateDB supports using the `Amazon S3`_ (Amazon Simple Storage Service) as a -snapshot repository. For this, you need to register the AWS plugin with -CrateDB. - -Basic configuration -=================== - -Support for *Snapshot* and *Restore* to the `Amazon S3`_ service is enabled by -default in CrateDB. If you need to explicitly turn it off, disable the cloud -setting in the ``crate.yml`` file:: - - cloud.enabled: false - -To be able to use the S3 API, CrateDB must `sign the requests`_ by using AWS -credentials consisting of an access key and a secret key. Therefore AWS -provides `IAM roles`_ to avoid any distribution of your AWS credentials to the -instances. - -.. _s3_authentication: - -Authentication --------------- - -It is recommended to restrict the permissions of CrateDB on the S3 to only the -required extend. First, an IAM role is required. 
This `AWS guide`_ gives a -short description of how to create a policy offer using the CLI or the AWS -management console. Further, access of the snapshot to the S3 bucket needs to -be restricted. An example policy file granting anybody access to a bucket -called ``snaps.example.com`` is attached below: - -.. code-block:: json - - { - "Statement": [ - { - "Action": [ - "s3:ListBucket", - "s3:GetBucketLocation", - "s3:ListBucketMultipartUploads", - "s3:ListBucketVersions" - ], - "Effect": "Allow", - "Principal": "*", - "Resource": [ - "arn:aws:s3:::snaps.example.com" - ] - }, - { - "Action": [ - "s3:GetObject", - "s3:PutObject", - "s3:DeleteObject", - "s3:AbortMultipartUpload", - "s3:ListMultipartUploadParts" - ], - "Effect": "Allow", - "Principal": "*", - "Resource": [ - "arn:aws:s3:::snaps.example.com/*" - ] - } - ], - "Version": "2012-10-17" - } - -Access permissions can be further restricted to a specific AWS Principal by -changing the ``Statement.Principal`` setting. Please refer to `AWS principals`_ -for more information. - -For further AWS policy examples and detailed information, please refer to -`AWS policy examples`_ and the links therein. - -It has to be noted, that the bucket needs to exist before registering a -repository for snapshots within CrateDB. CrateDB can also be allowed to create -the bucket. However this requires the following permissions to be contained -within the policy: - -.. code-block:: json - - { - "Action": [ - "s3:CreateBucket" - ], - "Effect": "Allow", - "Resource": [ - "arn:aws:s3:::snaps.example.com" - ] - } - -.. _`Amazon S3`: https://aws.amazon.com/s3/ -.. _`sign the requests`: https://docs.aws.amazon.com/general/latest/gr/signing_aws_api_requests.html -.. _`IAM roles`: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html -.. _`AWS guide`: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html -.. 
_`AWS principals`: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html -.. _`AWS policy examples`: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html diff --git a/docs/install/cloud/azure/index.rst b/docs/install/cloud/azure/index.md similarity index 53% rename from docs/install/cloud/azure/index.rst rename to docs/install/cloud/azure/index.md index fd16f3b5..4f42dc79 100644 --- a/docs/install/cloud/azure/index.rst +++ b/docs/install/cloud/azure/index.md @@ -1,18 +1,17 @@ -.. _azure: +(azure)= -============================== -Run CrateDB on Microsoft Azure -============================== +# Run CrateDB on Microsoft Azure Microsoft Azure is the second largest and fastest growing provider of Cloud Services in the world. It offers a wide variety of options including Windows servers, containers, application images and much more. -.. rubric:: Table of contents +```{rubric} Table of contents +``` -.. toctree:: - :maxdepth: 1 - - vm - terraform +```{toctree} +:maxdepth: 1 +vm +terraform +``` diff --git a/docs/install/cloud/azure/terraform.rst b/docs/install/cloud/azure/terraform.md similarity index 52% rename from docs/install/cloud/azure/terraform.rst rename to docs/install/cloud/azure/terraform.md index c8746f3d..4994992c 100644 --- a/docs/install/cloud/azure/terraform.rst +++ b/docs/install/cloud/azure/terraform.md @@ -1,14 +1,12 @@ -.. _azure_terraform_setup: +(azure-terraform-setup)= -============================= -Running CrateDB via Terraform -============================= +# Running CrateDB via Terraform -In :ref:`azure_vm_setup`, we elaborated on how to leverage Azure's functionality to +In {ref}`azure_vm_setup`, we elaborated on how to leverage Azure's functionality to set up a CrateDB cluster. Here, we will explore how to automate this kind of setup. 
-`Terraform`_ is an infrastructure as code tool, often used as an abstraction +[Terraform] is an infrastructure as code tool, often used as an abstraction layer on top of a cloud's management APIs. Instead of creating cloud resources manually, the target state is specified via configuration files which can also be managed in a version control system. This brings some advantages, such as but @@ -20,95 +18,94 @@ not limited to: so on - Better prediction and tracing of infrastructure changes -The `crate-terraform`_ repository provides a predefined configuration template +The [crate-terraform] repository provides a predefined configuration template of various Azure resources to form a CrateDB cluster on Azure (such as VMs, load balancer, etc). This eliminates the need to manually compose all required resources and their interactions. -.. SEEALSO:: +:::{SEEALSO} +Engage with us in the [community post] on Terraform deployments for any +questions or feedback! +::: - Engage with us in the `community post`_ on Terraform deployments for any - questions or feedback! +:::{CAUTION} +The provided configuration is meant to be used for development or testing +purposes and does not aim to fulfil all needs of a production environment. +::: -.. CAUTION:: - - The provided configuration is meant to be used for development or testing - purposes and does not aim to fulfil all needs of a production environment. - -Prerequisites -============= +## Prerequisites Before creating the configuration to launch your CrateDB cluster, the following prerequisites should be fulfilled: 1. The Terraform CLI is installed as per - `Terraform's installation guide`_ -2. The git CLI is installed as per `git's installation guide`_ + [Terraform's installation guide] +2. The git CLI is installed as per [git's installation guide] 3. Azure credentials are configured for Terraform. If you already have a configured Azure CLI setup, Terraform will reuse this configuration. 
If not, - see the `Azure provider`_ documentation on authentication. + see the [Azure provider] documentation on authentication. -Deployment configuration -======================== +## Deployment configuration The CrateDB Terraform configuration consists of a set of variables to customize -your deployment. Create a new file ``main.tf`` with the following content and +your deployment. Create a new file `main.tf` with the following content and adjust variable values as needed: -.. code-block:: - - module "cratedb-cluster" { - source = "github.com/crate/crate-terraform.git/azure" - - # The Azure subscription ID - subscription_id = "x-y-z" +``` +module "cratedb-cluster" { + source = "github.com/crate/crate-terraform.git/azure" - # Global configuration items for naming/tagging resources - config = { - project_name = "example-project" - environment = "test" - owner = "Crate.IO" - team = "Customer Engineering" + # The Azure subscription ID + subscription_id = "x-y-z" - # Run "az account list-locations" for a full list - location = "westeurope" - } + # Global configuration items for naming/tagging resources + config = { + project_name = "example-project" + environment = "test" + owner = "Crate.IO" + team = "Customer Engineering" - # CrateDB-specific configuration - crate = { - # Java Heap size in GB available to CrateDB - heap_size_gb = 2 - - cluster_name = "crate-cluster" + # Run "az account list-locations" for a full list + location = "westeurope" + } - # The number of nodes the cluster will consist of - cluster_size = 2 + # CrateDB-specific configuration + crate = { + # Java Heap size in GB available to CrateDB + heap_size_gb = 2 - # Enables a self-signed SSL certificate - ssl_enable = true - } + cluster_name = "crate-cluster" - # Azure VM specific configuration - vm = { - # The size of the disk storing CrateDB's data directory - disk_size_gb = 512 - storage_account_type = "Premium_LRS" - size = "Standard_DS12_v2" + # The number of nodes the cluster will consist of + 
cluster_size = 2 - # Enabling SSH access - ssh_access = true - # Username to connect via SSH to the nodes - user = "cratedb-vmadmin" - } + # Enables a self-signed SSL certificate + ssl_enable = true } - output "cratedb" { - value = module.cratedb-cluster - sensitive = true + # Azure VM specific configuration + vm = { + # The size of the disk storing CrateDB's data directory + disk_size_gb = 512 + storage_account_type = "Premium_LRS" + size = "Standard_DS12_v2" + + # Enabling SSH access + ssh_access = true + # Username to connect via SSH to the nodes + user = "cratedb-vmadmin" } +} + +output "cratedb" { + value = module.cratedb-cluster + sensitive = true +} +``` The Azure-specific variables need to be adjusted according to your environment: +```{eval-rst} +--------------------------+--------------------------------------------------------------+----------------------------------+ | Variable | Explanation | How to obtain | +==========================+==============================================================+==================================+ @@ -123,68 +120,67 @@ The Azure-specific variables need to be adjusted according to your environment: +--------------------------+--------------------------------------------------------------+----------------------------------+ | ``size`` | Specifies the size of the VM | ``az vm list-sizes`` | +--------------------------+--------------------------------------------------------------+----------------------------------+ +``` -Execution -========= +## Execution Once all variables are configured properly, Terraform needs to be initialized: -.. code-block:: bash - - terraform init +```bash +terraform init +``` To proceed with executing the creation of resources, apply the configuration. There will be a final confirmation prompt before any changes are applied to your Azure account: -.. 
code-block:: bash - - terraform apply +```bash +terraform apply +``` Once the execution succeeded, a message similar to the one below is shown: -.. code-block:: bash - - Apply complete! Resources: 22 added, 0 changed, 0 destroyed. +```bash +Apply complete! Resources: 22 added, 0 changed, 0 destroyed. - Outputs: +Outputs: - cratedb = +cratedb = +``` Terraform internally tracks the state of each resource it manages, including certain outputs with details on the created Cluster. As those details include credentials, they are marked as sensitive and not shown in the output above. To view the output, run: -.. code-block:: bash +```bash +terraform output cratedb +``` - terraform output cratedb - -The output variable ``cratedb_application_url`` points to the load balancer with +The output variable `cratedb_application_url` points to the load balancer with the port of CrateDB's Admin UI. Opening that URL in your browser should show a -password prompt on which you can authenticate using ``cratedb_username`` and -``cratedb_password``. +password prompt on which you can authenticate using `cratedb_username` and +`cratedb_password`. -Deprovisioning -============== +## Deprovisioning If the CrateDB cluster is not needed anymore, you can easily instruct Terraform to destroy all associated resources: -.. code-block:: bash - - terraform destroy - -.. CAUTION:: - - Destroying the cluster will permanently delete all data stored on it. Use - :ref:`snapshots ` to create a backup on Azure Blob storage - if needed. - -.. _Terraform: https://www.terraform.io -.. _crate-terraform: https://github.com/crate/crate-terraform -.. _Terraform's installation guide: https://www.terraform.io/downloads.html -.. _git's installation guide: https://git-scm.com/downloads -.. _Azure provider: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs -.. 
_List of Storage Account Types: https://docs.microsoft.com/en-us/azure/templates/microsoft.compute/virtualmachines?tabs=bicep#manageddiskparameters -.. _community post: https://community.cratedb.com/t/deploying-cratedb-to-the-cloud-via-terraform/849 +```bash +terraform destroy +``` + +:::{CAUTION} +Destroying the cluster will permanently delete all data stored on it. Use +{ref}`snapshots ` to create a backup on Azure Blob storage +if needed. +::: + +[azure provider]: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs +[community post]: https://community.cratedb.com/t/deploying-cratedb-to-the-cloud-via-terraform/849 +[crate-terraform]: https://github.com/crate/crate-terraform +[git's installation guide]: https://git-scm.com/downloads +[list of storage account types]: https://docs.microsoft.com/en-us/azure/templates/microsoft.compute/virtualmachines?tabs=bicep#manageddiskparameters +[terraform]: https://www.terraform.io +[terraform's installation guide]: https://www.terraform.io/downloads.html diff --git a/docs/install/cloud/azure/vm.md b/docs/install/cloud/azure/vm.md new file mode 100644 index 00000000..145a6215 --- /dev/null +++ b/docs/install/cloud/azure/vm.md @@ -0,0 +1,158 @@ +(azure-vm-setup)= + +# Running CrateDB on Azure VMs + +Getting CrateDB working on Azure with Linux or Windows is a simple process. You +can use Azure's management console or CLI interface ([Learn how to install +here][learn how to install here]). + +## Azure and Linux + +### Create a resource group + +Azure uses 'Resource Groups' to group together related services and resources +for easier management. + +Create a resource group for the CrateDB cluster by selecting *Resource groups* +under the *new* left hand panel of the Azure portal. 
+ +```{image} /_assets/img/install/cloud/azure-new-resource-group.png +:alt: Create Resource Group +``` + +### Create a network security group + +CrateDB uses two ports, one for inter-node communication (`4300`) and one for +its HTTP endpoint (`4200`), so access to these needs to be opened. + +Create a *New Security Group*, giving it a name and assigning it to the +'Resource Group' just created. + +```{image} /_assets/img/install/cloud/azure-new-nsg.png +:alt: Create New Security Group +``` + +Find that security group in your resources list and open its settings, +navigating to the *Inbound security rules* section. + +```{image} /_assets/img/install/cloud/azure-nsg-inbound.png +:alt: Inbound security rules +``` + +Add a rule for each port: + +```{image} /_assets/img/install/cloud/azure-inbound-rules.png +:alt: Add inbound security rules +``` + +### Create a virtual network + +To create a cluster of CrateDB nodes on some cloud hosting providers, CrateDB +relies on unicast for inter-node communication. + +The easiest way to get unicast communication working with Azure is to create a +Virtual Network (*+ -> Networking -> Virtual Network*) so that all the cluster +nodes exist on the same IP range. Give the network a name and a region, and let +Azure handle all the remaining settings by clicking the next arrow on each +screen. + +```{image} /_assets/img/install/cloud/azure-create-vn.png +:alt: Create Virtual Network +``` + +Once the Virtual Network has been created, find it in your resources list, open +the edit screen and the *Subnets* setting. Add the security group created +earlier to the subnet. + +```{image} /_assets/img/install/cloud/azure-vn-subnet-sg.png +:alt: Add Security Group +``` + +### Create virtual machines + +Next, create virtual machines to act as your CrateDB nodes. In this tutorial, we +chose two low-specification Ubuntu 14.04 servers, but you likely have your own +preferred configurations. 
+ +Most importantly, make sure you select the Virtual Network created earlier. + +### Install CrateDB + +*Note that these instructions should be followed on each VM in your cluster.* + +To install CrateDB, SSH into your VMs and follow {ref}`the standard process for +Linux installation `. This automatically starts an instance of CrateDB, +which needs to be restarted after the next step. + +### Configure CrateDB + +*Note that these instructions should be followed on each VM in your cluster.* + +To set the unicast hosts for the CrateDB cluster, we change the default +configuration file at */etc/crate/crate.yml*. + +Uncomment / add these lines: + +| CrateDB Version | Reference | Configuration Example | +| --------------- | --------- | --------------------- | +| >=4.x | [latest] | ```yaml +discovery.seed_hosts: - node1.example.com:4300 - node2.example.com:4300 - 10.0.1.102:4300 - 10.0.1.103:4300 ``` | +| \<=3.x | [3.3] | ```yaml +discovery.zen.ping.unicast.hosts: - node1.example.com:4300 - node2.example.com:4300 - 10.0.1.102:4300 - 10.0.1.103:4300 ``` | + +Note: You might want to try {ref}`DNS-based discovery +` for inter-node communication. + +Uncomment and set the cluster name: + +```yaml +cluster.name: crate +``` + +Restart CrateDB by running `service crate restart`. + +## Azure and Windows + +### Initial setup + +To create a resource group, network security group, and virtual network, follow +the same steps as for Azure and Linux. + +### Create virtual machines + +Follow the same steps as for creating virtual machines for Azure and Linux, but base the +VM on the 'Windows Server 2012 R2 Datacenter' image. + +### Install CrateDB + +*Note that these instructions should be followed on each VM in your cluster.* + +To install CrateDB on Windows Server, you will need a [Java JDK installed]. +Ensure that the `JAVA_HOME` environment variable is set. 
+ +```{image} /_assets/img/install/cloud/azure-envvar.png +:alt: Environment Variables +``` + +Next, {ref}`download the CrateDB Tarball `, expand it, and move +it to a convenient location. + +### Configure CrateDB and Windows + +*Note that these instructions need to be followed on each VM in your cluster.* + +Edit the *config/crate.yml* configuration file in the expanded directory to +make the same changes as described above for Azure and Linux. + +We need to allow the ports CrateDB uses through the Windows Firewall: + +```{image} /_assets/img/install/cloud/azure-port.gif +:alt: Firewall configuration +``` + +Start CrateDB by running `bin/crate`. + +[3.3]: https://github.com/crate/crate/blob/3.3/blackbox/docs/config/cluster.rst#discovery +[java jdk installed]: https://www.oracle.com/java/technologies/downloads/#java8 +[latest]: https://cratedb.com/docs/crate/reference/en/latest/config/cluster.html#discovery +[learn how to install here]: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli diff --git a/docs/install/cloud/azure/vm.rst b/docs/install/cloud/azure/vm.rst deleted file mode 100644 index d62a0742..00000000 --- a/docs/install/cloud/azure/vm.rst +++ /dev/null @@ -1,180 +0,0 @@ -.. _azure_vm_setup: - -============================ -Running CrateDB on Azure VMs -============================ - -Getting CrateDB working on Azure with Linux or Windows is a simple process. You -can use Azure's management console or CLI interface (`Learn how to install -here`_). - -Azure and Linux -=============== - -Create a resource group ------------------------ - -Azure uses 'Resource Groups' to group together related services and resources -for easier management. - -Create a resource group for the CrateDB cluster by selecting *Resource groups* -under the *new* left hand panel of the Azure portal. - -.. 
image:: /_assets/img/install/cloud/azure-new-resource-group.png - :alt: Create Virtual Network - -Create a network security group -------------------------------- - -CrateDB uses two ports, one for inter-node communication (``4300``) and one for -it's http endpoint (``4200``), so access to these needs to be opened. - -Create a *New Security Group*, giving it a name and assigning it to the -'Resource Group' just created. - -.. image:: /_assets/img/install/cloud/azure-new-nsg.png - :alt: Create New Security Group - -Find that security group in your resources list and open it's settings, -navigating to the *Inbound security rules* section. - -.. image:: /_assets/img/install/cloud/azure-nsg-inbound.png - :alt: Create New Security Group - -Add a rule for each port: - -.. image:: /_assets/img/install/cloud/azure-inbound-rules.png - :alt: Create New Security Group - -Create a virtual network ------------------------- - -To create a cluster of CrateDB nodes on some cloud hosting providers, CrateDB -relies on unicast for inter-node communication. - -The easiest way to get Unicast communication working with Azure is to create a -Virtual Network (*+ -> Networking -> Virtual Network*) so that all the cluster -nodes exist on the same IP range. Give the network a name, a region and let -Azure handle all the remaining settings by clicking the next arrow on each -screen. - -.. image:: /_assets/img/install/cloud/azure-create-vn.png - :alt: Create Virtual Network - -Once the Virtual Network has been created, find it in your resources list, open -the edit screen and the *Subnets* setting. Add the security group created -earlier to the subnet. - -.. image:: /_assets/img/install/cloud/azure-vn-subnet-sg.png - :alt: Add Security Group - -Create virtual machines ------------------------ - -Next create virtual machines to act as your CrateDB nodes. In this tutorial, I -chose two low-specification Ubuntu 14.04 servers, but you likely have your own -preferred configurations. 
- -Most importantly, make sure you select the Virtual Network created earlier. - -Install CrateDB ---------------- - -*Note that these instructions should be followed on each VM in your cluster.* - -To Install CrateDB, ssh into your VMs and follow :ref:`the standard process for -Linux installation `, this will automatically start an instance of CrateDB, -which we will need to restart after the next step. - - -Configure CrateDB ------------------ - -*Note that these instructions should be followed on each VM in your cluster.* - -To set the Unicast hosts for the CrateDB cluster we change the default -configuration file at */etc/crate/crate.yml*. - -Uncomment / add these lines: - -+-----------------+-----------+---------------------------------------+ -| CrateDB Version | Reference | Configuration Example | -+=================+===========+=======================================+ -| <=4.x | `latest`_ | .. code-block:: yaml | -| | | | -| | | discovery.seed_hosts: | -| | | - node1.example.com:4300 | -| | | - node2.example.com:4300 | -| | | - 10.0.1.102:4300 | -| | | - 10.0.1.103:4300 | -+-----------------+-----------+---------------------------------------+ -| <=3.x | `3.3`_ | .. code-block:: yaml | -| | | | -| | | discovery.zen.ping.unicast.hosts: | -| | | - node1.example.com:4300 | -| | | - node2.example.com:4300 | -| | | - 10.0.1.102:4300 | -| | | - 10.0.1.103:4300 | -+-----------------+-----------+---------------------------------------+ - -Note: You might want to try :ref:`DNS based discovery -` for inter-node communication. - -Uncomment and set the cluster name - -.. code-block:: yaml - - cluster.name: crate - -Restart CrateDB ``service crate restart``. - -Azure and Windows -================= - -Initial setup -------------- - -To create a Resource Group, Network security group and virtual network, follow -the same steps as for Azure and Linux. 
- -Create virtual machines ------------------------ - -Similar steps to creating Virtual Machines for Azure and Linux, but create the -VM based on the 'Windows Server 2012 R2 Datacenter' image. - -Install CrateDB ---------------- - -*Note that these instructions should be followed on each VM in your cluster.* - -To install CrateDB on Windows Server, you will need a `Java JDK installed`_. -Ensure that the ``JAVA*HOME`` environment variable is set. - -.. image:: /_assets/img/install/cloud/azure-envvar.png - :alt: Environment Variables - -Next :ref:`download the CrateDB Tarball `, expand it, and move -it to a convenient location. - - -Configure CrateDB and Windows ------------------------------ - -*Note that these instructions need to be followed on each VM in your cluster.* - -Edit the *config/crate.yml* configuration file in the expanded directory to -make the same changes noted above in running CrateDB on Azure & Linux. - -We need to allow the ports CrateDB uses through the Windows Firewall - -.. image:: /_assets/img/install/cloud/azure-port.gif - :alt: Firewall configuration - -Start crate by running ``bin/crate``. - - -.. _3.3: https://github.com/crate/crate/blob/3.3/blackbox/docs/config/cluster.rst#discovery -.. _Java JDK installed: https://www.oracle.com/java/technologies/downloads/#java8 -.. _latest: https://cratedb.com/docs/crate/reference/en/latest/config/cluster.html#discovery -.. _Learn how to install here: https://docs.microsoft.com/en-us/cli/azure/install-azure-cli diff --git a/docs/install/cloud/index.md b/docs/install/cloud/index.md new file mode 100644 index 00000000..721a6d47 --- /dev/null +++ b/docs/install/cloud/index.md @@ -0,0 +1,17 @@ +(install-cloud)= + +# Cloud Hosting + +CrateDB provides packages and executables that will work on any operating +system capable of running Java. 
+ +```{rubric} Table of contents +``` + +```{toctree} +:maxdepth: 3 +:titlesonly: true + +Amazon AWS +Microsoft Azure +``` diff --git a/docs/install/cloud/index.rst b/docs/install/cloud/index.rst deleted file mode 100644 index 6a1e9c12..00000000 --- a/docs/install/cloud/index.rst +++ /dev/null @@ -1,18 +0,0 @@ -.. _install-cloud: - -############# -Cloud Hosting -############# - -CrateDB provides packages and executables that will work on any operating -system capable of running Java. - -.. rubric:: Table of contents - -.. toctree:: - :maxdepth: 3 - :titlesonly: - - Amazon AWS - Microsoft Azure - diff --git a/docs/install/configure.md b/docs/install/configure.md new file mode 100644 index 00000000..cb1580d0 --- /dev/null +++ b/docs/install/configure.md @@ -0,0 +1,58 @@ +(install-configure)= + +# Configuration Settings + +In order to configure CrateDB, please take note of the configuration file +locations and the available environment variables. + +## Configuration Files + +When using the package-based setup flavor for {ref}`install-deb` or +{ref}`install-rpm`, the main CrateDB configuration files are located within the +`/etc/crate` directory. + +When using the {ref}`install-tarball` setup, or the {ref}`Microsoft Windows ` +setup, the configuration files are located within the `config/` directory relative to the +working directory. + +## Environment Variables + +For the vanilla package-based setup flavor, the CrateDB startup script reads +{ref}`crate-reference:conf-env` from the `/etc/default/crate` file as +environment variables. + +:::{Note} +RPM packages of CrateDB versions up to [5.2.11], [5.3.8], [5.4.7] +and [5.5.2] are using the `/etc/sysconfig/crate` file instead. +::: + +When using the {ref}`install-tarball` setup, or the {ref}`Microsoft Windows ` +setup, the environment variables will be defined by `bin/crate{.sh,.bat}` relative to the +working directory. + +Here is an example: + +``` +# Configure heap size (defaults to 256m min, 1g max). 
+CRATE_HEAP_SIZE=2g + +# Maximum number of open files, defaults to 65535. +# MAX_OPEN_FILES=65535 + +# Maximum locked memory size. Set to "unlimited" if you use the +# bootstrap.mlockall option in crate.yml. You must also set +# CRATE_HEAP_SIZE. +MAX_LOCKED_MEMORY=unlimited + +# Provide additional Java OPTS. +# CRATE_JAVA_OPTS= + +# Force the JVM to use IPv4 only. +CRATE_USE_IPV4=true +``` + +[5.2.11]: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.2.11.html +[5.3.8]: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.3.8.html +[5.4.7]: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.4.7.html +[5.5.2]: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.5.2.html +[sources]: https://en.wikipedia.org/wiki/Source_(command) diff --git a/docs/install/configure.rst b/docs/install/configure.rst deleted file mode 100644 index eefc9151..00000000 --- a/docs/install/configure.rst +++ /dev/null @@ -1,63 +0,0 @@ -.. _install-configure: - -###################### -Configuration Settings -###################### - -In order to configure CrateDB, please take note of the configuration file -locations and the available environment variables. - - -Configuration Files -=================== - -When using the package-based setup flavor for :ref:`install-deb` or -:ref:`install-rpm`, the main CrateDB configuration files are located within the -``/etc/crate`` directory. - -When using the :ref:`install-tarball` setup, or the :ref:`Microsoft Windows ` -setup, the configuration files are located within the ``config/`` directory relative to the -working directory. - -Environment Variables -===================== - -For the vanilla package-based setup flavor, the CrateDB startup script reads -:ref:`crate-reference:conf-env` from the ``/etc/default/crate`` file as -environment variables. - -.. 
Note:: - - RPM packages of CrateDB versions up to `5.2.11`_, `5.3.8`_, `5.4.7`_ - and `5.5.2`_ are using the ``/etc/sysconfig/crate`` file instead. - -When using the :ref:`install-tarball` setup, or the :ref:`Microsoft Windows ` -setup, the environment variables will be defined by ``bin/crate{.sh,.bat}`` relative to the -working directory. - -Here is an example:: - - # Configure heap size (defaults to 256m min, 1g max). - CRATE_HEAP_SIZE=2g - - # Maximum number of open files, defaults to 65535. - # MAX_OPEN_FILES=65535 - - # Maximum locked memory size. Set to "unlimited" if you use the - # bootstrap.mlockall option in crate.yml. You must also set - # CRATE_HEAP_SIZE. - MAX_LOCKED_MEMORY=unlimited - - # Provide additional Java OPTS. - # CRATE_JAVA_OPTS= - - # Force the JVM to use IPv4 only. - CRATE_USE_IPV4=true - - - -.. _5.2.11: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.2.11.html -.. _5.3.8: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.3.8.html -.. _5.4.7: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.4.7.html -.. _5.5.2: https://cratedb.com/docs/crate/reference/en/latest/appendices/release-notes/5.5.2.html -.. _sources: https://en.wikipedia.org/wiki/Source_(command) diff --git a/docs/install/container/docker.md b/docs/install/container/docker.md new file mode 100644 index 00000000..20782799 --- /dev/null +++ b/docs/install/container/docker.md @@ -0,0 +1,495 @@ +```{highlight} sh +``` + +(cratedb-docker)= + +# Run CrateDB on Docker + +CrateDB and [Docker] are a great match thanks to CrateDB’s [horizontally +scalable][horizontally scalable] [shared-nothing architecture] that lends itself well to +[containerization]. + +This document covers the essentials of running CrateDB on Docker. + +:::{NOTE} +If you are just getting started with CrateDB and Docker, check out the +introductory guides for {ref}`spinning up your first container-based +CrateDB instance `. 
+::: + +:::{SEEALSO} +A guide for running CrateDB on {ref}`Kubernetes `. + +The official [CrateDB Docker image]. +::: + +## Quick start + +### Creating a cluster + +To get started with CrateDB and Docker, you will create a three-node cluster +on your dev machine. The cluster will run on a dedicated network and will +require the first two nodes, `crate01` and `crate02`, to vote which one +is the master. The third node, `crate03`, will simply join the cluster +with no vote. + +To create the [user-defined network], run the command: + +``` +sh$ docker network create crate +``` + +You should then be able to see something like this: + +```text +sh$ docker network ls +NETWORK ID NAME DRIVER SCOPE +1bf1b7acd66f bridge bridge local +51cebbdf7d2b crate bridge local +5b8e6fbe9ab6 host host local +8baa149b6986 none null local +``` + +Any CrateDB container put into the `crate` network will be able to resolve +other CrateDB containers by name. Each container will run a single node, which +is identified by its node name. In this guide, container `crate01` will run +node `crate01`, container `crate02` will run node `crate02`, and +container `crate03` will run cluster node `crate03`. + +You can then create your first CrateDB container and node, like this: + +``` +sh$ docker run --rm -d \ + --name=crate01 \ + --net=crate \ + -p 4201:4200 \ + --env CRATE_HEAP_SIZE=1g \ + crate -Cnetwork.host=_site_ \ + -Cnode.name=crate01 \ + -Cdiscovery.seed_hosts=crate02,crate03 \ + -Ccluster.initial_master_nodes=crate01,crate02 \ + -Cgateway.expected_data_nodes=3 \ + -Cgateway.recover_after_data_nodes=3 +``` + +Breaking the command down: + +- Creates and runs a container called `crate01` (--name) in detached + mode (-d). The container will automatically be removed on exit (--rm), + and all its internal data will be lost. 
If you would like to avoid this, + you can mount a dedicated volume (-v) for the container (each container + would need its own dedicated folder on your dev machine, see + {ref}`docker-compose` as reference). +- Puts the container into the `crate` network and maps port `4201` on your + host machine to port `4200` on the container (admin UI). +- Defines the environment variable {ref}`CRATE_HEAP_SIZE `, + which is used by CrateDB to allocate 1 GB for its heap memory. +- Runs the command `crate` inside the container with parameters: + - `network.host`: The `_site_` value binds the CrateDB process to a + site-local IP address. + - `node.name`: Defines the node's name as `crate01` (used in the + master election). + - `discovery.seed_hosts`: This parameter lists the other hosts in the + cluster. The format is a comma-separated list of `host:port` entries, + where the port defaults to the `transport.tcp.port` setting. Each node must + contain the name of all the other hosts in this list. Notice also that + the nodes in the cluster can be started in any order. Until all of them + are up, you will see connection exceptions in the log files. However, all + nodes will eventually be running and interconnected. + - `cluster.initial_master_nodes`: Defines the list of master-eligible + node names which will participate in the vote of the first master + (first bootstrap). If this parameter is not defined, then it is expected + that the node will join an already formed cluster. This parameter is only + relevant for the first election. + - `gateway.expected_data_nodes` and `gateway.recover_after_data_nodes`: + Specify how many nodes you expect in the cluster and how many nodes must + be discovered before the cluster state is recovered. + +:::{NOTE} +If this command aborts with an error, consult the +{ref}`docker-troubleshooting` section for help. 
+::: + +Verify that the node is running with `docker ps` and you should see something like this: + +```text +sh$ docker ps +CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES +f79116373877 crate "/docker-entrypoin..." 16 seconds ago Up 15 seconds 4300/tcp, 5432-5532/tcp, 0.0.0.0:4201->4200/tcp crate01 +``` + +You can have a look at the container's logs in tail mode like this: + +```text +sh$ docker logs -f crate01 +``` + +:::{NOTE} +To exit the logs view, press ctrl+C. +::: + +You can visit the admin UI in your browser with this URL: + +```text +http://localhost:4201/ +``` + +Select the *Cluster* icon from the left-hand navigation, and you should see a +page that lists a single node. + +Now add the second node, `crate02`, to the cluster: + +``` +sh$ docker run --rm -d \ + --name=crate02 \ + --net=crate \ + -p 4202:4200 \ + --env CRATE_HEAP_SIZE=1g \ + crate -Cnetwork.host=_site_ \ + -Cnode.name=crate02 \ + -Cdiscovery.seed_hosts=crate01,crate03 \ + -Ccluster.initial_master_nodes=crate01,crate02 \ + -Cgateway.expected_data_nodes=3 \ + -Cgateway.recover_after_data_nodes=2 +``` + +Notice here that: + +- You updated the container and node name to `crate02`. +- You updated the port mapping, so that port `4202` on your host is mapped + to `4200` on the container. +- You set the parameter `discovery.seed_hosts` to contain the other hosts of + the cluster. +- `cluster.initial_master_nodes`: Since only nodes `crate01` and `crate02` + will participate in the election of the first master, this setting is unchanged. + +Now, if you go back to the admin UI you opened earlier, or visit the admin UI +of the node you just created (located at `http://localhost:4202/`) you +should see two nodes. 
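If you prefer the command line over the admin UI, the node list can also be checked through CrateDB's HTTP endpoint. The following is a minimal sketch, assuming the port mappings used in this guide (`4201` for `crate01`, `4202` for `crate02`); the helper name `list_nodes` and the variable `nodes_query` are hypothetical and not part of CrateDB or Docker.

```shell
# JSON payload for CrateDB's HTTP endpoint: list the names of all cluster nodes.
nodes_query='{"stmt": "SELECT name FROM sys.nodes ORDER BY name"}'

# Hypothetical helper: send the query to the node published on the given
# host port (defaults to 4201, the mapping used for crate01 in this guide).
list_nodes() {
  curl -sS -H 'Content-Type: application/json' \
    -X POST "http://localhost:${1:-4201}/_sql" \
    -d "$nodes_query"
}
```

Once both containers are up, calling `list_nodes 4201` or `list_nodes 4202` should return a JSON document whose rows contain both node names.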
+ +You can now add `crate03` like this: + +``` +sh$ docker run --rm -d \ + --name=crate03 \ + --net=crate -p 4203:4200 \ + --env CRATE_HEAP_SIZE=1g \ + crate -Cnetwork.host=_site_ \ + -Cnode.name=crate03 \ + -Cdiscovery.seed_hosts=crate01,crate02 \ + -Cgateway.expected_data_nodes=3 \ + -Cgateway.recover_after_data_nodes=2 +``` + +Notice here that: + +- You updated the container and node name to `crate03`. +- You updated the port mapping, so that port `4203` on your host is mapped + to `4200` on the container. +- You set parameter `discovery.seed_hosts` to contain the other hosts of the + cluster. +- `cluster.initial_master_nodes`: This setting is removed since only nodes + `crate01` and `crate02` will participate in the election of the first + master. + +Success! You just created a three-node CrateDB cluster with Docker. + +:::{NOTE} +This is only a quick start example and you will notice some failing checks +in the admin UI. For a more robust cluster, you should, at the very least, +configure the {ref}`Metadata Gateway ` and +{ref}`Discovery ` settings. +::: + +(docker-troubleshooting)= + +### Troubleshooting + +The most common issue when running CrateDB on Docker is a failing +{ref}`bootstrap check ` because the *memory map limit* +is too low. This can be {ref}`adjusted on the host system `. + +If the limit cannot be adjusted on the host system, the memory map limit check +can be bypassed by passing the `-Cnode.store.allow_mmap=false` option to +the `crate` command: + +``` +sh$ docker run -d --name=crate01 \ + --net=crate -p 4201:4200 --env CRATE_HEAP_SIZE=1g \ + crate -Cnetwork.host=_site_ \ + -Cnode.store.allow_mmap=false +``` + +:::{CAUTION} +This will result in degraded performance. 
+::: + +You can also start a single node without any {ref}`bootstrap checks +` by passing the `-Cdiscovery.type=single-node` option: + +``` +sh$ docker run -d --name=crate01 \ + --net=crate -p 4201:4200 \ + --env CRATE_HEAP_SIZE=1g \ + crate -Cnetwork.host=_site_ \ + -Cdiscovery.type=single-node +``` + +:::{NOTE} +This means that the node cannot form a cluster with any other nodes. +::: + +### Taking it further + +{ref}`CrateDB settings ` are set +using the `-C` flag, as shown in the examples above. + +Check out the [Docker docs](https://docs.docker.com/reference/cli/docker/) +for more Docker-specific features that CrateDB can leverage. + +### CrateDB Shell + +The CrateDB Shell, `crash`, is bundled with the Docker image. + +If you wanted to run `crash` inside a user-defined network called `crate` +and connect to three hosts named `crate01`, `crate02`, and `crate03` +(i.e. the example covered in the [Creating a Cluster] section) you could run: + +``` +$ docker run --rm -ti \ + --net=crate crate \ + crash --hosts crate01 crate02 crate03 +``` + +(docker-compose)= + +## Docker Compose + +Docker's Compose tool allows developers to define and run multi-container +Docker applications that can be started with a single `docker-compose up` +command. + +Read about Docker Compose specifics [here](https://docs.docker.com/compose/). + +You can define the services that make up your app in a `docker-compose.yml` +file. 
To recreate the three-node cluster in the previous example, you can +define your services like this: + +```yaml +version: '3.8' +services: + cratedb01: + image: crate:latest + ports: + - "4201:4200" + volumes: + - /tmp/crate/01:/data + command: ["crate", + "-Ccluster.name=crate-docker-cluster", + "-Cnode.name=cratedb01", + "-Cnode.data=true", + "-Cnetwork.host=_site_", + "-Cdiscovery.seed_hosts=cratedb02,cratedb03", + "-Ccluster.initial_master_nodes=cratedb01,cratedb02,cratedb03", + "-Cgateway.expected_data_nodes=3", + "-Cgateway.recover_after_data_nodes=2"] + deploy: + replicas: 1 + restart_policy: + condition: on-failure + environment: + - CRATE_HEAP_SIZE=1g + + cratedb02: + image: crate:latest + ports: + - "4202:4200" + volumes: + - /tmp/crate/02:/data + command: ["crate", + "-Ccluster.name=crate-docker-cluster", + "-Cnode.name=cratedb02", + "-Cnode.data=true", + "-Cnetwork.host=_site_", + "-Cdiscovery.seed_hosts=cratedb01,cratedb03", + "-Ccluster.initial_master_nodes=cratedb01,cratedb02,cratedb03", + "-Cgateway.expected_data_nodes=3", + "-Cgateway.recover_after_data_nodes=2"] + deploy: + replicas: 1 + restart_policy: + condition: on-failure + environment: + - CRATE_HEAP_SIZE=1g + + cratedb03: + image: crate:latest + ports: + - "4203:4200" + volumes: + - /tmp/crate/03:/data + command: ["crate", + "-Ccluster.name=crate-docker-cluster", + "-Cnode.name=cratedb03", + "-Cnode.data=true", + "-Cnetwork.host=_site_", + "-Cdiscovery.seed_hosts=cratedb01,cratedb02", + "-Ccluster.initial_master_nodes=cratedb01,cratedb02,cratedb03", + "-Cgateway.expected_data_nodes=3", + "-Cgateway.recover_after_data_nodes=2"] + deploy: + replicas: 1 + restart_policy: + condition: on-failure + environment: + - CRATE_HEAP_SIZE=1g +``` + +In the file above: + +- You specified the [compose file version] (`3.8`). +- You created three CrateDB services, each of which pulls the latest CrateDB + Docker image and maps its ports manually. 
+- You created a file system volume per instance and defined a set of + configuration parameters (`-C`). +- You defined some deploy settings and an environment variable for the heap size. +- Network settings no longer need to be defined in the latest compose file + version because a [default bridge network] will be created. If you are + using multiple hosts and want to use an overlay network, you will need to + explicitly define that. +- The start order of the containers is not deterministic and you want all + three containers to be up and running before the election of the master node. + +## Best Practices + +### One container per host + +For performance reasons, we strongly recommend that you only run one container +per host machine. + +If you are running one container per machine, you can map the container ports +to the host ports so that the host acts like a native installation. For example: + +``` +$ docker run -d -p 4200:4200 -p 4300:4300 -p 5432:5432 --env CRATE_HEAP_SIZE=1g crate \ + crate -Cnetwork.host=_site_ +``` + +### Persistent data directory + +Docker containers are ephemeral, meaning that containers are expected to come +and go, and any data inside them is lost when the container is removed. For +this reason, you should mount a persistent `data` directory on your host +machine to the `/data` directory inside the container: + +``` +$ docker run -d -v /srv/crate/data:/data --env CRATE_HEAP_SIZE=1g crate \ + crate -Cnetwork.host=_site_ +``` + +Here, `/srv/crate/data` is an example path, and should be replaced with the +path to your host machine's `data` directory. + +See the [Docker volume] documentation for more help. + +### Custom configuration + +If you want to use a custom configuration, it is recommended that you mount +configuration files on the host machine to the appropriate path inside the +container. That way, your configuration will not be lost if the container is +removed. 
+ +Here is an example of how you could mount the `crate.yml` config file: + +``` +$ docker run -d \ + -v /srv/crate/config/crate.yml:/crate/config/crate.yml \ + --env CRATE_HEAP_SIZE=1g crate \ + crate -Cnetwork.host=_site_ +``` + +Here, `/srv/crate/config/crate.yml` is an example path, and should be +replaced with the path to your host machine's `crate.yml` file. + +## Troubleshooting + +The official [CrateDB Docker image] ships with a liveness [healthcheck] +configured. + +This healthcheck will flag a problem if the CrateDB process crashed or hung +inside the container without terminating. + +If you use [Docker Swarm] and are experiencing trouble starting your Docker +containers, try to deactivate the healthcheck. + +You can do that by editing your [Docker Stack YAML file]: + +```yaml +healthcheck: + disable: true +``` + +(resource-constraints)= + +## Resource constraints + +To avoid overallocation of resources, you may want to consider setting +constraints on CPU and memory if you plan to run multiple CrateDB containers +on a single machine. + +### Bootstrap checks + +When using CrateDB with Docker, CrateDB binds by default to any site-local IP +address on the system (e.g. 192.168.0.1). Because this is not a loopback +address, CrateDB enforces a number of checks during bootstrap. The settings +listed in {ref}`bootstrap checks +` must be addressed on the Docker **host system** in order +to start CrateDB successfully and when {ref}`going into production +`. + +### Memory + +You must calculate and explicitly [set the maximum memory] that the container +can use. This is dependent on your host system and should typically be as high +as possible. + +You must then calculate the appropriate heap size (typically half the container's +memory limit, see {ref}`CRATE_HEAP_SIZE ` +for details), and pass this to CrateDB, which in turn passes it to the JVM. + +It is not necessary to configure swap memory since CrateDB does not use swap. 
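As a rough sketch of the "half the container's memory limit" guideline above, the following shell function derives a heap size from a memory limit given in megabytes. The function name `heap_for_limit_mb` is hypothetical and not part of CrateDB or Docker, and plain halving is only the rule of thumb from the text, not a tuned recommendation.

```shell
# Hypothetical helper: derive a CRATE_HEAP_SIZE value from a container
# memory limit given in megabytes, using the "half the limit" guideline.
heap_for_limit_mb() {
  limit_mb="$1"
  # Integer division; the "m" suffix marks the value as megabytes.
  echo "$(( limit_mb / 2 ))m"
}

# Example: a container limited to 2048 MB gets a 1024 MB heap.
heap_for_limit_mb 2048
```

The result could then be passed to the container with `--env CRATE_HEAP_SIZE=...`, alongside a matching `--memory` limit.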
+ +### CPU + +You must calculate and explicitly [set the maximum number of CPUs] that the +container can use. This is dependent on your host system and should typically +be as high as possible. + +### Combined configuration + +If you want the container to use a maximum of 1.5 CPUs, a maximum of 2 GB +memory, with a heap size of 1 GB, you could configure everything at once. For +example: + +``` +$ docker run -d \ + --cpus 1.5 \ + --memory 2g \ + --env CRATE_HEAP_SIZE=1g \ + crate \ + crate -Cnetwork.host=_site_ +``` + +[compose file version]: https://docs.docker.com/compose/compose-file/compose-versioning/ +[containerization]: https://www.docker.com/resources/what-container +[cratedb docker image]: https://hub.docker.com/_/crate/ +[default bridge network]: https://docs.docker.com/engine/network/drivers/bridge/#configure-the-default-bridge-network +[docker]: https://www.docker.com/ +[docker stack yaml file]: https://docs.docker.com/reference/compose-file/legacy-versions/ +[docker swarm]: https://docs.docker.com/engine/swarm/ +[docker volume]: https://docs.docker.com/engine/tutorials/dockervolumes/ +[healthcheck]: https://docs.docker.com/engine/containers/run/#healthchecks +[horizontally scalable]: https://en.wikipedia.org/wiki/Scalability#Horizontal_(scale_out)_and_vertical_scaling_(scale_up) +[set the maximum memory]: https://docs.docker.com/engine/containers/resource_constraints/#memory +[set the maximum number of cpus]: https://docs.docker.com/engine/containers/resource_constraints/#cpu +[shared-nothing architecture]: https://en.wikipedia.org/wiki/Shared-nothing_architecture +[user-defined network]: https://docs.docker.com/network/bridge/ diff --git a/docs/install/container/docker.rst b/docs/install/container/docker.rst deleted file mode 100644 index f68fb671..00000000 --- a/docs/install/container/docker.rst +++ /dev/null @@ -1,507 +0,0 @@ -.. highlight:: sh - -.. 
_cratedb-docker: - -===================== -Run CrateDB on Docker -===================== - -CrateDB and `Docker`_ are a great match thanks to CrateDB’s `horizontally -scalable`_ `shared-nothing architecture`_ that lends itself well to -`containerization`_. - -This document covers the essentials of running CrateDB on Docker. - -.. NOTE:: - - If you are just getting started with CrateDB and Docker, check out the - introductory guides for :ref:`spinning up your first container-based - CrateDB instance `. - -.. SEEALSO:: - - A guide for running CrateDB on :ref:`Kubernetes `. - - The official `CrateDB Docker image`_. - -Quick start -=========== - - -Creating a cluster ------------------- - -To get started with CrateDB and Docker, you will create a three-node cluster -on your dev machine. The cluster will run on a dedicated network and will -require the first two nodes, ``crate01`` and ``crate02``, to vote which one -is the master. The third node, ``crate03``, will simply join the cluster -with no vote. - -To create the `user-defined network`_, run the command:: - - sh$ docker network create crate - -You should then be able to see something like this: - -.. code-block:: text - - sh$ docker network ls - NETWORK ID NAME DRIVER SCOPE - 1bf1b7acd66f bridge bridge local - 51cebbdf7d2b crate bridge local - 5b8e6fbe9ab6 host host local - 8baa149b6986 none null local - -Any CrateDB container put into the ``crate`` network will be able to resolve -other CrateDB containers by name. Each container will run a single node, which -is identified by its node name. In this guide, container ``crate01`` will run -node ``crate01``, container ``crate02`` will run node ``crate02``, and -container ``crate03`` will run cluster node ``crate03``. 
- -You can then create your first CrateDB container and node, like this:: - - sh$ docker run --rm -d \ - --name=crate01 \ - --net=crate \ - -p 4201:4200 \ - --env CRATE_HEAP_SIZE=1g \ - crate -Cnetwork.host=_site_ \ - -Cnode.name=crate01 \ - -Cdiscovery.seed_hosts=crate02,crate03 \ - -Ccluster.initial_master_nodes=crate01,crate02 \ - -Cgateway.expected_data_nodes=3 \ - -Cgateway.recover_after_data_nodes=3 - -Breaking the command down: - -- Creates and runs a container called ``crate01`` (--name) in detached - mode (-d). The container will automatically be removed on exit (--rm), - and all its internal data will be lost. If you would like to avoid this, - you can mount a dedicated volume (-v) for the container (each container - would need its own dedicated folder on your dev machine, see - :ref:`docker-compose` as reference). -- Puts the container into the ``crate`` network and maps port ``4201`` on your - host machine to port ``4200`` on the container (admin UI). -- Defines the environment variable:ref:`CRATE_HEAP_SIZE `, - which is used by CrateDB to allocate 1 GB for its heap memory. -- Runs the command ``crate`` inside the container with parameters: - * ``network.host``: The ``_site_`` value results in the binding of the - CrateDB process to a site-local IP address. - * ``node.name``: Defines the node's name as ``crate01`` (used by - master election). - * ``discovery.seed_hosts``: This parameter lists the other hosts in the - cluster. The format is a comma-separated list of ``host:port`` entries, - where port defaults to setting ``transport.tcp.port``. Each node must - contain the name of all the other hosts in this list. Notice also that - any node in the cluster might be started at any time, and this will - create connection exceptions in the log files, however all nodes will - eventually be running and interconnected. 
- * ``cluster.initial_master_nodes``: Defines the list of master-eligible - node names which will participate in the vote of the first master - (first bootstrap). If this parameter is not defined, then it is expected - that the node will join an already formed cluster. This parameter is only - relevant for the first election. - * ``gateway.expected_data_nodes`` and ``gateway.recover_after_data_nodes``: - Specifies how many nodes you expect in the cluster and how many nodes must - be discovered before the cluster state is recovered. - -.. NOTE:: - - If this command aborts with an error, consult the - :ref:`docker-troubleshooting` section for help. - -Verify that the node is running with ``docker ps`` and you should see something like this: - -.. code-block:: text - - sh$ docker ps - CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES - f79116373877 crate "/docker-entrypoin..." 16 seconds ago Up 15 seconds 4300/tcp, 5432-5532/tcp, 0.0.0.0:4201->4200/tcp crate01 - -You can have a look at the container's logs in tail mode like this: - -.. code-block:: text - - sh$ docker logs -f crate01 - -.. NOTE:: - - To exit the logs view, press ctrl+C. - -You can visit the admin UI in your browser with this URL: - -.. code-block:: text - - http://localhost:4201/ - -Select the *Cluster* icon from the left-hand navigation, and you should see a -page that lists a single node. - -Now add the second node, ``crate02``, to the cluster:: - - sh$ docker run --rm -d \ - --name=crate02 \ - --net=crate \ - -p 4202:4200 \ - --env CRATE_HEAP_SIZE=1g \ - crate -Cnetwork.host=_site_ \ - -Cnode.name=crate02 \ - -Cdiscovery.seed_hosts=crate01,crate03 \ - -Ccluster.initial_master_nodes=crate01,crate02 \ - -Cgateway.expected_data_nodes=3 \ - -Cgateway.recover_after_data_nodes=2 - -Notice here that: - -- You updated the container and node name to ``crate02``. -- You updated the port mapping, so that port ``4202`` on your host is mapped - to ``4200`` on the container. 
-- You set the parameter ``discovery.seed_hosts`` to contain the other hosts of - the cluster. -- ``cluster.initial_master_nodes``: Since only nodes ``crate01`` and ``crate02`` - will participate in the election of the first master, this setting is unchanged. - -Now, if you go back to the admin UI you opened earlier, or visit the admin UI -of the node you just created (located at ``http://localhost:4202/``) you -should see two nodes. - -You can now add ``crate03`` like this:: - - sh$ docker run --rm -d \ - --name=crate03 \ - --net=crate -p 4203:4200 \ - --env CRATE_HEAP_SIZE=1g \ - crate -Cnetwork.host=_site_ \ - -Cnode.name=crate03 \ - -Cdiscovery.seed_hosts=crate01,crate02 \ - -Cgateway.expected_data_nodes=3 \ - -Cgateway.recover_after_data_nodes=2 - -Notice here that: - -- You updated the container and node name to ``crate03``. -- You updated the port mapping, so that port ``4203`` on your host is mapped - to ``4200`` on the container. -- You set parameter ``discovery.seed_hosts`` to contain the other hosts of the - cluster. -- ``cluster.initial_master_nodes``: This setting is removed since only nodes - ``crate01`` and ``crate02`` will participate in the election of the first - master. - - -Success! You just created a three-node CrateDB cluster with Docker. - -.. NOTE:: - - This is only a quick start example and you will notice some failing checks - in the admin UI. For a more robust cluster, you should, at the very least, - configure the :ref:`Metadata Gateway ` and - :ref:`Discovery ` settings. - - -.. _docker-troubleshooting: - -Troubleshooting ---------------- - -The most common issue when running CrateDB on Docker is a failing -:ref:`bootstrap check ` because the *memory map limit* -is too low. This can be :ref:`adjusted on the host system `. 
- -If the limit cannot be adjusted on the host system, the memory map limit check -can be bypassed by passing the ``-Cnode.store.allow_mmap=false`` option to -the ``crate`` command:: - - sh$ docker run -d --name=crate01 \ - --net=crate -p 4201:4200 --env CRATE_HEAP_SIZE=1g \ - crate -Cnetwork.host=_site_ \ - -Cnode.store.allow_mmap=false - -.. CAUTION:: - - This will result in degraded performance. - -You can also start a single node without any :ref:`bootstrap checks -` by passing the ``-Cdiscovery.type=single-node`` option:: - - sh$ docker run -d --name=crate01 \ - --net=crate -p 4201:4200 \ - --env CRATE_HEAP_SIZE=1g \ - crate -Cnetwork.host=_site_ \ - -Cdiscovery.type=single-node - -.. NOTE:: - - This means that the node cannot form a cluster with any other nodes. - - -Taking it further ------------------ - -:ref:`CrateDB settings ` are set -using the ``-C`` flag, as shown in the examples above. - -Check out the `Docker docs `_ -for more Docker-specific features that CrateDB can leverage. - - -CrateDB Shell -------------- - -The CrateDB Shell, ``crash``, is bundled with the Docker image. - -If you wanted to run ``crash`` inside a user-defined network called ``crate`` -and connect to three hosts named ``crate01``, ``crate02``, and ``crate03`` -(i.e. the example covered in the `Creating a Cluster`_ section) you could run:: - - $ docker run --rm -ti \ - --net=crate crate \ - crash --hosts crate01 crate02 crate03 - - -.. _docker-compose: - -Docker Compose -============== - -Docker's Compose tool allows developers to define and run multi-container -Docker applications that can be started with a single ``docker-compose up`` -command. - -Read about Docker Compose specifics `here `_. - -You can define the services that make up your app in a `docker-compose.yml` -file. To recreate the three-node cluster in the previous example, you can -define your services like this: - -.. 
code-block:: yaml - - version: '3.8' - services: - cratedb01: - image: crate:latest - ports: - - "4201:4200" - volumes: - - /tmp/crate/01:/data - command: ["crate", - "-Ccluster.name=crate-docker-cluster", - "-Cnode.name=cratedb01", - "-Cnode.data=true", - "-Cnetwork.host=_site_", - "-Cdiscovery.seed_hosts=cratedb02,cratedb03", - "-Ccluster.initial_master_nodes=cratedb01,cratedb02,cratedb03", - "-Cgateway.expected_data_nodes=3", - "-Cgateway.recover_after_data_nodes=2"] - deploy: - replicas: 1 - restart_policy: - condition: on-failure - environment: - - CRATE_HEAP_SIZE=1g - - cratedb02: - image: crate:latest - ports: - - "4202:4200" - volumes: - - /tmp/crate/02:/data - command: ["crate", - "-Ccluster.name=crate-docker-cluster", - "-Cnode.name=cratedb02", - "-Cnode.data=true", - "-Cnetwork.host=_site_", - "-Cdiscovery.seed_hosts=cratedb01,cratedb03", - "-Ccluster.initial_master_nodes=cratedb01,cratedb02,cratedb03", - "-Cgateway.expected_data_nodes=3", - "-Cgateway.recover_after_data_nodes=2"] - deploy: - replicas: 1 - restart_policy: - condition: on-failure - environment: - - CRATE_HEAP_SIZE=1g - - cratedb03: - image: crate:latest - ports: - - "4203:4200" - volumes: - - /tmp/crate/03:/data - command: ["crate", - "-Ccluster.name=crate-docker-cluster", - "-Cnode.name=cratedb03", - "-Cnode.data=true", - "-Cnetwork.host=_site_", - "-Cdiscovery.seed_hosts=cratedb01,cratedb02", - "-Ccluster.initial_master_nodes=cratedb01,cratedb02,cratedb03", - "-Cgateway.expected_data_nodes=3", - "-Cgateway.recover_after_data_nodes=2"] - deploy: - replicas: 1 - restart_policy: - condition: on-failure - environment: - - CRATE_HEAP_SIZE=1g - -In the file above: - -- You specified the latest `compose file version`_. -- You created three CrateDB services which pulls the latest CrateDB Docker - image and maps the ports manually. -- You created a file system volume per instance and defined a set of - configuration parameters (`-C`). 
-- You defined some deploy settings and an environment variable for the heap size. -- Network settings no longer need to be defined in the latest compose file - version because a `default bridge network`_ will be created. If you are - using multiple hosts and want to use an overlay network, you will need to - explicitly define that. -- The start order of the containers is not deterministic and you want all - three containers to be up and running before the election of the master node. - - -Best Practices -============== - - -One container per host ----------------------- - -For performance reasons, we strongly recommend that you only run one container -per host machine. - -If you are running one container per machine, you can map the container ports -to the host ports so that the host acts like a native installation. For example:: - - $ docker run -d -p 4200:4200 -p 4300:4300 -p 5432:5432 --env CRATE_HEAP_SIZE=1g crate \ - crate -Cnetwork.host=_site_ - - -Persistent data directory -------------------------- - -Docker containers are ephemeral, meaning that containers are expected to come -and go, and any data inside them is lost when the container is removed. For -this reason, you should mount a persistent ``data`` directory on your host -machine to the ``/data`` directory inside the container:: - - $ docker run -d -v /srv/crate/data:/data --env CRATE_HEAP_SIZE=1g crate \ - crate -Cnetwork.host=_site_ - -Here, ``/srv/crate/data`` is an example path, and should be replaced with the -path to your host machine's ``data`` directory. - -See the `Docker volume`_ documentation for more help. - - -Custom configuration --------------------- - -If you want to use a custom configuration, it is recommended that you mount -configuration files on the host machine to the appropriate path inside the -container. That way, your configuration will not be lost if the container is -removed. 
- -Here is an example of how you could mount the ``crate.yml`` config file:: - - $ docker run -d \ - -v /srv/crate/config/crate.yml:/crate/config/crate.yml \ - --env CRATE_HEAP_SIZE=1g crate \ - crate -Cnetwork.host=_site_ - -Here, ``/srv/crate/config/crate.yml`` is an example path, and should be -replaced with the path to your host machine's ``crate.yml`` file. - - -Troubleshooting -=============== - -The official `CrateDB Docker image`_ ships with a liveness `healthcheck`_ -configured. - -This healthcheck will flag a problem if the CrateDB process crashed or hung -inside the container without terminating. - -If you use `Docker Swarm`_ and are experiencing trouble starting your Docker -containers, try to deactivate the healthcheck. - -You can do that by editing your `Docker Stack YAML file`_: - -.. code-block:: yaml - - healthcheck: - disable: true - - -.. _resource_constraints: - -Resource constraints -==================== - -To avoid overallocation of resources, you may want to consider setting -constraints on CPU and memory if you plan to run multiple CrateDB containers -on a single machine. - - -Bootstrap checks ----------------- - -When using CrateDB with Docker, CrateDB binds by default to any site-local IP -address on the system (i.e. 192.168.0.1). This performs a number of checks -during bootstrap. The settings listed in :ref:`bootstrap checks -` must be addressed on the Docker **host system** in order -to start CrateDB successfully and when :ref:`going into production -`. - - -Memory ------- - -You must calculate and explicitly `set the maximum memory`_ that the container -can use. This is dependent on your host system and should typically be as high -as possible. - -You must then calculate the appropriate heap size (typically half the container's -memory limit, see :ref:`CRATE_HEAP_SIZE ` -for details), and pass this to CrateDB, which in turn passes it to the JVM. - -It is not necessary to configure swap memory since CrateDB does not use swap. 
- - -CPU ---- - -You must calculate and explicitly `set the maximum number of CPUs`_ that the -container can use. This is dependent on your host system and should typically -be as high as possible. - - -Combined configuration ----------------------- - -If you want the container to use a maximum of 1.5 CPUs, a maximum of 2 GB -memory, with a heap size of 1 GB, you could configure everything at once. For -example:: - - $ docker run -d \ - --cpus 1.5 \ - --memory 1g \ - --env CRATE_HEAP_SIZE=1g \ - crate \ - crate -Cnetwork.host=_site_ - - -.. _compose file version: https://docs.docker.com/compose/compose-file/compose-versioning/ -.. _containerization: https://www.docker.com/resources/what-container -.. _CrateDB Docker image: https://hub.docker.com/_/crate/ -.. _default bridge network: https://docs.docker.com/engine/network/drivers/bridge/#configure-the-default-bridge-network -.. _Docker Stack YAML file: https://docs.docker.com/reference/compose-file/legacy-versions/ -.. _Docker Swarm: https://docs.docker.com/engine/swarm/ -.. _Docker volume: https://docs.docker.com/engine/tutorials/dockervolumes/ -.. _Docker: https://www.docker.com/ -.. _healthcheck: https://docs.docker.com/engine/containers/run/#healthchecks -.. _horizontally scalable: https://en.wikipedia.org/wiki/Scalability#Horizontal_(scale_out)_and_vertical_scaling_(scale_up) -.. _set the maximum memory: https://docs.docker.com/engine/containers/resource_constraints/#memory -.. _set the maximum number of CPUs: https://docs.docker.com/engine/containers/resource_constraints/#cpu -.. _shared-nothing architecture: https://en.wikipedia.org/wiki/Shared-nothing_architecture -.. _user-defined network: https://docs.docker.com/network/bridge/ diff --git a/docs/install/container/index.md b/docs/install/container/index.md new file mode 100644 index 00000000..1fe327a8 --- /dev/null +++ b/docs/install/container/index.md @@ -0,0 +1,59 @@ +(install-container)= + +# Container Setup + +```{eval-rst} +.. 
div:: sd-text-muted + + Install CrateDB in container environments. +``` + +CrateDB is ideal for containerized environments, creating and scaling a cluster +takes minutes and your valuable data is always in sync and available. + +## Quickstart + +CrateDB and [Docker] are great matches thanks to CrateDB's shared-nothing, +horizontally scalable architecture that lends itself well to containerization. + +In order to spin up a container using the most recent stable version of the +official [CrateDB Docker image], use: + +``` +docker run --publish=4200:4200 --publish=5432:5432 --env CRATE_HEAP_SIZE=1g --pull=always crate +``` + +:::{TIP} +If this command aborts with an error, please consult the {ref}`Docker +troubleshooting guide `. You are also +welcome to learn more about {ref}`resource_constraints` with respect +to running CrateDB within containers. +::: + +:::{CAUTION} +This type of invoking CrateDB will get you up and running quickly. + +Please note, by default, the CrateDB Docker container is ephemeral, so +data will not be stored in a persistent manner. When stopping the +container, all data will be lost. + +When you are ready to start using CrateDB for data you care about, please +consult the {ref}`full guide to CrateDB and Docker ` +in order to configure the Docker setup appropriately by using persistent +disk volumes. +::: + +## Advanced + +Advanced container setup scenarios using Docker +and Kubernetes. + +```{toctree} +:maxdepth: 1 + +Docker +Kubernetes +``` + +[cratedb docker image]: https://hub.docker.com/_/crate/ +[docker]: https://www.docker.com/ diff --git a/docs/install/container/index.rst b/docs/install/container/index.rst deleted file mode 100644 index 01f911b3..00000000 --- a/docs/install/container/index.rst +++ /dev/null @@ -1,61 +0,0 @@ -.. _install-container: - -############### -Container Setup -############### - -.. div:: sd-text-muted - - Install CrateDB in container environments. 
- -CrateDB is ideal for containerized environments, creating and scaling a cluster -takes minutes and your valuable data is always in sync and available. - -********** -Quickstart -********** - -CrateDB and Docker_ are great matches thanks to CrateDB's shared-nothing, -horizontally scalable architecture that lends itself well to containerization. - -In order to spin up a container using the most recent stable version of the -official `CrateDB Docker image`_, use:: - - docker run --publish=4200:4200 --publish=5432:5432 --env CRATE_HEAP_SIZE=1g --pull=always crate - -.. TIP:: - - If this command aborts with an error, please consult the :ref:`Docker - troubleshooting guide `. You are also - welcome to learn more about :ref:`resource_constraints` with respect - to running CrateDB within containers. - -.. CAUTION:: - - This type of invoking CrateDB will get you up and running quickly. - - Please note, by default, the CrateDB Docker container is ephemeral, so - data will not be stored in a persistent manner. When stopping the - container, all data will be lost. - - When you are ready to start using CrateDB for data you care about, please - consult the :ref:`full guide to CrateDB and Docker ` - in order to configure the Docker setup appropriately by using persistent - disk volumes. - - -******** -Advanced -******** - -Advanced container setup scenarios using Docker -and Kubernetes. - -.. toctree:: - :maxdepth: 1 - - Docker - Kubernetes - -.. _Docker: https://www.docker.com/ -.. _CrateDB Docker image: https://hub.docker.com/_/crate/ diff --git a/docs/install/container/kubernetes/index.md b/docs/install/container/kubernetes/index.md new file mode 100644 index 00000000..5366f7d1 --- /dev/null +++ b/docs/install/container/kubernetes/index.md @@ -0,0 +1,42 @@ +# Run CrateDB on Kubernetes + +CrateDB is ideal for containerized environments, creating and scaling a cluster +takes minutes and your valuable data is always in sync and available. 
+ +## Prerequisites + +Both of the following methods assume [familiarity with Kubernetes]. + +Before continuing you should already have a Kubernetes cluster up-and-running +with at least one master node and one worker node. + +:::{SEEALSO} +You can use [kubeadm] to bootstrap a Kubernetes cluster by hand. + +Alternatively, cloud services such as [Azure Kubernetes Service] or the +[Amazon Kubernetes Service] can do this for you. +::: + +## Method 1 - Classic Kubernetes + +Install the resources to run your CrateDB. + +## Method 2 - Kubernetes operator + +You can also use the CrateDB custom resource and the Crate Operator to quickly +install your CrateDB. + +```{rubric} Table of contents +``` + +```{toctree} +:maxdepth: 1 + +kubernetes +kubernetes-operator +``` + +[amazon kubernetes service]: https://aws.amazon.com/eks/ +[azure kubernetes service]: https://azure.microsoft.com/en-us/services/kubernetes-service/ +[familiarity with kubernetes]: https://kubernetes.io/docs/tutorials/kubernetes-basics/ +[kubeadm]: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ diff --git a/docs/install/container/kubernetes/index.rst b/docs/install/container/kubernetes/index.rst deleted file mode 100644 index fc6a4acd..00000000 --- a/docs/install/container/kubernetes/index.rst +++ /dev/null @@ -1,45 +0,0 @@ -========================= -Run CrateDB on Kubernetes -========================= - -CrateDB is ideal for containerized environments, creating and scaling a cluster -takes minutes and your valuable data is always in sync and available. - -Prerequisites -------------- - -Both of following methods assume `familiarity with Kubernetes`_. - -Before continuing you should already have a Kubernetes cluster up-and-running -with at least one master node and one worker node. - -.. SEEALSO:: - - You can use `kubeadm`_ to bootstrap a Kubernetes cluster by hand. 
- - Alternatively, cloud services such as `Azure Kubernetes Service`_ or the - `Amazon Kubernetes Service`_ can do this for you. - -Method 1 - Classic kubernetes ------------------------------ - -Install the resources to run your CrateDB. - -Method 2 - Kubernetes operator ------------------------------- - -You can also use the CrateDB custom resource and the Crate Operator to quickly -install your CrateDB. - -.. rubric:: Table of contents - -.. toctree:: - :maxdepth: 1 - - kubernetes - kubernetes-operator - -.. _familiarity with Kubernetes: https://kubernetes.io/docs/tutorials/kubernetes-basics/ -.. _kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/ -.. _Azure Kubernetes Service: https://azure.microsoft.com/en-us/services/kubernetes-service/ -.. _Amazon Kubernetes Service: https://aws.amazon.com/eks/ diff --git a/docs/install/container/kubernetes/kubernetes-operator.md b/docs/install/container/kubernetes/kubernetes-operator.md new file mode 100644 index 00000000..bd37ecc5 --- /dev/null +++ b/docs/install/container/kubernetes/kubernetes-operator.md @@ -0,0 +1,131 @@ +(cratedb-kubernetes-operator)= + +# Run CrateDB with Kubernetes Operator + +The [CrateDB Kubernetes Operator] provides a convenient way to run [CrateDB] +clusters inside Kubernetes. + +Using the operator we can just deploy a `CrateDB` resource without having to +deal with services, persistent volumes, persistent volume claims, and stateful +sets. The only prerequisite is to have a suitable storage class. + +## Installation + +[Helm] must be installed to install the Crate operator chart. 
+Once Helm is set up properly, add the repo as follows: + +```console +$ helm repo add crate-operator https://crate.github.io/crate-operator +``` + +Install the crate-operator chart: + +```console +$ kubectl create namespace crate-operator + +$ helm install crate-operator crate-operator/crate-operator --namespace crate-operator --set env.CRATEDB_OPERATOR_DEBUG_VOLUME_STORAGE_CLASS= +``` + +:::{NOTE} +`kubectl get storageclass` gives you a list of the available StorageClasses +on your setup. Be careful with what you choose! +::: + +:::{NOTE} +To be able to deploy the custom resource `CrateDB` to a Kubernetes cluster, +the API needs to be extended with a [Custom Resource Definition] (CRD). +It is installed as a dependency of the `crate-operator` chart, but it can be +installed separately. See the [Crate Operator Chart documentation] for +further details. +::: + +## Run CrateDB + +A minimal custom resource for a three-node CrateDB cluster may look like this: + +`dev-cluster.yaml`: + +```yaml +apiVersion: cloud.crate.io/v1 +kind: CrateDB +metadata: + name: my-cluster + namespace: dev +spec: + cluster: + imageRegistry: crate + name: crate-dev + version: 5.8.1 + nodes: + data: + - name: hot + replicas: 3 + resources: + limits: + cpu: 4 + memory: 4Gi + disk: + count: 1 + size: 128GiB + storageClass: + heapRatio: 0.25 +``` + +:::{NOTE} +The operator imposes an affinity constraint of 1 CrateDB node per Kubernetes node. +To deploy a CrateDB cluster with multiple nodes on a single machine for +testing purposes, {ref}`manually deploy a StatefulSet `. +::: + +:::{WARNING} +Specifying a `cpu` number under `limits` is mandatory. +::: + +```console +$ kubectl create namespace dev + +$ kubectl --namespace dev create -f dev-cluster.yaml +... 
+ +$ kubectl --namespace dev get cratedbs +NAMESPACE NAME AGE +dev my-cluster 36s +``` + +We can check the status of the deployment by looking at the latest events: + +```console +$ kubectl get events --sort-by='.lastTimestamp' -n dev +``` + +and the status of the pods: + +```console +$ kubectl get pods --namespace dev +``` + +Once we have a pod running for each CrateDB node, the cluster +is ready. Congratulations! + +The operator created a user named `system` for you and a LoadBalancer +to access the cluster. + +```console +$ kubectl get secret user-system-my-cluster -o json | jq -r '.data.password' | base64 -d + +$ kubectl get service crate-my-cluster -o json | jq -r '.status.loadBalancer.ingress[0].ip' +``` + +As an alternative, you can access the cluster via `kubectl port-forward` +to port `4200`, which allows you to authenticate with the `crate` user. + +:::{NOTE} +You can find the Crate Operator features in the `Features` section +of [CrateDB Kubernetes Operator]. +::: + +[crate operator chart documentation]: https://github.com/crate/crate-operator/blob/master/deploy/charts/crate-operator/README.md +[cratedb]: https://github.com/crate/crate +[cratedb kubernetes operator]: https://github.com/crate/crate-operator +[custom resource definition]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/ +[helm]: https://helm.sh diff --git a/docs/install/container/kubernetes/kubernetes-operator.rst b/docs/install/container/kubernetes/kubernetes-operator.rst deleted file mode 100644 index 10598b9a..00000000 --- a/docs/install/container/kubernetes/kubernetes-operator.rst +++ /dev/null @@ -1,138 +0,0 @@ -.. _cratedb-kubernetes-operator: - -==================================== -Run CrateDB with Kubernetes Operator -==================================== - -The `CrateDB Kubernetes Operator`_ provides a convenient way to run `CrateDB`_ -clusters inside Kubernetes. 
- -Using the operator we can just deploy a ``CrateDB`` resource without having to -deal with services, persistent volumes, persistent volume claims, and stateful -sets. The only prerequisite is to have a suitable storage class. - -Installation -============ - -`Helm`_ must be installed to install the Crate operator chart. -Once Helm is set up properly, add the repo as follows: - -.. code-block:: console - - $ helm repo add crate-operator https://crate.github.io/crate-operator - -Install the crate-operator chart: - -.. code-block:: console - - $ kubectl create namespace crate-operator - - $ helm install crate-operator crate-operator/crate-operator --namespace crate-operator --set env.CRATEDB_OPERATOR_DEBUG_VOLUME_STORAGE_CLASS= - -.. NOTE:: - - ``kubectl get storageclass`` gives you an list of the available StorageClasses - on your setup. Be careful with what you choose! - - -.. NOTE:: - - To be able to deploy the custom resource ``CrateDB`` to a Kubernetes cluster, - the API needs to be extended with a `Custom Resource Definition`_ (CRD). - It is installed as a dependency of the ``crate-operator`` chart, but it can be - installed separately. See the `Crate Operator Chart documentation`_ for - further details. - -Run CrateDB -=========== - -A minimal custom resource for a three-node CrateDB cluster may look like this: - -``dev-cluster.yaml``: - -.. code-block:: yaml - - apiVersion: cloud.crate.io/v1 - kind: CrateDB - metadata: - name: my-cluster - namespace: dev - spec: - cluster: - imageRegistry: crate - name: crate-dev - version: 5.8.1 - nodes: - data: - - name: hot - replicas: 3 - resources: - limits: - cpu: 4 - memory: 4Gi - disk: - count: 1 - size: 128GiB - storageClass: - heapRatio: 0.25 - -.. NOTE:: - - The operator imposes an affinity constraint of 1 CrateDB node per Kubernetes node. - To deploy a CrateDB cluster with multiple nodes on a single machine for - testing purposes, :ref:`manually deploy a StatefulSet `. - - -.. 
WARNING:: - - Specifying a `cpu` number under `limits` is mandatory. - -.. code-block:: console - - $ kubectl create namespace dev - - $ kubectl --namespace dev create -f dev-cluster.yaml - ... - - $ kubectl --namespace dev get cratedbs - NAMESPACE NAME AGE - dev my-cluster 36s - -We can check the status of the deployment by looking at the latest events: - -.. code-block:: console - - $ kubectl get events --sort-by='.lastTimestamp' -n dev - -and the status of the pods: - -.. code-block:: console - - $ kubectl get pods --namespace dev - -Once we have a pod running for each CrateDB node, the cluster -is ready. Congratulations! - -The operator created a user named ``system`` for you and a Loadbalancer -to access the cluster. - -.. code-block:: console - - $ kubectl get secret user-system-my-cluster -o json | jq -r '.data.password' | base64 -d - - $ kubectl get service crate-my-cluster -o json | jq -r '.status.loadBalancer.ingress[0].ip' - -As an alternative you can access the cluster via ``kubectl port-forwarding`` -to port ``4200``. Which allows you to authenticate with the `crate` user. - -.. NOTE:: - - You can find the Crate Operator features in the ``Features`` section - of `CrateDB Kubernetes Operator`_. - - -.. _CrateDB Kubernetes Operator: https://github.com/crate/crate-operator -.. _CrateDB: https://github.com/crate/crate -.. _Helm: https://helm.sh -.. _Custom Resource Definition: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/ -.. _Crate Operator Chart documentation: https://github.com/crate/crate-operator/blob/master/deploy/charts/crate-operator/README.md diff --git a/docs/install/container/kubernetes/kubernetes.md b/docs/install/container/kubernetes/kubernetes.md new file mode 100644 index 00000000..7c570569 --- /dev/null +++ b/docs/install/container/kubernetes/kubernetes.md @@ -0,0 +1,369 @@ +(cratedb-kubernetes)= + +# CrateDB and Kubernetes + +```{eval-rst} +.. 
div:: sd-text-muted + + CrateDB and Kubernetes are a great match. +``` + +CrateDB’s [horizontally scalable] `shared-nothing architecture` lends itself +well to [containerization]. + +[Kubernetes] is an open-source container orchestration system for the +management, deployment, and scaling of containerized systems. + +Together, Docker and Kubernetes are a fantastic way to deploy and scale CrateDB. + +:::{NOTE} +While Kubernetes works with a variety of container technologies, this +document only covers its use with Docker. +::: + +:::{SEEALSO} +A complimentary blog post miniseries that walks you through the process of +[setting up your first CrateDB cluster on Kubernetes]. + +A lower-level introduction to {ref}`running CrateDB on Docker `. + +A guide to {ref}`scaling CrateDB on Kubernetes `. + +The official [CrateDB Docker image]. +::: + +## Managing Kubernetes + +Kubernetes deployments can be [managed] in many different ways. Which one +makes sense for you will depend on your situation. +On Kubernetes, CrateDB is managed as a resource. The following commands +will give you an idea of how. + +You can create a resource like so: + +```console +sh$ kubectl create -f crate-controller.yaml --namespace crate +statefulset.apps/crate-controller created +``` + +Here, we are creating a [StatefulSet] controller in the `crate` namespace +using a configuration file named `crate-controller.yaml`. + +You can update the resource after editing the configuration file, like so: + +```console +sh$ kubectl replace -f crate-controller.yaml --namespace crate +statefulset.apps/crate replaced +``` + +If your StatefulSet uses the default [rolling update strategy], this command will +restart your pods with the new configuration one-by-one. + +:::{WARNING} +If you use a regular `replace` command, pods are restarted, and any +[persistent volumes] will still be intact. 
+ +If, however, you pass the `--force` option to the `replace` command, +resources are deleted and recreated, and the pods will come back up with no +data. +::: + +## Configuration + +A range of Kubernetes [configuration] snippets that can be +used to create a three-node CrateDB cluster. + +### Services + +A Kubernetes pod is ephemeral and so are its network addresses. Typically, this +means that it is inadvisable to connect to pods directly. + +A Kubernetes [service] allows you to define a network access policy for a set +of pods. You can then use the network address of the service to communicate +with the pods. The network address of the service remains static even though the +constituent pods may come and go. + +For our purposes, we define two services: an [internal service] and an +[external service]. + +#### Internal service + +CrateDB uses the internal service for {ref}`node discovery via DNS +` and {ref}`inter-node communication +`. + +Here's an example configuration snippet: + +```yaml +kind: Service +apiVersion: v1 +metadata: + name: crate-internal-service + labels: + app: crate +spec: + # A static IP address is assigned to this service. This IP address is + # only reachable from within the Kubernetes cluster. + type: ClusterIP + ports: + # Port 4300 for inter-node communication. + - port: 4300 + name: crate-internal + selector: + # Apply this to all nodes with the `app:crate` label. + app: crate +``` + +#### External service + +The external service provides a stable network address for external clients. + +Here's an example configuration snippet: + +```yaml +kind: Service +apiVersion: v1 +metadata: + name: crate-external-service + labels: + app: crate +spec: + # Create an externally reachable load balancer. + type: LoadBalancer + ports: + # Port 4200 for HTTP clients. + - port: 4200 + name: crate-web + # Port 5432 for PostgreSQL wire protocol clients. + - port: 5432 + name: postgres + selector: + # Apply this to all nodes with the `app:crate` label. 
+ app: crate +``` + +:::{NOTE} +In production, a [LoadBalancer] service type is typically only available on +hosted cloud platforms that provide externally managed load balancers. +However, an [ingress] resource can be used to provide internally managed +load balancers. + +For local development, [Minikube] provides a LoadBalancer service. +::: + +### Controller + +A Kubernetes [pod] is a group of one or more containers. Pods are designed to +provide discrete units of functionality. + +CrateDB nodes are self-contained, so we don't need to use more than one +container in a pod. We can configure our pods as a single container running +CrateDB. + +Pods are designed to be fungible computing units, meaning they can be created or +destroyed at will. This, in turn, means that: + +- A cluster can be scaled in or out by destroying or creating pods +- A cluster can be healed by replacing pods +- A cluster can be rebalanced by rescheduling pods (i.e., destroying the pod on + one Kubernetes node and recreating it on a new node) + +However, CrateDB nodes that leave and then want to rejoin a cluster must retain +their state. That is, they must continue to use the same name and must continue +to use the same data on disk. + +For this reason, we use the [StatefulSet] controller to define our cluster, +which ensures that CrateDB nodes retain state across restarts or rescheduling. + +The following configuration snippet defines a controller for a three-node +CrateDB 5.1.1 cluster: + +```yaml +kind: StatefulSet +apiVersion: "apps/v1" +metadata: + # This is the name used as a prefix for all pods in the set. + name: crate +spec: + serviceName: "crate-set" + # Our cluster has three nodes. + replicas: 3 + selector: + matchLabels: + # The pods in this cluster have the `app:crate` app label. 
+ app: crate + template: + metadata: + labels: + app: crate + spec: + # InitContainers run before the main containers of a pod are + # started, and they must terminate before the primary containers + # are initialized. Here, we use one to set the correct memory + # map limit. + initContainers: + - name: init-sysctl + image: busybox + imagePullPolicy: IfNotPresent + command: ["sysctl", "-w", "vm.max_map_count=262144"] + securityContext: + privileged: true + # This final section is the core of the StatefulSet configuration. + # It defines the container to run in each pod. + containers: + - name: crate + # Use the CrateDB 5.1.1 Docker image. + image: crate:5.1.1 + # Pass in configuration to CrateDB via command-line options. + # We set each node's name explicitly, which is + # needed to determine the initial master nodes. Node names are set to + # the name of the pod. + # We are using the SRV records provided by Kubernetes to discover + # nodes within the cluster. + args: + - -Cnode.name=${POD_NAME} + - -Ccluster.name=${CLUSTER_NAME} + - -Ccluster.initial_master_nodes=crate-0,crate-1,crate-2 + - -Cdiscovery.seed_providers=srv + - -Cdiscovery.srv.query=_crate-internal._tcp.crate-internal-service.${NAMESPACE}.svc.cluster.local + - -Cgateway.recover_after_data_nodes=2 + - -Cgateway.expected_data_nodes=${EXPECTED_NODES} + - -Cpath.data=/data + volumeMounts: + # Mount the `/data` directory as a volume named `data`. + - mountPath: /data + name: data + resources: + limits: + # How much memory each pod gets. + memory: 512Mi + ports: + # Port 4300 for inter-node communication. + - containerPort: 4300 + name: crate-internal + # Port 4200 for HTTP clients. + - containerPort: 4200 + name: crate-web + # Port 5432 for PostgreSQL wire protocol clients. + - containerPort: 5432 + name: postgres + # Environment variables passed through to the container. + env: + # This variable is detected by CrateDB. 
+ - name: CRATE_HEAP_SIZE + value: "256m" + # The rest of these variables are used in the command-line + # options. + - name: EXPECTED_NODES + value: "3" + - name: CLUSTER_NAME + value: "my-crate" + - name: POD_NAME + valueFrom: + fieldRef: + fieldPath: metadata.name + - name: NAMESPACE + valueFrom: + fieldRef: + fieldPath: metadata.namespace + volumeClaimTemplates: + # Use persistent storage. + - metadata: + name: data + spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 1Gi +``` + +:::{CAUTION} +If you are not running CrateDB 5.1.1, you must adapt this example +configuration to your specific CrateDB version. +::: + +:::{SEEALSO} +CrateDB supports {ref}`configuration via command-line options +` and {ref}`node discovery via DNS +`. + +Explicitly {ref}`configure heap memory ` for optimum performance. + +You must set memory map limits correctly. Consult the {ref}`bootstrap checks +` documentation for more information. +::: + +### Persistent volume + +As mentioned in the [Controller] section, CrateDB containers must be able to +retain state between restarts and rescheduling. Stateful containers can be +achieved with [persistent volumes]. + +Persistent volumes can be provisioned in many different ways, so the specific +configuration will depend on your setup. 
+ +#### Microsoft Azure + +You can create a [StorageClass] for [Azure Managed Disks] with a +configuration snippet like this: + +```yaml +apiVersion: storage.k8s.io/v1 +kind: StorageClass +metadata: + labels: + addonmanager.kubernetes.io/mode: Reconcile + app.kubernetes.io/managed-by: kube-addon-manager + app.kubernetes.io/name: crate-premium + app.kubernetes.io/part-of: infrastructure + app.kubernetes.io/version: "0.1" + storage-tier: premium + volume-type: ssd + name: crate-premium +parameters: + kind: Managed + storageaccounttype: Premium_LRS +provisioner: kubernetes.io/azure-disk +reclaimPolicy: Delete +volumeBindingMode: Immediate +``` + +You can then reference this StorageClass in your controller configuration like this: + +```yaml +[...] + volumeClaimTemplates: + - metadata: + name: persistant-data + spec: + # This will create one 100 GB read-write Azure Managed Disks volume + # for every CrateDB pod. + accessModes: [ "ReadWriteOnce" ] + storageClassName: crate-premium + resources: + requests: + storage: 100G +``` + +[azure managed disks]: https://azure.microsoft.com/en-us/pricing/details/managed-disks/ +[configuration]: https://kubernetes.io/docs/concepts/configuration/overview/ +[containerization]: https://www.docker.com/resources/what-container +[cratedb docker image]: https://hub.docker.com/_/crate/ +[docker]: https://www.docker.com/ +[horizontally scalable]: https://en.wikipedia.org/wiki/Scalability#Horizontal_(scale_out)_and_vertical_scaling_(scale_up) +[ingress]: https://kubernetes.io/docs/concepts/services-networking/ingress/ +[kubernetes]: https://kubernetes.io/ +[loadbalancer]: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer +[managed]: https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/ +[minikube]: https://kubernetes.io/docs/setup/minikube/ +[persistent volume]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/ +[persistent volumes]:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/ +[pod]: https://kubernetes.io/docs/concepts/workloads/pods/ +[rolling update strategy]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#rolling-updates +[service]: https://kubernetes.io/docs/concepts/services-networking/service/ +[services]: https://kubernetes.io/docs/concepts/services-networking/service/ +[setting up your first cratedb cluster on kubernetes]: https://cratedb.com/blog/run-your-first-cratedb-cluster-on-kubernetes-part-one +[shared-nothing architecture]: https://en.wikipedia.org/wiki/Shared-nothing_architecture +[statefulset]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ +[storageclass]: https://kubernetes.io/docs/concepts/storage/storage-classes/ diff --git a/docs/install/container/kubernetes/kubernetes.rst b/docs/install/container/kubernetes/kubernetes.rst deleted file mode 100644 index 2fb47b5b..00000000 --- a/docs/install/container/kubernetes/kubernetes.rst +++ /dev/null @@ -1,387 +0,0 @@ -.. _cratedb-kubernetes: - -====================== -CrateDB and Kubernetes -====================== - -.. div:: sd-text-muted - - CrateDB and Kubernetes are a great match. - -CrateDB’s `horizontally scalable`_ `shared-nothing architecture` lends itself -well to `containerization`_. - -`Kubernetes`_ is an open-source container orchestration system for the -management, deployment, and scaling of containerized systems. - -Together, Docker and Kubernetes are a fantastic way to deploy and scale CrateDB. - -.. NOTE:: - - While Kubernetes works with a variety of container technologies, this - document only covers its use with Docker. - -.. SEEALSO:: - - A complimentary blog post miniseries that walks you through the process of - `setting up your first CrateDB cluster on Kubernetes`_. - - A lower-level introduction to :ref:`running CrateDB on Docker `. - - A guide to :ref:`scaling CrateDB on Kubernetes `. - - The official `CrateDB Docker image`_. 
- - -Managing Kubernetes -=================== - -Kubernetes deployments can be `managed`_ in many different ways. Which one -makes sense for you will depend on your situation. -On Kubernetes, CrateDB is managed as a resource. The following commands -will give you an idea of how. - -You can create a resource like so: - -.. code-block:: console - - sh$ kubectl create -f crate-controller.yaml --namespace crate - statefulset.apps/crate-controller created - -Here, we are creating a `StatefulSet`_ controller in the ``crate`` namespace -using a configuration file named ``crate-controller.yaml``. - -You can update the resource after editing the configuration file, like so: - -.. code-block:: console - - sh$ kubectl replace -f crate-controller.yaml --namespace crate - statefulset.apps/crate replaced - -If your StatefulSet uses the default `rolling update strategy`_, this command will -restart your pods with the new configuration one-by-one. - -.. WARNING:: - - If you use a regular ``replace`` command, pods are restarted, and any - `persistent volumes`_ will still be intact. - - If, however, you pass the ``--force`` option to the ``replace`` command, - resources are deleted and recreated, and the pods will come back up with no - data. - - -Configuration -============= - -A range of Kubernetes `configuration`_ snippets that can be -used to create a three-node CrateDB cluster. - - -Services --------- - -A Kubernetes pod is ephemeral and so are its network addresses. Typically, this -means that it is inadvisable to connect to pods directly. - -A Kubernetes `service`_ allows you to define a network access policy for a set -of pods. You can then use the network address of the service to communicate -with the pods. The network address of the service remains static even though the -constituent pods may come and go. - -For our purposes, we define two services: an `internal service`_ and an -`external service`_. - - -Internal service -................ 
- -CrateDB uses the internal service for :ref:`node discovery via DNS -` and :ref:`inter-node communication -`. - -Here's an example configuration snippet: - -.. code-block:: yaml - - kind: Service - apiVersion: v1 - metadata: - name: crate-internal-service - labels: - app: crate - spec: - # A static IP address is assigned to this service. This IP address is - # only reachable from within the Kubernetes cluster. - type: ClusterIP - ports: - # Port 4300 for inter-node communication. - - port: 4300 - name: crate-internal - selector: - # Apply this to all nodes with the `app:crate` label. - app: crate - - -External service -................ - -The external service provides a stable network address for external clients. - -Here's an example configuration snippet: - -.. code-block:: yaml - - kind: Service - apiVersion: v1 - metadata: - name: crate-external-service - labels: - app: crate - spec: - # Create an externally reachable load balancer. - type: LoadBalancer - ports: - # Port 4200 for HTTP clients. - - port: 4200 - name: crate-web - # Port 5432 for PostgreSQL wire protocol clients. - - port: 5432 - name: postgres - selector: - # Apply this to all nodes with the `app:crate` label. - app: crate - -.. NOTE:: - - In production, a `LoadBalancer`_ service type is typically only available on - hosted cloud platforms that provide externally managed load balancers. - However, an `ingress`_ resource can be used to provide internally managed - load balancers. - - For local development, `Minikube`_ provides a LoadBalancer service. - - -Controller ----------- - -A Kubernetes `pod`_ is a group of one or more containers. Pods are designed to -provide discrete units of functionality. - -CrateDB nodes are self-contained, so we don't need to use more than one -container in a pod. We can configure our pods as a single container running -CrateDB. - -Pods are designed to be fungible computing units, meaning they can be created or -destroyed at will. 
This, in turn, means that: - -- A cluster can be scaled in or out by destroying or creating pods - -- A cluster can be healed by replacing pods - -- A cluster can be rebalanced by rescheduling pods (i.e., destroying the pod on - one Kubernetes node and recreating it on a new node) - -However, CrateDB nodes that leave and then want to rejoin a cluster must retain -their state. That is, they must continue to use the same name and must continue -to use the same data on disk. - -For this reason, we use the `StatefulSet`_ controller to define our cluster, -which ensures that CrateDB nodes retain state across restarts or rescheduling. - -The following configuration snippet defines a controller for a three-node -CrateDB 5.1.1 cluster: - -.. code-block:: yaml - - kind: StatefulSet - apiVersion: "apps/v1" - metadata: - # This is the name used as a prefix for all pods in the set. - name: crate - spec: - serviceName: "crate-set" - # Our cluster has three nodes. - replicas: 3 - selector: - matchLabels: - # The pods in this cluster have the `app:crate` app label. - app: crate - template: - metadata: - labels: - app: crate - spec: - # InitContainers run before the main containers of a pod are - # started, and they must terminate before the primary containers - # are initialized. Here, we use one to set the correct memory - # map limit. - initContainers: - - name: init-sysctl - image: busybox - imagePullPolicy: IfNotPresent - command: ["sysctl", "-w", "vm.max_map_count=262144"] - securityContext: - privileged: true - # This final section is the core of the StatefulSet configuration. - # It defines the container to run in each pod. - containers: - - name: crate - # Use the CrateDB 5.1.1 Docker image. - image: crate:5.1.1 - # Pass in configuration to CrateDB via command-line options. - # We are setting the name of the node's explicitly, which is - # needed to determine the initial master nodes. These are set to - # the name of the pod. 
- # We are using the SRV records provided by Kubernetes to discover - # nodes within the cluster. - args: - - -Cnode.name=${POD_NAME} - - -Ccluster.name=${CLUSTER_NAME} - - -Ccluster.initial_master_nodes=crate-0,crate-1,crate-2 - - -Cdiscovery.seed_providers=srv - - -Cdiscovery.srv.query=_crate-internal._tcp.crate-internal-service.${NAMESPACE}.svc.cluster.local - - -Cgateway.recover_after_data_nodes=2 - - -Cgateway.expected_data_nodes=${EXPECTED_NODES} - - -Cpath.data=/data - volumeMounts: - # Mount the `/data` directory as a volume named `data`. - - mountPath: /data - name: data - resources: - limits: - # How much memory each pod gets. - memory: 512Mi - ports: - # Port 4300 for inter-node communication. - - containerPort: 4300 - name: crate-internal - # Port 4200 for HTTP clients. - - containerPort: 4200 - name: crate-web - # Port 5432 for PostgreSQL wire protocol clients. - - containerPort: 5432 - name: postgres - # Environment variables passed through to the container. - env: - # This is variable is detected by CrateDB. - - name: CRATE_HEAP_SIZE - value: "256m" - # The rest of these variables are used in the command-line - # options. - - name: EXPECTED_NODES - value: "3" - - name: CLUSTER_NAME - value: "my-crate" - - name: POD_NAME - valueFrom: - fieldRef: - fieldPath: metadata.name - - name: NAMESPACE - valueFrom: - fieldRef: - fieldPath: metadata.namespace - volumeClaimTemplates: - # Use persistent storage. - - metadata: - name: data - spec: - accessModes: - - ReadWriteOnce - resources: - requests: - storage: 1Gi - -.. CAUTION:: - - If you are not running CrateDB 5.1.1, you must adapt this example - configuration to your specific CrateDB version. - -.. SEEALSO:: - - CrateDB supports :ref:`configuration via command-line options - ` and :ref:`node discovery via DNS - `. - - Explicitly :ref:`configure heap memory ` for optimum performance. - - You must set memory map limits correctly. Consult the :ref:`bootstrap checks - ` documentation for more information. 
- - -Persistent volume ------------------ - -As mentioned in the `Controller`_ section, CrateDB containers must be able to -retain state between restarts and rescheduling. Stateful containers can be -achieved with `persistent volumes`_. - -Persistent volumes can be provisioned in many different ways, so the specific -configuration will depend on your setup. - - -Microsoft Azure -............... - -You can create a `StorageClass`_ for `Azure Managed Disks`_ with a -configuration snippet like this: - -.. code-block:: yaml - - apiVersion: storage.k8s.io/v1 - kind: StorageClass - metadata: - labels: - addonmanager.kubernetes.io/mode: Reconcile - app.kubernetes.io/managed-by: kube-addon-manager - app.kubernetes.io/name: crate-premium - app.kubernetes.io/part-of: infrastructure - app.kubernetes.io/version: "0.1" - storage-tier: premium - volume-type: ssd - name: crate-premium - parameters: - kind: Managed - storageaccounttype: Premium_LRS - provisioner: kubernetes.io/azure-disk - reclaimPolicy: Delete - volumeBindingMode: Immediate - -You can then use this in your controller configuration with something like this: - -.. code-block:: yaml - - [...] - volumeClaimTemplates: - - metadata: - name: persistant-data - spec: - # This will create one 100GB read-write Azure Managed Disks volume - # for every CrateDB pod. - accessModes: [ "ReadWriteOnce" ] - storageClassName: crate-premium - resources: - requests: - storage: 100g - -.. _Azure Managed Disks: https://azure.microsoft.com/en-us/pricing/details/managed-disks/ -.. _configuration: https://kubernetes.io/docs/concepts/configuration/overview/ -.. _containerization: https://www.docker.com/resources/what-container -.. _CrateDB Docker image: https://hub.docker.com/_/crate/ -.. _Docker: https://www.docker.com/ -.. _horizontally scalable: https://en.wikipedia.org/wiki/Scalability#Horizontal_(scale_out)_and_vertical_scaling_(scale_up) -.. _Ingress: https://kubernetes.io/docs/concepts/services-networking/ingress/ -.. 
_Kubernetes: https://kubernetes.io/ -.. _LoadBalancer: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer -.. _managed: https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/ -.. _Minikube: https://kubernetes.io/docs/setup/minikube/ -.. _persistent volume: https://kubernetes.io/docs/concepts/storage/persistent-volumes/ -.. _persistent volumes: https://kubernetes.io/docs/concepts/storage/persistent-volumes/ -.. _pod: https://kubernetes.io/docs/concepts/workloads/pods/ -.. _rolling update strategy: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#rolling-updates -.. _service: https://kubernetes.io/docs/concepts/services-networking/service/ -.. _services: https://kubernetes.io/docs/concepts/services-networking/service/ -.. _setting up your first CrateDB cluster on Kubernetes: https://cratedb.com/blog/run-your-first-cratedb-cluster-on-kubernetes-part-one -.. _shared-nothing architecture : https://en.wikipedia.org/wiki/Shared-nothing_architecture -.. _StatefulSet: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ -.. _StorageClass: https://kubernetes.io/docs/concepts/storage/storage-classes/ diff --git a/docs/install/debian-ubuntu.md b/docs/install/debian-ubuntu.md new file mode 100644 index 00000000..aa0ed9d4 --- /dev/null +++ b/docs/install/debian-ubuntu.md @@ -0,0 +1,90 @@ +```{highlight} bash +``` + +(debian)= + +(ubuntu)= + +(install-deb)= + +(install-debian)= + +(install-ubuntu)= + +# CrateDB on Debian, Ubuntu, and Derivates + +Install CrateDB [deb] packages using the [apt] package manager. + +This installation method is suitable for Debian systems and derivates +like Ubuntu. + +## Configure package repository + +You will need to configure your system to register with and trust packages from +the CrateDB package repository: + +``` +# Install prerequisites. 
+sudo apt update +sudo apt install --yes apt-transport-https apt-utils curl gnupg lsb-release + +# Import the public GPG key for verifying the package signatures. +curl -sS https://cdn.crate.io/downloads/debian/DEB-GPG-KEY-crate | \ + sudo tee /etc/apt/trusted.gpg.d/cratedb.asc + +# Add CrateDB repository to Apt +echo "deb https://cdn.crate.io/downloads/debian/stable/ default main" | \ + sudo tee /etc/apt/sources.list.d/crate-stable.list +``` + +:::{NOTE} +CrateDB provides two repositories. A *stable* and a *testing* repository. To use +the testing repository, replace `stable` with `testing` in the command +above. You can read more about our [release workflow]. +::: + +Now, update the package sources: + +``` +sh$ sudo apt update +``` + +You should see a success message. This indicates that the CrateDB package +repository is correctly registered. + +## Install CrateDB + +With everything set up, you can install CrateDB: + +``` +sh$ sudo apt install crate +``` + +After the installation is finished, you can start the `crate` service: + +``` +sh$ sudo systemctl start crate +``` + +Once the service is up and running, you can access CrateDB by visiting: + +``` +http://localhost:4200/ +``` + +## Configure CrateDB + +Please visit the {ref}`install-configure` documentation section to learn +about the location and meaning of CrateDB's configuration files. + +```{eval-rst} +.. include:: _control-linux.rst +``` + +```{eval-rst} +.. include:: _post-install.rst +``` + +[apt]: https://en.wikipedia.org/wiki/APT_(software) +[deb]: https://en.wikipedia.org/wiki/Deb_(file_format) +[release workflow]: https://github.com/crate/crate/blob/master/devs/docs/release.rst diff --git a/docs/install/debian-ubuntu.rst b/docs/install/debian-ubuntu.rst deleted file mode 100644 index 6458e822..00000000 --- a/docs/install/debian-ubuntu.rst +++ /dev/null @@ -1,77 +0,0 @@ -.. highlight:: bash - -.. _debian: -.. _ubuntu: -.. _install-deb: -.. _install-debian: -.. 
_install-ubuntu: - -######################################## -CrateDB on Debian, Ubuntu, and Derivates -######################################## - -Install CrateDB deb_ packages using the apt_ package manager. - -This installation method is suitable for Debian systems and derivates -like Ubuntu. - -Configure package repository -============================ - -You will need to configure your system to register with and trust packages from -the CrateDB package repository:: - - # Install prerequisites. - sudo apt update - sudo apt install --yes apt-transport-https apt-utils curl gnupg lsb-release - - # Import the public GPG key for verifying the package signatures. - curl -sS https://cdn.crate.io/downloads/debian/DEB-GPG-KEY-crate | \ - sudo tee /etc/apt/trusted.gpg.d/cratedb.asc - - # Add CrateDB repository to Apt - echo "deb https://cdn.crate.io/downloads/debian/stable/ default main" | \ - sudo tee /etc/apt/sources.list.d/crate-stable.list - -.. NOTE:: - - CrateDB provides two repositories. A *stable* and a *testing* repository. To use - the testing repository, replace ``stable`` with ``testing`` in the command - above. You can read more about our `release workflow`_. - -Now, update the package sources:: - - sh$ sudo apt update - -You should see a success message. This indicates that the CrateDB package -repository is correctly registered. - -Install CrateDB -=============== - -With everything set up, you can install CrateDB:: - - sh$ sudo apt install crate - -After the installation is finished, you can start the ``crate`` service:: - - sh$ sudo systemctl start crate - -Once the service is up and running, you can access CrateDB by visiting:: - - http://localhost:4200/ - - -Configure CrateDB -================= - -Please visit the :ref:`install-configure` documentation section to learn -about the location and meaning of CrateDB's configuration files. - - -.. include:: _control-linux.rst -.. include:: _post-install.rst - -.. 
_apt: https://en.wikipedia.org/wiki/APT_(software) -.. _deb: https://en.wikipedia.org/wiki/Deb_(file_format) -.. _release workflow: https://github.com/crate/crate/blob/master/devs/docs/release.rst diff --git a/docs/install/index.rst b/docs/install/index.md similarity index 62% rename from docs/install/index.rst rename to docs/install/index.md index 9da2ca39..aceaa038 100644 --- a/docs/install/index.rst +++ b/docs/install/index.md @@ -1,40 +1,44 @@ -.. _install: +(install)= -####### -Install -####### +# Install +```{eval-rst} .. div:: sd-text-muted Install CrateDB on different operating systems and environments, for on-premises and development operations. +``` -.. toctree:: - :maxdepth: 3 - :hidden: +```{toctree} +:hidden: true +:maxdepth: 3 - Debian, Ubuntu - Red Hat, SUSE - Windows - Tarball +Debian, Ubuntu +Red Hat, SUSE +Windows +Tarball - container/index - cloud/index +container/index +cloud/index - configure +configure +``` +% Layout stolen from Streamlink. -.. Layout stolen from Streamlink. -.. https://github.com/streamlink/streamlink/blob/master/docs/install.rst?plain=1 +% https://github.com/streamlink/streamlink/blob/master/docs/install.rst?plain=1 -.. Icons from sphinx{design}. -.. https://sphinx-design.readthedocs.io/en/latest/badges_buttons.html#inline-icons -.. https://fontawesome.com/icons/ +% Icons from sphinx{design}. -.. sphinx-design currently doesn't support autosectionlabel, so set labels for -.. the following sections explicitly +% https://sphinx-design.readthedocs.io/en/latest/badges_buttons.html#inline-icons +% https://fontawesome.com/icons/ +% sphinx-design currently doesn't support autosectionlabel, so set labels for + +% the following sections explicitly + +```{eval-rst} .. 
grid:: 2 2 2 4 :padding: 0 :class-container: installation-grid @@ -138,36 +142,34 @@ Install :octicon:`gear` +``` -We recommend to use the package-based installation methods for :ref:`install-deb` and -:ref:`install-rpm`, by subscribing to the corresponding package release channels. - -Alternatively, you can also do an :ref:`install-tarball`. +We recommend to use the package-based installation methods for {ref}`install-deb` and +{ref}`install-rpm`, by subscribing to the corresponding package release channels. +Alternatively, you can also do an {ref}`install-tarball`. -***** -Notes -***** +## Notes After the installation is finished, the CrateDB service should be up and -running, and will run a HTTP server on ``localhost:4200``. To access the -:ref:`Admin UI ` from your local machine, navigate -to:: - - http://localhost:4200/ - -.. note:: - - CrateDB requires a `Java virtual machine`_ to run. - - - Starting with CrateDB 4.2, Java is bundled with CrateDB, and no extra - installation is necessary. - - - CrateDB versions before 4.2 required a separate Java installation. For - CrateDB 3.0 to 4.1, Java 11 is the minimum requirement. CrateDB versions - before 3.0 require Java 8. We recommend to use OpenJDK_ on Linux Systems. - - -.. _Java virtual machine: https://en.wikipedia.org/wiki/Java_virtual_machine -.. _OpenJDK: https://openjdk.java.net/projects/jdk/ -.. _Other releases of CrateDB: https://cdn.crate.io/downloads/releases/ +running, and will run a HTTP server on `localhost:4200`. To access the +{ref}`Admin UI ` from your local machine, navigate +to: + +``` +http://localhost:4200/ +``` + +:::{note} +CrateDB requires a [Java virtual machine] to run. + +- Starting with CrateDB 4.2, Java is bundled with CrateDB, and no extra + installation is necessary. +- CrateDB versions before 4.2 required a separate Java installation. For + CrateDB 3.0 to 4.1, Java 11 is the minimum requirement. CrateDB versions + before 3.0 require Java 8. 
We recommend to use [OpenJDK] on Linux Systems. +::: + +[java virtual machine]: https://en.wikipedia.org/wiki/Java_virtual_machine +[openjdk]: https://openjdk.java.net/projects/jdk/ +[other releases of cratedb]: https://cdn.crate.io/downloads/releases/ diff --git a/docs/install/redhat.md b/docs/install/redhat.md new file mode 100644 index 00000000..193c7013 --- /dev/null +++ b/docs/install/redhat.md @@ -0,0 +1,102 @@ +```{highlight} bash +``` + +(red-hat)= + +(install-rpm)= + +(install-redhat)= + +(install-suse)= + +# CrateDB on Red Hat, SUSE, and Derivates + +Install CrateDB [RPM] packages using the [DNF], [YUM], or [ZYpp] package managers. + +This installation method is suitable for RedHat Enterprise Linux (RHEL) and compatible +systems like Fedora, CentOS, Rocky Linux, AlmaLinux, AWS Linux, Oracle Linux, or +Scientific Linux. Installation also works on openSUSE and SUSE Linux Enterprise Server +(SLES) systems. + +## Configure package repository + +To register with the CrateDB package repository, create a file called `cratedb.repo` +in the `/etc/yum.repos.d/` directory for RedHat based distributions, or in the +`/etc/zypp/repos.d/` directory for OpenSuSE based distributions, containing: + +``` +[cratedb-ce-stable] +name=CrateDB RPM package repository - $basearch - Stable +baseurl=https://cdn.crate.io/downloads/yum/7/$basearch +enabled=0 +gpgcheck=1 +gpgkey=https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate +autorefresh=1 +type=rpm-md + +[cratedb-ce-testing] +name=CrateDB RPM package repository - $basearch - Testing +baseurl=https://cdn.crate.io/downloads/yum/testing/7/$basearch +enabled=0 +gpgcheck=1 +gpgkey=https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate +autorefresh=1 +type=rpm-md +``` + +:::{NOTE} +The configured repository is disabled by default. This eliminates the +possibility of accidentally upgrading CrateDB when upgrading the rest +of the system. 
Each install or upgrade command must explicitly enable +the repository as indicated in the sample installation command below. +::: + +CrateDB provides both *stable release* and *testing release* channels. You +can read more about the [release workflow]. + +## Install CrateDB + +With everything set up, you can install CrateDB: + +``` +sudo dnf install --enablerepo=cratedb-ce-stable crate +``` + +:::{TIP} +On older Red Hat and CentOS installations, please use the `yum` command +instead of `dnf`. On SUSE-based installations, please use the `zypper` +command. +::: + +## Configure CrateDB + +Please visit the {ref}`install-configure` documentation section to learn +about the location and meaning of CrateDB's configuration files. + +## Trust signing key + +In order to trust the package signing key upfront, before being prompted +to do it on the first installation of CrateDB, you can also import it +into your repository keyring, like this: + +``` +# Install prerequisites. +yum install sudo + +# Import the public GPG key for verifying the package signatures. +sudo rpm --import https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate +``` + +```{eval-rst} +.. include:: _control-linux.rst +``` + +```{eval-rst} +.. include:: _post-install.rst +``` + +[dnf]: https://en.wikipedia.org/wiki/DNF_(software) +[release workflow]: https://github.com/crate/crate/blob/master/devs/docs/release.rst +[rpm]: https://en.wikipedia.org/wiki/RPM_Package_Manager +[yum]: https://en.wikipedia.org/wiki/Yum_(software) +[zypp]: https://en.wikipedia.org/wiki/ZYpp diff --git a/docs/install/redhat.rst b/docs/install/redhat.rst deleted file mode 100644 index f57afc51..00000000 --- a/docs/install/redhat.rst +++ /dev/null @@ -1,97 +0,0 @@ -.. highlight:: bash - -.. _red-hat: -.. _install-rpm: -.. _install-redhat: -..
_install-suse: - -####################################### -CrateDB on Red Hat, SUSE, and Derivates -####################################### - -Install CrateDB RPM_ packages using the DNF_, YUM_, or ZYpp_ package managers. - -This installation method is suitable for RedHat Enterprise Linux (RHEL) and compatible -systems like Fedora, CentOS, Rocky Linux, AlmaLinux, AWS Linux, Oracle Linux, or -Scientific Linux. Installation also works on openSUSE and SUSE Linux Enterprise Server -(SLES) systems. - -Configure package repository -============================ - -To register with the CrateDB package repository, create a file called ``cratedb.repo`` -in the ``/etc/yum.repos.d/`` directory for RedHat based distributions, or in the -``/etc/zypp/repos.d/`` directory for OpenSuSE based distributions, containing:: - - [cratedb-ce-stable] - name=CrateDB RPM package repository - $basearch - Stable - baseurl=https://cdn.crate.io/downloads/yum/7/$basearch - enabled=0 - gpgcheck=1 - gpgkey=https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate - autorefresh=1 - type=rpm-md - - [cratedb-ce-testing] - name=CrateDB RPM package repository - $basearch - Testing - baseurl=https://cdn.crate.io/downloads/yum/testing/7/$basearch - enabled=0 - gpgcheck=1 - gpgkey=https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate - autorefresh=1 - type=rpm-md - -.. NOTE:: - - The configured repository is disabled by default. This eliminates the - possibility of accidentally upgrading CrateDB when upgrading the rest - of the system. Each install or upgrade command must explicitly enable - the repository as indicated in the sample installation command below. - -CrateDB provides both *stable release* and *testing release* channels. You -can read more about the `release workflow`_. - - -Install CrateDB -=============== - -With everything set up, you can install CrateDB:: - - sudo dnf install --enablerepo=cratedb-ce-stable crate - -.. 
TIP:: - - On older Red Hat and CentOS installations, please use the ``yum`` command - instead of ``dnf``. On SUSE based installations, please use the ``zypper`` - command. - - -Configure CrateDB -================= - -Please visit the :ref:`install-configure` documentation section to learn -about the location and meaning of CrateDB's configuration files. - - -Trust signing key -================= - -In order to trust the package signing key upfront, before being prompted -to do it on the first installation of CrateDB, you can also import it -into your repository keyring, like that:: - - # Install prerequisites. - yum install sudo - - # Import the public GPG key for verifying the package signatures. - sudo rpm --import https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate - - -.. include:: _control-linux.rst -.. include:: _post-install.rst - -.. _DNF: https://en.wikipedia.org/wiki/DNF_(software) -.. _release workflow: https://github.com/crate/crate/blob/master/devs/docs/release.rst -.. _RPM: https://en.wikipedia.org/wiki/RPM_Package_Manager -.. _YUM: https://en.wikipedia.org/wiki/Yum_(software) -.. _ZYpp: https://en.wikipedia.org/wiki/ZYpp diff --git a/docs/install/tarball.md b/docs/install/tarball.md new file mode 100644 index 00000000..9e777ec1 --- /dev/null +++ b/docs/install/tarball.md @@ -0,0 +1,74 @@ +(install-tarball)= + +(install-macos)= + +# Installation from Tarball Archive + +```{eval-rst} +.. div:: sd-text-muted + + How to use the release archives to install CrateDB. +``` + +The walkthrough is suitable to install and run CrateDB on +[Unix-like] systems, for example Linux and macOS. + +:::{CAUTION} +You may experience performance issues when running releases from the public +archive on ARM-based macOS systems. For improved performance, we recommend +manually building CrateDB suited for ARM-based macOS. Detailed instructions +can be found in our [manual build guide]. +::: + +1. Download the latest [CrateDB release archive]. 
Please make sure to select + the right release archive matching your system. + +2. Once downloaded, extract the archive either using your favorite terminal or + command line shell or by using a GUI tool like [7-Zip]: + + ``` + # Extract tarball on Unix-like systems + tar -xzf crate-*.tar.gz + ``` + +3. On the terminal, change into the extracted `crate` directory: + + ``` + cd crate-* + ``` + +4. Run a CrateDB single-node instance on the local network interface: + + ``` + ./bin/crate + ``` + +:::{NOTE} +When running a specific version of CrateDB from tarball on a macOS +system for the first time, it is possible that you will encounter an error +like: **"java" cannot be opened because developer cannot be verified.** + +This is expected and can be fixed in your system settings: +: - Navigate to **System Preferences** -> **Security and Privacy** + - On the page you will see an **Allow Anyway** button for "java" + - After confirming, run the `/bin/crate` command again. You will be + asked to confirm once more with **Open** button. After that CrateDB + will run as expected. +::: + +5. In order to stop CrateDB again, use {kbd}`ctrl-c`. + +:::{SEEALSO} +Consult the {ref}`crate-reference:cli` documentation for further information +about the `./bin/crate` command. +::: + +```{eval-rst} +.. include:: _post-install.rst + +``` + +[7-zip]: https://www.7-zip.org/ +[cratedb release archive]: https://cdn.crate.io/downloads/releases/cratedb/ +[manual build guide]: https://github.com/crate/crate/blob/master/devs/docs/basics.rst +[unix-like]: https://en.wikipedia.org/wiki/Unix-like diff --git a/docs/install/tarball.rst b/docs/install/tarball.rst deleted file mode 100644 index 4fe2c414..00000000 --- a/docs/install/tarball.rst +++ /dev/null @@ -1,66 +0,0 @@ -.. _install-tarball: -.. _install-macos: - -################################# -Installation from Tarball Archive -################################# - -.. div:: sd-text-muted - - How to use the release archives to install CrateDB. 
- -The walkthrough is suitable to install and run CrateDB on -`Unix-like`_ systems, for example Linux and macOS. - -.. CAUTION:: - - You may experience performance issues when running releases from the public - archive on ARM-based macOS systems. For improved performance, we recommend - manually building CrateDB suited for ARM-based macOS. Detailed instructions - can be found in our `manual build guide`_. - -1. Download the latest `CrateDB release archive`_. Please make sure to select - the right release archive matching your system. - -2. Once downloaded, extract the archive either using your favorite terminal or - command line shell or by using a GUI tool like `7-Zip`_:: - - # Extract tarball on Unix-like systems - tar -xzf crate-*.tar.gz - -3. On the terminal, change into the extracted ``crate`` directory:: - - cd crate-* - -4. Run a CrateDB single-node instance on the local network interface:: - - ./bin/crate - -.. NOTE:: - - When running a specific version of CrateDB from tarball on a macOS - system for the first time, it is possible that you will encounter an error - like: **"java" cannot be opened because developer cannot be verified.** - - This is expected and can be fixed in your system settings: - - Navigate to **System Preferences** -> **Security and Privacy** - - On the page you will see an **Allow Anyway** button for "java" - - After confirming, run the ``/bin/crate`` command again. You will be - asked to confirm once more with **Open** button. After that CrateDB - will run as expected. - -5. In order to stop CrateDB again, use :kbd:`ctrl-c`. - -.. SEEALSO:: - - Consult the :ref:`crate-reference:cli` documentation for further information - about the ``./bin/crate`` command. - - -.. include:: _post-install.rst - - -.. _7-Zip: https://www.7-zip.org/ -.. _CrateDB release archive: https://cdn.crate.io/downloads/releases/cratedb/ -.. _manual build guide: https://github.com/crate/crate/blob/master/devs/docs/basics.rst -.. 
_Unix-like: https://en.wikipedia.org/wiki/Unix-like diff --git a/docs/install/windows.md b/docs/install/windows.md new file mode 100644 index 00000000..0454af91 --- /dev/null +++ b/docs/install/windows.md @@ -0,0 +1,84 @@ +```{highlight} bash +``` + +(install-windows)= + +# Running CrateDB on Windows + +How to use the release archives to run CrateDB on Microsoft Windows. + +:::{CAUTION} +We do not officially support CrateDB on Windows for production use. If +you would like to deploy CrateDB on Windows, please feel free to [contact +us][contact us] so we can work with you on a solution. +::: + +1. Download the latest [CrateDB release archive] for Windows. + +2. Once downloaded, extract the archive either using your favorite terminal or + command-line shell or by using a GUI tool like [7-Zip]. We recommend + using [PowerShell] when using terminal: + + ``` + # Extract Zip archive + unzip -o crate-*.zip + ``` + +3. On the terminal, change into the extracted `crate` directory: + + ``` + cd crate-* + ``` + +4. Run a CrateDB single-node instance on the local network interface: + + ``` + ./bin/crate + ``` + +5. You will be notified by an INFO message similar to this, when your + single-node cluster is started successfully: + + ``` + [2022-07-04T19:41:12,340][INFO ][o.e.n.Node] [Aiguille Verte] started + ``` + +6. In order to stop CrateDB again, use {kbd}`ctrl-c`. You will be asked to + terminate the job. Input {kbd}`Y`: + + ``` + Terminate batch job (Y/N)? Y + ``` + +:::{SEEALSO} +Consult the {ref}`crate-reference:cli` documentation for further information +about the `./bin/crate` command. +::: + +:::{NOTE} +If you are installing CrateDB on a recent [Windows Server] edition, +setting up the latest *Microsoft Visual C++ 2019 Redistributable* package +is required. You can download it at [msvcrt x86-64], [msvcrt x86-32] or [msvcrt ARM64]. + +Within the terminal, as a Windows user, the prompt after +[starting PowerShell] will look like this. 
+ +```doscon +PS> ./bin/crate +``` +::: + +```{eval-rst} +.. include:: _post-install.rst + +``` + +[7-zip]: https://www.7-zip.org/ +[contact us]: https://cratedb.com/contact/ +[cratedb release archive]: https://cdn.crate.io/downloads/releases/cratedb/x64_windows/ +[msvcrt arm64]: https://aka.ms/vs/16/release/VC_redist.arm64.exe +[msvcrt x86-32]: https://aka.ms/vs/16/release/vc_redist.x86.exe +[msvcrt x86-64]: https://aka.ms/vs/16/release/vc_redist.x64.exe +[powershell]: https://docs.microsoft.com/en-us/powershell/scripting/overview?view=powershell-7.2 +[starting powershell]: https://learn.microsoft.com/en-us/powershell/scripting/learn/ps101/01-getting-started?view=powershell-7.4#how-to-launch-powershell +[windows server]: https://www.microsoft.com/en-us/windows-server diff --git a/docs/install/windows.rst b/docs/install/windows.rst deleted file mode 100644 index 38a72f36..00000000 --- a/docs/install/windows.rst +++ /dev/null @@ -1,74 +0,0 @@ -.. highlight:: bash - -.. _install-windows: - -########################## -Running CrateDB on Windows -########################## - -How to use the release archives to run CrateDB on Microsoft Windows. - -.. CAUTION:: - - We do not officially support CrateDB on Windows for production use. If - you would like to deploy CrateDB on Windows, please feel free to `contact - us`_ so we can work with you on a solution. - -#. Download the latest `CrateDB release archive`_ for Windows. - -#. Once downloaded, extract the archive either using your favorite terminal or - command-line shell or by using a GUI tool like `7-Zip`_. We recommend - using `PowerShell`_ when using terminal:: - - # Extract Zip archive - unzip -o crate-*.zip - -#. On the terminal, change into the extracted ``crate`` directory:: - - cd crate-* - -#. Run a CrateDB single-node instance on the local network interface:: - - ./bin/crate - -#. 
You will be notified by an INFO message similar to this, when your - single-node cluster is started successfully:: - - [2022-07-04T19:41:12,340][INFO ][o.e.n.Node] [Aiguille Verte] started - -#. In order to stop CrateDB again, use :kbd:`ctrl-c`. You will be asked to - terminate the job. Input :kbd:`Y`:: - - Terminate batch job (Y/N)? Y - -.. SEEALSO:: - - Consult the :ref:`crate-reference:cli` documentation for further information - about the ``./bin/crate`` command. - - -.. NOTE:: - - If you are installing CrateDB on a recent `Windows Server`_ edition, - setting up the latest *Microsoft Visual C++ 2019 Redistributable* package - is required. You can download it at `msvcrt x86-64`_, `msvcrt x86-32`_ or `msvcrt ARM64`_. - - Within the terminal, as a Windows user, the prompt after - `starting PowerShell`_ will look like this. - - .. code-block:: doscon - - PS> ./bin/crate - -.. include:: _post-install.rst - - -.. _7-Zip: https://www.7-zip.org/ -.. _contact us: https://cratedb.com/contact/ -.. _CrateDB release archive: https://cdn.crate.io/downloads/releases/cratedb/x64_windows/ -.. _msvcrt ARM64: https://aka.ms/vs/16/release/VC_redist.arm64.exe -.. _msvcrt x86-32: https://aka.ms/vs/16/release/vc_redist.x86.exe -.. _msvcrt x86-64: https://aka.ms/vs/16/release/vc_redist.x64.exe -.. _Powershell: https://docs.microsoft.com/en-us/powershell/scripting/overview?view=powershell-7.2 -.. _starting PowerShell: https://learn.microsoft.com/en-us/powershell/scripting/learn/ps101/01-getting-started?view=powershell-7.4#how-to-launch-powershell -.. 
_Windows Server: https://www.microsoft.com/en-us/windows-server From 59a26f12532cf19d884bf0e27e92f4018e115b63 Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Fri, 10 Oct 2025 17:26:15 +0200 Subject: [PATCH 2/6] Install: Fix the build after converting to MyST Markdown --- docs/install/cloud/azure/terraform.md | 2 +- docs/install/debian-ubuntu.md | 11 +++++------ docs/install/redhat.md | 11 +++++------ docs/install/tarball.md | 5 ++--- docs/install/windows.md | 5 ++--- 5 files changed, 15 insertions(+), 19 deletions(-) diff --git a/docs/install/cloud/azure/terraform.md b/docs/install/cloud/azure/terraform.md index 4994992c..10e0f813 100644 --- a/docs/install/cloud/azure/terraform.md +++ b/docs/install/cloud/azure/terraform.md @@ -115,7 +115,7 @@ The Azure-specific variables need to be adjusted according to your environment: | ``location`` | The geographic region in which to create the Azure | ``az account list-locations`` | | | resources | | +---------------+----------+--------------------------------------------------------------+----------------------------------+ -| ``storage_account_type`` | Storage Account Type of the disk containing the CrateDB | `List of Storage Account Types`_ | +| ``storage_account_type`` | Storage Account Type of the disk containing the CrateDB | [List of Storage Account Types] | | | data directory | | +--------------------------+--------------------------------------------------------------+----------------------------------+ | ``size`` | Specifies the size of the VM | ``az vm list-sizes`` | diff --git a/docs/install/debian-ubuntu.md b/docs/install/debian-ubuntu.md index aa0ed9d4..7a0dcc57 100644 --- a/docs/install/debian-ubuntu.md +++ b/docs/install/debian-ubuntu.md @@ -77,13 +77,12 @@ http://localhost:4200/ Please visit the {ref}`install-configure` documentation section to learn about the location and meaning of CrateDB's configuration files. -```{eval-rst} -.. 
include:: _control-linux.rst -``` +:::{include} _control-linux.md +::: + +:::{include} _post-install.md +::: -```{eval-rst} -.. include:: _post-install.rst -``` [apt]: https://en.wikipedia.org/wiki/APT_(software) [deb]: https://en.wikipedia.org/wiki/Deb_(file_format) diff --git a/docs/install/redhat.md b/docs/install/redhat.md index 193c7013..36cd4be8 100644 --- a/docs/install/redhat.md +++ b/docs/install/redhat.md @@ -87,13 +87,12 @@ yum install sudo sudo rpm --import https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate ``` -```{eval-rst} -.. include:: _control-linux.rst -``` +:::{include} _control-linux.md +::: + +:::{include} _post-install.md +::: -```{eval-rst} -.. include:: _post-install.rst -``` [dnf]: https://en.wikipedia.org/wiki/DNF_(software) [release workflow]: https://github.com/crate/crate/blob/master/devs/docs/release.rst diff --git a/docs/install/tarball.md b/docs/install/tarball.md index 9e777ec1..1b9c100d 100644 --- a/docs/install/tarball.md +++ b/docs/install/tarball.md @@ -63,10 +63,9 @@ Consult the {ref}`crate-reference:cli` documentation for further information about the `./bin/crate` command. ::: -```{eval-rst} -.. include:: _post-install.rst +:::{include} _post-install.md +::: -``` [7-zip]: https://www.7-zip.org/ [cratedb release archive]: https://cdn.crate.io/downloads/releases/cratedb/ diff --git a/docs/install/windows.md b/docs/install/windows.md index 0454af91..cfcb0433 100644 --- a/docs/install/windows.md +++ b/docs/install/windows.md @@ -68,10 +68,9 @@ PS> ./bin/crate ``` ::: -```{eval-rst} -.. 
include:: _post-install.rst +:::{include} _post-install.md +::: -``` [7-zip]: https://www.7-zip.org/ [contact us]: https://cratedb.com/contact/ From e539f4b8358644eb42f911b53d2eec9ebc94b69a Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Fri, 10 Oct 2025 19:09:35 +0200 Subject: [PATCH 3/6] Install: Fix formatting after converting to MyST Markdown --- docs/install/_control-linux.md | 6 +++-- docs/install/_post-install.md | 3 ++- docs/install/cloud/aws/ec2-setup.md | 24 +++++++++++++----- docs/install/cloud/azure/terraform.md | 36 +++++++++++++++------------ docs/install/cloud/azure/vm.md | 33 +++++++++++++++++++----- 5 files changed, 71 insertions(+), 31 deletions(-) diff --git a/docs/install/_control-linux.md b/docs/install/_control-linux.md index 0df0c3a9..4251e7a9 100644 --- a/docs/install/_control-linux.md +++ b/docs/install/_control-linux.md @@ -1,4 +1,5 @@ -# Control CrateDB on Linux +:::{rubric} Control CrateDB on Linux +::: You can control the `crate` service with the `systemctl` utility program: @@ -9,7 +10,8 @@ sudo systemctl COMMAND crate Replace `COMMAND` with `start`, `stop`, `restart`, `status` and so on. -# Notes +:::{rubric} Notes +::: After the installation is finished, the `crate` service should be installed, but may not be configured to start automatically. Use the following command to diff --git a/docs/install/_post-install.md b/docs/install/_post-install.md index 12ec3e94..d468df63 100644 --- a/docs/install/_post-install.md +++ b/docs/install/_post-install.md @@ -1,4 +1,5 @@ -# Post-install notes +:::{rubric} Post-install notes +::: After successfully installing CrateDB, for example on your workstation, the web-based Admin UI can be visited at: diff --git a/docs/install/cloud/aws/ec2-setup.md b/docs/install/cloud/aws/ec2-setup.md index d9bb0feb..72d80a06 100644 --- a/docs/install/cloud/aws/ec2-setup.md +++ b/docs/install/cloud/aws/ec2-setup.md @@ -32,12 +32,24 @@ with the same cluster name and form a cluster. 
Once you have your instances running and CrateDB installed, you can enable EC2 discovery: -| CrateDB Version | Reference | Example | -| --------------- | --------- | ------- | -| >=4.x | [latest] | ``` -discovery.seed_providers: ec2 ``` | -| \<=3.x | [3.3] | ``` -discovery.zen.hosts_provider: ec2 ``` | +````{list-table} +--- +header-rows: 1 +--- +* - CrateDB Version + - Reference + - Configuration Example +* - \>=4.x + - [latest] + - ```yaml + discovery.seed_providers: ec2 + ``` +* - <=3.x + - [3.3] + - ```yaml + discovery.zen.hosts_provider: ec2 + ``` +```` To be able to use the EC2 API, CrateDB must [sign the requests] by using AWS credentials consisting of an access key and a secret key. Therefore diff --git a/docs/install/cloud/azure/terraform.md b/docs/install/cloud/azure/terraform.md index 10e0f813..7da754d1 100644 --- a/docs/install/cloud/azure/terraform.md +++ b/docs/install/cloud/azure/terraform.md @@ -105,22 +105,26 @@ output "cratedb" { The Azure-specific variables need to be adjusted according to your environment: -```{eval-rst} -+--------------------------+--------------------------------------------------------------+----------------------------------+ -| Variable | Explanation | How to obtain | -+==========================+==============================================================+==================================+ -| ``subscription_id`` | The ID of the Azure subscription to use for creating the | ``az account list`` | -| | resource group in | | -+---------------+----------+--------------------------------------------------------------+----------------------------------+ -| ``location`` | The geographic region in which to create the Azure | ``az account list-locations`` | -| | resources | | -+---------------+----------+--------------------------------------------------------------+----------------------------------+ -| ``storage_account_type`` | Storage Account Type of the disk containing the CrateDB | [List of Storage Account Types] | -| | data 
directory | | -+--------------------------+--------------------------------------------------------------+----------------------------------+ -| ``size`` | Specifies the size of the VM | ``az vm list-sizes`` | -+--------------------------+--------------------------------------------------------------+----------------------------------+ -``` +````{list-table} +--- +header-rows: 1 +--- +* - Variable + - Explanation + - How to obtain +* - `subscription_id` + - The ID of the Azure subscription to use for creating the resource group in. + - `az account list` +* - `location` + - The geographic region in which to create the Azure resources. + - `az account list-locations` +* - `storage_account_type` + - Storage Account Type of the disk containing the CrateDB data directory. + - [List of Storage Account Types] +* - `size` + - Specifies the size of the VM. + - `az vm list-sizes` +```` ## Execution diff --git a/docs/install/cloud/azure/vm.md b/docs/install/cloud/azure/vm.md index 145a6215..4c8ce038 100644 --- a/docs/install/cloud/azure/vm.md +++ b/docs/install/cloud/azure/vm.md @@ -93,12 +93,33 @@ configuration file at */etc/crate/crate.yml*. 
Uncomment / add these lines: -| CrateDB Version | Reference | Configuration Example | -| --------------- | --------- | --------------------- | -| \<=4.x | [latest] | ```yaml -discovery.seed_hosts: - node1.example.com:4300 - node2.example.com:4300 - 10.0.1.102:4300 - 10.0.1.103:4300 ``` | -| \<=3.x | [3.3] | ```yaml -discovery.zen.ping.unicast.hosts: - node1.example.com:4300 - node2.example.com:4300 - 10.0.1.102:4300 - 10.0.1.103:4300 ``` | + +````{list-table} +--- +header-rows: 1 +--- +* - CrateDB Version + - Reference + - Configuration Example +* - \>=4.x + - [latest] + - ```yaml + discovery.seed_hosts: + - node1.example.com:4300 + - node2.example.com:4300 + - 10.0.1.102:4300 + - 10.0.1.103:4300 + ``` +* - <=3.x + - [3.3] + - ```yaml + discovery.zen.ping.unicast.hosts: + - node1.example.com:4300 + - node2.example.com:4300 + - 10.0.1.102:4300 + - 10.0.1.103:4300 + ``` +```` Note: You might want to try {ref}`DNS based discovery ` for inter-node communication. From c06f3227971bd36f086a33cbefba04a10b0c68fb Mon Sep 17 00:00:00 2001 From: Andreas Motl Date: Fri, 10 Oct 2025 19:09:51 +0200 Subject: [PATCH 4/6] Install: Improve syntax/formatting after converting to MyST Markdown --- docs/install/_control-linux.md | 6 +- docs/install/cloud/aws/aws-terraform-setup.md | 4 +- docs/install/cloud/aws/ec2-setup.md | 13 +- docs/install/cloud/aws/s3-setup.md | 5 +- docs/install/cloud/azure/terraform.md | 4 +- docs/install/cloud/azure/vm.md | 2 +- docs/install/cloud/index.md | 2 +- docs/install/configure.md | 2 +- docs/install/container/docker.md | 25 +- docs/install/container/index.md | 10 +- .../container/kubernetes/kubernetes.md | 8 +- docs/install/debian-ubuntu.md | 21 +- docs/install/index.md | 222 +++++++++--------- docs/install/redhat.md | 16 +- docs/install/tarball.md | 15 +- docs/install/windows.md | 15 +- 16 files changed, 169 insertions(+), 201 deletions(-) diff --git a/docs/install/_control-linux.md b/docs/install/_control-linux.md index 4251e7a9..9731e0f4 100644 --- 
a/docs/install/_control-linux.md +++ b/docs/install/_control-linux.md @@ -3,7 +3,7 @@ You can control the `crate` service with the `systemctl` utility program: -``` +```shell sudo systemctl COMMAND crate ``` @@ -17,12 +17,12 @@ After the installation is finished, the `crate` service should be installed, but may not be configured to start automatically. Use the following command to start CrateDB: -``` +```shell sudo systemctl start crate ``` In order to make the service reboot-safe, invoke: -``` +```shell sudo systemctl enable crate ``` diff --git a/docs/install/cloud/aws/aws-terraform-setup.md b/docs/install/cloud/aws/aws-terraform-setup.md index 4b4496b2..f9b27135 100644 --- a/docs/install/cloud/aws/aws-terraform-setup.md +++ b/docs/install/cloud/aws/aws-terraform-setup.md @@ -1,6 +1,6 @@ (aws-terraform-setup)= -# Running CrateDB via Terraform +# Deploy using Terraform In {ref}`ec2_setup`, we elaborated on how to leverage EC2's functionality to set up a CrateDB cluster. Here, we will explore how to automate this kind of setup. @@ -50,7 +50,7 @@ The CrateDB Terraform configuration consists of a set of variables to customize your deployment. Create a new file `main.tf` with the following content and adjust variable values as needed: -``` +```terraform module "cratedb-cluster" { source = "github.com/crate/crate-terraform.git/aws" diff --git a/docs/install/cloud/aws/ec2-setup.md b/docs/install/cloud/aws/ec2-setup.md index 72d80a06..0acbed1f 100644 --- a/docs/install/cloud/aws/ec2-setup.md +++ b/docs/install/cloud/aws/ec2-setup.md @@ -1,9 +1,6 @@ -```{highlight} yaml -``` - (ec2-setup)= -# Running CrateDB on Amazon EC2 +# CrateDB on Amazon EC2 ## Introduction @@ -60,7 +57,7 @@ CrateDB binds to the loopback interface by default. 
To get EC2 discovery working, you need to update the {ref}`host ` setting, in order to bind to and publish the site-local address: -``` +```yaml network.host: _site_ ``` @@ -134,7 +131,7 @@ security group then, you can easily filter instances by that group. For example, when you launch your instances with the security group `sg-crate-demo`, your CrateDB setting would be: -``` +```yaml discovery.ec2.groups: sg-crate-demo ``` @@ -181,7 +178,7 @@ This way, any number of tags can be used for filtering, using the Filtering by tags can help when you want to launch several CrateDB clusters within the same security group, e.g: -``` +```yaml discovery.ec2: groups: sg-crate-demo tag.env: production @@ -198,7 +195,7 @@ you have several clusters for the same tenant in different availability zones same security group (e.g. `sg-crate-demo`) and filter the instances used for discovery by availability zone: -``` +```yaml discovery.ec2: groups: sg-crate-demo availability_zones: us-west-1 diff --git a/docs/install/cloud/aws/s3-setup.md b/docs/install/cloud/aws/s3-setup.md index 48d58a29..da15a1df 100644 --- a/docs/install/cloud/aws/s3-setup.md +++ b/docs/install/cloud/aws/s3-setup.md @@ -1,6 +1,3 @@ -```{highlight} yaml -``` - (s3-setup)= # Using Amazon S3 as a snapshot repository @@ -15,7 +12,7 @@ Support for *Snapshot* and *Restore* to the [Amazon S3] service is enabled by default in CrateDB. If you need to explicitly turn it off, disable the cloud setting in the `crate.yml` file: -``` +```yaml cloud.enabled: false ``` diff --git a/docs/install/cloud/azure/terraform.md b/docs/install/cloud/azure/terraform.md index 7da754d1..7ae5afdb 100644 --- a/docs/install/cloud/azure/terraform.md +++ b/docs/install/cloud/azure/terraform.md @@ -1,6 +1,6 @@ (azure-terraform-setup)= -# Running CrateDB via Terraform +# Deploy using Terraform In {ref}`azure_vm_setup`, we elaborated on how to leverage Azure's functionality to set up a CrateDB cluster. 
Here, we will explore how to automate this kind of @@ -51,7 +51,7 @@ The CrateDB Terraform configuration consists of a set of variables to customize your deployment. Create a new file `main.tf` with the following content and adjust variable values as needed: -``` +```terraform module "cratedb-cluster" { source = "github.com/crate/crate-terraform.git/azure" diff --git a/docs/install/cloud/azure/vm.md b/docs/install/cloud/azure/vm.md index 4c8ce038..8ce84b4c 100644 --- a/docs/install/cloud/azure/vm.md +++ b/docs/install/cloud/azure/vm.md @@ -1,6 +1,6 @@ (azure-vm-setup)= -# Running CrateDB on Azure VMs +# CrateDB on Azure VMs Getting CrateDB working on Azure with Linux or Windows is a simple process. You can use Azure's management console or CLI interface ([Learn how to install diff --git a/docs/install/cloud/index.md b/docs/install/cloud/index.md index 721a6d47..8e7330dd 100644 --- a/docs/install/cloud/index.md +++ b/docs/install/cloud/index.md @@ -1,6 +1,6 @@ (install-cloud)= -# Cloud Hosting +# Cloud hosting CrateDB provides packages and executables that will work on any operating system capable of running Java. diff --git a/docs/install/configure.md b/docs/install/configure.md index cb1580d0..1b9ad047 100644 --- a/docs/install/configure.md +++ b/docs/install/configure.md @@ -32,7 +32,7 @@ working directory. Here is an example: -``` +```shell # Configure heap size (defaults to 256m min, 1g max). CRATE_HEAP_SIZE=2g diff --git a/docs/install/container/docker.md b/docs/install/container/docker.md index 20782799..b89ec3e9 100644 --- a/docs/install/container/docker.md +++ b/docs/install/container/docker.md @@ -1,6 +1,3 @@ -```{highlight} sh -``` - (cratedb-docker)= # Run CrateDB on Docker @@ -35,7 +32,7 @@ with no vote. To create the [user-defined network], run the command: -``` +```shell sh$ docker network create crate ``` @@ -120,7 +117,7 @@ f79116373877 crate "/docker-entrypoin..." 
16 seconds ago You can have a look at the container's logs in tail mode like this: -```text +```shell sh$ docker logs -f crate01 ``` @@ -139,7 +136,7 @@ page that lists a single node. Now add the second node, `crate02`, to the cluster: -``` +```shell sh$ docker run --rm -d \ --name=crate02 \ --net=crate \ @@ -169,7 +166,7 @@ should see two nodes. You can now add `crate03` like this: -``` +```shell sh$ docker run --rm -d \ --name=crate03 \ --net=crate -p 4203:4200 \ @@ -213,7 +210,7 @@ If the limit cannot be adjusted on the host system, the memory map limit check can be bypassed by passing the `-Cnode.store.allow_mmap=false` option to the `crate` command: -``` +```shell sh$ docker run -d --name=crate01 \ --net=crate -p 4201:4200 --env CRATE_HEAP_SIZE=1g \ crate -Cnetwork.host=_site_ \ @@ -227,7 +224,7 @@ This will result in degraded performance. You can also start a single node without any {ref}`bootstrap checks ` by passing the `-Cdiscovery.type=single-node` option: -``` +```shell sh$ docker run -d --name=crate01 \ --net=crate -p 4201:4200 \ --env CRATE_HEAP_SIZE=1g \ @@ -255,7 +252,7 @@ If you wanted to run `crash` inside a user-defined network called `crate` and connect to three hosts named `crate01`, `crate02`, and `crate03` (i.e. the example covered in the [Creating a Cluster] section) you could run: -``` +```shell $ docker run --rm -ti \ --net=crate crate \ crash --hosts crate01 crate02 crate03 @@ -370,7 +367,7 @@ per host machine. If you are running one container per machine, you can map the container ports to the host ports so that the host acts like a native installation. For example: -``` +```shell $ docker run -d -p 4200:4200 -p 4300:4300 -p 5432:5432 --env CRATE_HEAP_SIZE=1g crate \ crate -Cnetwork.host=_site_ ``` @@ -382,7 +379,7 @@ and go, and any data inside them is lost when the container is removed. 
For this reason, you should mount a persistent `data` directory on your host machine to the `/data` directory inside the container: -``` +```shell $ docker run -d -v /srv/crate/data:/data --env CRATE_HEAP_SIZE=1g crate \ crate -Cnetwork.host=_site_ ``` @@ -401,7 +398,7 @@ removed. Here is an example of how you could mount the `crate.yml` config file: -``` +```shell $ docker run -d \ -v /srv/crate/config/crate.yml:/crate/config/crate.yml \ --env CRATE_HEAP_SIZE=1g crate \ @@ -470,7 +467,7 @@ If you want the container to use a maximum of 1.5 CPUs, a maximum of 2 GB memory, with a heap size of 1 GB, you could configure everything at once. For example: -``` +```shell $ docker run -d \ --cpus 1.5 \ --memory 1g \ diff --git a/docs/install/container/index.md b/docs/install/container/index.md index 1fe327a8..1339ca9f 100644 --- a/docs/install/container/index.md +++ b/docs/install/container/index.md @@ -1,12 +1,10 @@ (install-container)= -# Container Setup +# Container setup -```{eval-rst} -.. div:: sd-text-muted - - Install CrateDB in container environments. -``` +:::{div} sd-text-muted +Install CrateDB in container environments. +::: CrateDB is ideal for containerized environments, creating and scaling a cluster takes minutes and your valuable data is always in sync and available. diff --git a/docs/install/container/kubernetes/kubernetes.md b/docs/install/container/kubernetes/kubernetes.md index 7c570569..ed4d7da7 100644 --- a/docs/install/container/kubernetes/kubernetes.md +++ b/docs/install/container/kubernetes/kubernetes.md @@ -2,11 +2,9 @@ # CrateDB and Kubernetes -```{eval-rst} -.. div:: sd-text-muted - - CrateDB and Kubernetes are a great match. -``` +:::{div} sd-text-muted +CrateDB and Kubernetes are a great match. +::: CrateDB’s [horizontally scalable] `shared-nothing architecture` lends itself well to [containerization]. 
diff --git a/docs/install/debian-ubuntu.md b/docs/install/debian-ubuntu.md index 7a0dcc57..9bd701e3 100644 --- a/docs/install/debian-ubuntu.md +++ b/docs/install/debian-ubuntu.md @@ -1,29 +1,24 @@ -```{highlight} bash -``` - (debian)= - (ubuntu)= - (install-deb)= - (install-debian)= - (install-ubuntu)= # CrateDB on Debian, Ubuntu, and Derivates +:::{div} sd-text-muted Install CrateDB [deb] packages using the [apt] package manager. +::: This installation method is suitable for Debian systems and derivates like Ubuntu. -## Configure package repository +## Package repository -You will need to configure your system to register with and trust packages from +Configure your system to register with and trust packages from the CrateDB package repository: -``` +```shell # Install prerequisites. sudo apt update sudo apt install --yes apt-transport-https apt-utils curl gnupg lsb-release @@ -45,7 +40,7 @@ above. You can read more about our [release workflow]. Now, update the package sources: -``` +```shell sh$ sudo apt update ``` @@ -56,13 +51,13 @@ repository is correctly registered. With everything set up, you can install CrateDB: -``` +```shell sh$ sudo apt install crate ``` After the installation is finished, you can start the `crate` service: -``` +```shell sh$ sudo systemctl start crate ``` diff --git a/docs/install/index.md b/docs/install/index.md index aceaa038..09672bdd 100644 --- a/docs/install/index.md +++ b/docs/install/index.md @@ -2,16 +2,14 @@ # Install -```{eval-rst} -.. div:: sd-text-muted - - Install CrateDB on different operating systems and environments, - for on-premises and development operations. -``` +:::{div} sd-text-muted +Install CrateDB on different operating systems and environments, +for on-premises and development operations. +::: -```{toctree} -:hidden: true -:maxdepth: 3 +:::{toctree} +:maxdepth: 1 +:hidden: Debian, Ubuntu Red Hat, SUSE @@ -22,7 +20,7 @@ container/index cloud/index configure -``` +::: % Layout stolen from Streamlink. 
@@ -38,111 +36,109 @@ configure % the following sections explicitly -```{eval-rst} -.. grid:: 2 2 2 4 - :padding: 0 - :class-container: installation-grid - - .. grid-item-card:: Debian, Ubuntu - :link: install-debian - :link-type: ref - :link-alt: Debian and Ubuntu Linux - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :fab:`linux` - :fab:`ubuntu` - - .. grid-item-card:: Red Hat, SUSE - :link: install-rpm - :link-type: ref - :link-alt: RPM Linux: Red Hat, SUSE - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :fab:`redhat` - :fab:`suse` - - .. grid-item-card:: macOS - :link: install-macos - :link-type: ref - :link-alt: macOS - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :fab:`apple` - - .. grid-item-card:: Windows - :link: install-windows - :link-type: ref - :link-alt: Windows - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :fab:`windows` - - .. grid-item-card:: Tarball Archive - :link: install-tarball - :link-type: ref - :link-alt: Installation from Tarball - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :octicon:`archive` - - .. grid-item-card:: Container Setup - :link: install-container - :link-type: ref - :link-alt: Container Setup - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :octicon:`container` - - .. grid-item-card:: Cloud Hosting - :link: install-cloud - :link-type: ref - :link-alt: Cloud Hosting - :padding: 3 - :text-align: center - :class-card: sd-pt-3 - :class-body: sd-fs-1 - :class-title: sd-fs-6 - - :fa:`cloud` - - .. 
grid-item-card:: Config Settings
-        :link: install-configure
-        :link-type: ref
-        :link-alt: Configuration Settings
-        :padding: 3
-        :text-align: center
-        :class-card: sd-pt-3
-        :class-body: sd-fs-1
-        :class-title: sd-fs-6
-
-        :octicon:`gear`
+::::{grid} 2 2 2 4
+:padding: 0
+:class-container: installation-grid
+
+:::{grid-item-card} Debian, Ubuntu
+:link: install-debian
+:link-type: ref
+:link-alt: Debian and Ubuntu Linux
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{fab}`linux`
+{fab}`ubuntu`
+:::
 
+:::{grid-item-card} Red Hat, SUSE
+:link: install-rpm
+:link-type: ref
+:link-alt: RPM Linux: Red Hat, SUSE
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{fab}`redhat`
+{fab}`suse`
+:::
 
-```
+:::{grid-item-card} macOS
+:link: install-macos
+:link-type: ref
+:link-alt: macOS
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{fab}`apple`
+:::
+
+:::{grid-item-card} Windows
+:link: install-windows
+:link-type: ref
+:link-alt: Windows
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{fab}`windows`
+:::
+
+:::{grid-item-card} Tarball Archive
+:link: install-tarball
+:link-type: ref
+:link-alt: Installation from Tarball
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{octicon}`archive`
+:::
+
+:::{grid-item-card} Container Setup
+:link: install-container
+:link-type: ref
+:link-alt: Container Setup
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{octicon}`container`
+:::
+
+:::{grid-item-card} Cloud Hosting
+:link: install-cloud
+:link-type: ref
+:link-alt: Cloud Hosting
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{octicon}`cloud`
+:::
+
+:::{grid-item-card} Config Settings
+:link: install-configure
+:link-type: ref
+:link-alt: Configuration Settings
+:padding: 3
+:text-align: center
+:class-card: sd-pt-3
+:class-body: sd-fs-1
+:class-title: sd-fs-6
+{octicon}`gear`
+:::
+
+::::
 
 We recommend to use the package-based installation methods for {ref}`install-deb`
 and {ref}`install-rpm`, by subscribing to the corresponding package release channels.
diff --git a/docs/install/redhat.md b/docs/install/redhat.md
index 36cd4be8..61d7c45c 100644
--- a/docs/install/redhat.md
+++ b/docs/install/redhat.md
@@ -1,30 +1,26 @@
-```{highlight} bash
-```
-
 (red-hat)=
-
 (install-rpm)=
-
 (install-redhat)=
-
 (install-suse)=
 
 # CrateDB on Red Hat, SUSE, and Derivates
 
+:::{div} sd-text-muted
 Install CrateDB [RPM] packages using the [DNF], [YUM], or [ZYpp] package managers.
+:::
 
 This installation method is suitable for RedHat Enterprise Linux (RHEL) and compatible
 systems like Fedora, CentOS, Rocky Linux, AlmaLinux, AWS Linux, Oracle Linux, or
 Scientific Linux. Installation also works on openSUSE and SUSE Linux Enterprise
 Server (SLES) systems.
 
-## Configure package repository
+## Package repository
 
 To register with the CrateDB package repository, create a file called `cratedb.repo`
 in the `/etc/yum.repos.d/` directory for RedHat based distributions, or in the
 `/etc/zypp/repos.d/` directory for OpenSuSE based distributions, containing:
 
-```
+```ini
 [cratedb-ce-stable]
 name=CrateDB RPM package repository - $basearch - Stable
 baseurl=https://cdn.crate.io/downloads/yum/7/$basearch
@@ -58,7 +54,7 @@ can read more about the [release workflow].
 
 With everything set up, you can install CrateDB:
 
-```
+```shell
 sudo dnf install --enablerepo=cratedb-ce-stable crate
 ```
 
@@ -79,7 +75,7 @@ In order to trust the package signing key upfront, before being prompted
 to do it on the first installation of CrateDB, you can also import it into
 your repository keyring, like that:
 
-```
+```shell
 # Install prerequisites.
 yum install sudo
 
diff --git a/docs/install/tarball.md b/docs/install/tarball.md
index 1b9c100d..2e383e28 100644
--- a/docs/install/tarball.md
+++ b/docs/install/tarball.md
@@ -1,14 +1,11 @@
 (install-tarball)=
-
 (install-macos)=
 
 # Installation from Tarball Archive
 
-```{eval-rst}
-.. div:: sd-text-muted
-
-   How to use the release archives to install CrateDB.
-```
+:::{div} sd-text-muted
+How to use the release archives to install CrateDB.
+:::
 
 The walkthrough is suitable to install and run CrateDB on [Unix-like] systems,
 for example Linux and macOS.
@@ -26,20 +23,20 @@ can be found in our [manual build guide].
 2. Once downloaded, extract the archive either using your favorite terminal
    or command line shell or by using a GUI tool like [7-Zip]:
 
-   ```
+   ```shell
    # Extract tarball on Unix-like systems
    tar -xzf crate-*.tar.gz
    ```
 
 3. On the terminal, change into the extracted `crate` directory:
 
-   ```
+   ```shell
    cd crate-*
    ```
 
 4. Run a CrateDB single-node instance on the local network interface:
 
-   ```
+   ```shell
    ./bin/crate
    ```
 
diff --git a/docs/install/windows.md b/docs/install/windows.md
index cfcb0433..74900809 100644
--- a/docs/install/windows.md
+++ b/docs/install/windows.md
@@ -1,9 +1,6 @@
-```{highlight} bash
-```
-
 (install-windows)=
 
-# Running CrateDB on Windows
+# CrateDB on Windows
 
 How to use the release archives to run CrateDB on Microsoft Windows.
 
@@ -19,34 +16,34 @@ us][contact us] so we can work with you on a solution.
    command-line shell or by using a GUI tool like [7-Zip]. We recommend using
    [PowerShell] when using terminal:
 
-   ```
+   ```doscon
    # Extract Zip archive
    unzip -o crate-*.zip
    ```
 
 3. On the terminal, change into the extracted `crate` directory:
 
-   ```
+   ```doscon
    cd crate-*
    ```
 
 4. Run a CrateDB single-node instance on the local network interface:
 
-   ```
+   ```doscon
    ./bin/crate
    ```
 
 5.
You will be notified by an INFO message similar to this, when your
    single-node cluster is started successfully:
 
-   ```
+   ```text
    [2022-07-04T19:41:12,340][INFO ][o.e.n.Node] [Aiguille Verte] started
    ```
 
 6. In order to stop CrateDB again, use {kbd}`ctrl-c`. You will be asked to
    terminate the job. Input {kbd}`Y`:
 
-   ```
+   ```text
    Terminate batch job (Y/N)? Y
    ```
 

From b7ef9cf3b2e98032cc176f96710b57be4e2357e4 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Fri, 10 Oct 2025 19:34:57 +0200
Subject: [PATCH 5/6] Chore: Fix broken link references

---
 docs/feature/cloud/index.md      | 2 +-
 docs/install/container/docker.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/feature/cloud/index.md b/docs/feature/cloud/index.md
index 8ba7a419..8f7cbee1 100644
--- a/docs/feature/cloud/index.md
+++ b/docs/feature/cloud/index.md
@@ -65,7 +65,7 @@ needs.
 :::{rubric} CrateDB Cloud
 :::
 
-- {ref}`cloud:cluster-deployment-marketplace`
+- {ref}`cloud:index`
 - {ref}`Documentation `
 - [Web Console]
 
diff --git a/docs/install/container/docker.md b/docs/install/container/docker.md
index b89ec3e9..fa71f82d 100644
--- a/docs/install/container/docker.md
+++ b/docs/install/container/docker.md
@@ -79,7 +79,7 @@ Breaking the command down:
   {ref}`docker-compose` as reference).
 - Puts the container into the `crate` network and maps port `4201` on your
   host machine to port `4200` on the container (admin UI).
-- Defines the environment variable:ref:`CRATE_HEAP_SIZE `,
+- Defines the environment variable {ref}`CRATE_HEAP_SIZE `,
   which is used by CrateDB to allocate 1 GB for its heap memory.
 - Runs the command `crate` inside the container with parameters:
   : - `network.host`: The `_site_` value results in the binding of the

From d7ed7f36a7c142ead46e5a52ef88f563c50124e1 Mon Sep 17 00:00:00 2001
From: Andreas Motl
Date: Fri, 10 Oct 2025 19:30:29 +0200
Subject: [PATCH 6/6] Install: Implement suggestions by CodeRabbit

---
 docs/feature/cloud/index.md                   |  1 -
 docs/install/cloud/aws/aws-terraform-setup.md |  2 +-
 docs/install/cloud/aws/s3-setup.md            | 10 +--
 docs/install/cloud/azure/terraform.md         |  2 +-
 docs/install/cloud/azure/vm.md                | 12 ++--
 docs/install/container/docker.md              | 71 +++++++++++--------
 .../container/kubernetes/kubernetes.md        | 31 ++++----
 docs/install/debian-ubuntu.md                 |  6 +-
 docs/install/redhat.md                        | 39 +++++++---
 docs/install/windows.md                       |  5 ++
 10 files changed, 109 insertions(+), 70 deletions(-)

diff --git a/docs/feature/cloud/index.md b/docs/feature/cloud/index.md
index 8f7cbee1..5d8e5d30 100644
--- a/docs/feature/cloud/index.md
+++ b/docs/feature/cloud/index.md
@@ -65,7 +65,6 @@ needs.
 :::{rubric} CrateDB Cloud
 :::
 
-- {ref}`cloud:index`
 - {ref}`Documentation `
 - [Web Console]
 
diff --git a/docs/install/cloud/aws/aws-terraform-setup.md b/docs/install/cloud/aws/aws-terraform-setup.md
index f9b27135..3f5fdd78 100644
--- a/docs/install/cloud/aws/aws-terraform-setup.md
+++ b/docs/install/cloud/aws/aws-terraform-setup.md
@@ -2,7 +2,7 @@
 
 # Deploy using Terraform
 
-In {ref}`ec2_setup`, we elaborated on how to leverage EC2's functionality to set
+In {ref}`ec2-setup`, we elaborated on how to leverage EC2's functionality to set
 up a CrateDB cluster. Here, we will explore how to automate this kind of setup.
 
 [Terraform] is an infrastructure as code tool, often used as an abstraction
diff --git a/docs/install/cloud/aws/s3-setup.md b/docs/install/cloud/aws/s3-setup.md
index da15a1df..27803cf0 100644
--- a/docs/install/cloud/aws/s3-setup.md
+++ b/docs/install/cloud/aws/s3-setup.md
@@ -25,10 +25,10 @@ instances.
 ### Authentication
 
-It is recommended to restrict the permissions of CrateDB on the S3 to only the
-required extend. First, an IAM role is required. This [AWS guide] gives a
-short description of how to create a policy offer using the CLI or the AWS
-management console. Further, access of the snapshot to the S3 bucket needs to
+It is recommended to restrict the permissions of CrateDB on S3 to only the
+required extent. First, an IAM role is required. This [AWS IAM policy guide]
+explains how to create a policy by using the CLI or the AWS Management Console.
+Further, access of the snapshot to the S3 bucket needs to
 be restricted. An example policy file granting anybody access to a bucket
 called `snaps.example.com` is attached below:
 
@@ -92,7 +92,7 @@ within the policy:
 ```
 
 [amazon s3]: https://aws.amazon.com/s3/
-[aws guide]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
+[AWS IAM policy guide]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-roles-for-amazon-ec2.html
 [aws policy examples]: https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html
 [aws principals]: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html
 [iam roles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html
diff --git a/docs/install/cloud/azure/terraform.md b/docs/install/cloud/azure/terraform.md
index 7ae5afdb..4aa2cc15 100644
--- a/docs/install/cloud/azure/terraform.md
+++ b/docs/install/cloud/azure/terraform.md
@@ -2,7 +2,7 @@
 
 # Deploy using Terraform
 
-In {ref}`azure_vm_setup`, we elaborated on how to leverage Azure's functionality to
+In {ref}`azure-vm-setup`, we elaborated on how to leverage Azure's functionality to
 set up a CrateDB cluster. Here, we will explore how to automate this kind of
 setup.
diff --git a/docs/install/cloud/azure/vm.md b/docs/install/cloud/azure/vm.md
index 8ce84b4c..36f3ed1c 100644
--- a/docs/install/cloud/azure/vm.md
+++ b/docs/install/cloud/azure/vm.md
@@ -23,7 +23,7 @@ under the *new* left hand panel of the Azure portal.
 ### Create a network security group
 
 CrateDB uses two ports, one for inter-node communication (`4300`) and one for
-it's http endpoint (`4200`), so access to these needs to be opened.
+its HTTP endpoint (`4200`), so access to these needs to be opened.
 
 Create a *New Security Group*, giving it a name and assigning it to the
 'Resource Group' just created.
@@ -32,7 +32,7 @@ Create a *New Security Group*, giving it a name and assigning it to the
 :alt: Create New Security Group
 ```
 
-Find that security group in your resources list and open it's settings,
+Find that security group in your resources list and open its settings,
 navigating to the *Inbound security rules* section.
 
 ```{image} /_assets/img/install/cloud/azure-nsg-inbound.png
@@ -71,7 +71,7 @@ earlier to the subnet.
 ### Create virtual machines
 
 Next create virtual machines to act as your CrateDB nodes. In this tutorial, I
-chose two low-specification Ubuntu 14.04 servers, but you likely have your own
+chose two low-specification Ubuntu servers, but you likely have your own
 preferred configurations. Most importantly, make sure you select the Virtual
 Network created earlier.
 
@@ -142,14 +142,14 @@ the same steps as for Azure and Linux.
 ### Create virtual machines
 
 Similar steps to creating Virtual Machines for Azure and Linux, but create the
-VM based on the 'Windows Server 2012 R2 Datacenter' image.
+VM based on a recent "Windows Server LTS" image.
 
 ### Install CrateDB
 
 *Note that these instructions should be followed on each VM in your cluster.*
 
 To install CrateDB on Windows Server, you will need a [Java JDK installed].
-Ensure that the `JAVA*HOME` environment variable is set.
+Ensure that the `JAVA_HOME` environment variable is set.
 
 ```{image} /_assets/img/install/cloud/azure-envvar.png
 :alt: Environment Variables
@@ -171,7 +171,7 @@ We need to allow the ports CrateDB uses through the Windows Firewall
 :alt: Firewall configuration
 ```
 
-Start crate by running `bin/crate`.
+Start CrateDB by running `.\bin\crate.bat` (PowerShell) or `bin\crate.bat` (CMD).
 
 [3.3]: https://github.com/crate/crate/blob/3.3/blackbox/docs/config/cluster.rst#discovery
 [java jdk installed]: https://www.oracle.com/java/technologies/downloads/#java8
diff --git a/docs/install/container/docker.md b/docs/install/container/docker.md
index fa71f82d..e36e965b 100644
--- a/docs/install/container/docker.md
+++ b/docs/install/container/docker.md
@@ -22,6 +22,8 @@ The official [CrateDB Docker image].
 
 ## Quick start
 
+(container-create-cluster)=
+
 ### Creating a cluster
 
 To get started with CrateDB and Docker, you will create a three-node cluster
@@ -38,8 +40,10 @@ sh$ docker network create crate
 
 You should then be able to see something like this:
 
-```text
+```shell
 sh$ docker network ls
+```
+```text
 NETWORK ID          NAME                DRIVER              SCOPE
 1bf1b7acd66f        bridge              bridge              local
 51cebbdf7d2b        crate               bridge              local
@@ -55,7 +59,7 @@ container `crate03` will run cluster node `crate03`.
 
 You can then create your first CrateDB container and node, like this:
 
-```
+```shell
 sh$ docker run --rm -d \
       --name=crate01 \
       --net=crate \
@@ -82,25 +86,26 @@ Breaking the command down:
 - Defines the environment variable {ref}`CRATE_HEAP_SIZE `,
   which is used by CrateDB to allocate 1 GB for its heap memory.
 - Runs the command `crate` inside the container with parameters:
-  : - `network.host`: The `_site_` value results in the binding of the
-      CrateDB process to a site-local IP address.
-    - `node.name`: Defines the node's name as `crate01` (used by
-      master election).
-    - `discovery.seed_hosts`: This parameter lists the other hosts in the
-      cluster. The format is a comma-separated list of `host:port` entries,
-      where port defaults to setting `transport.tcp.port`.
Each node must
-      contain the name of all the other hosts in this list. Notice also that
-      any node in the cluster might be started at any time, and this will
-      create connection exceptions in the log files, however all nodes will
-      eventually be running and interconnected.
-    - `cluster.initial_master_nodes`: Defines the list of master-eligible
-      node names which will participate in the vote of the first master
-      (first bootstrap). If this parameter is not defined, then it is expected
-      that the node will join an already formed cluster. This parameter is only
-      relevant for the first election.
-    - `gateway.expected_data_nodes` and `gateway.recover_after_data_nodes`:
-      Specifies how many nodes you expect in the cluster and how many nodes must
-      be discovered before the cluster state is recovered.
+
+  - `network.host`: The `_site_` value results in the binding of the
+    CrateDB process to a site-local IP address.
+  - `node.name`: Defines the node's name as `crate01` (used by
+    master election).
+  - `discovery.seed_hosts`: This parameter lists the other hosts in the
+    cluster. The format is a comma-separated list of `host:port` entries,
+    where port defaults to setting `transport.tcp.port`. Each node must
+    contain the name of all the other hosts in this list. Notice also that
+    any node in the cluster might be started at any time, and this will
+    create connection exceptions in the log files, however all nodes will
+    eventually be running and interconnected.
+  - `cluster.initial_master_nodes`: Defines the list of master-eligible
+    node names which will participate in the vote of the first master
+    (first bootstrap). If this parameter is not defined, then it is expected
+    that the node will join an already formed cluster. This parameter is only
+    relevant for the first election.
+  - `gateway.expected_data_nodes` and `gateway.recover_after_data_nodes`:
+    Specifies how many nodes you expect in the cluster and how many nodes must
+    be discovered before the cluster state is recovered.
 
 :::{NOTE}
 If this command aborts with an error, consult the
@@ -250,10 +255,11 @@ The CrateDB Shell, `crash`, is bundled with the Docker image.
 
 If you wanted to run `crash` inside a user-defined network called `crate`
 and connect to three hosts named `crate01`, `crate02`, and `crate03`
-(i.e. the example covered in the [Creating a Cluster] section) you could run:
+(i.e. the example covered in the {ref}`container-create-cluster` section)
+you could run:
 
 ```shell
-$ docker run --rm -ti \
+sh$ docker run --rm -ti \
     --net=crate crate \
     crash --hosts crate01 crate02 crate03
 ```
@@ -266,7 +272,7 @@ Docker's Compose tool allows developers to define and run multi-container
 Docker applications that can be started with a single `docker-compose up`
 command.
 
-Read about Docker Compose specifics [here](https://docs.docker.com/compose/).
+Read about Docker Compose specifics in the [Docker Compose documentation](https://docs.docker.com/compose/).
 
 You can define the services that make up your app in a `docker-compose.yml`
 file. To recreate the three-node cluster in the previous example, you can
@@ -357,6 +363,11 @@ In the file above:
 - The start order of the containers is not deterministic and you want all
   three containers to be up and running before the election of the master node.
 
+:::{NOTE}
+The `deploy` section is used by Docker Swarm (`docker stack deploy`) and is
+ignored by `docker compose up`.
+:::
+
 ## Best Practices
 
 ### One container per host
@@ -368,7 +379,7 @@ If you are running one container per machine, you can map the container
 ports to the host ports so that the host acts like a native installation.
 For example:
 
 ```shell
-$ docker run -d -p 4200:4200 -p 4300:4300 -p 5432:5432 --env CRATE_HEAP_SIZE=1g crate \
+sh$ docker run -d -p 4200:4200 -p 4300:4300 -p 5432:5432 --env CRATE_HEAP_SIZE=1g crate \
     crate -Cnetwork.host=_site_
 ```
 
@@ -380,7 +391,7 @@ this reason, you should mount a persistent `data` directory on your host
 machine to the `/data` directory inside the container:
 
 ```shell
-$ docker run -d -v /srv/crate/data:/data --env CRATE_HEAP_SIZE=1g crate \
+sh$ docker run -d -v /srv/crate/data:/data --env CRATE_HEAP_SIZE=1g crate \
     crate -Cnetwork.host=_site_
 ```
 
@@ -399,7 +410,7 @@ removed.
 Here is an example of how you could mount the `crate.yml` config file:
 
 ```shell
-$ docker run -d \
+sh$ docker run -d \
     -v /srv/crate/config/crate.yml:/crate/config/crate.yml \
     --env CRATE_HEAP_SIZE=1g crate \
     crate -Cnetwork.host=_site_
@@ -408,7 +419,7 @@ $ docker run -d \
 Here, `/srv/crate/config/crate.yml` is an example path, and should be replaced
 with the path to your host machine's `crate.yml` file.
 
-## Troubleshooting
+## Healthcheck troubleshooting
 
 The official [CrateDB Docker image] ships with a liveness [healthcheck]
 configured.
@@ -468,9 +479,9 @@ memory, with a heap size of 1 GB, you could configure everything at once. For
 example:
 
 ```shell
-$ docker run -d \
+sh$ docker run -d \
     --cpus 1.5 \
-    --memory 1g \
+    --memory 2g \
     --env CRATE_HEAP_SIZE=1g \
     crate \
     crate -Cnetwork.host=_site_
diff --git a/docs/install/container/kubernetes/kubernetes.md b/docs/install/container/kubernetes/kubernetes.md
index ed4d7da7..916efa9a 100644
--- a/docs/install/container/kubernetes/kubernetes.md
+++ b/docs/install/container/kubernetes/kubernetes.md
@@ -6,7 +6,7 @@
 CrateDB and Kubernetes are a great match.
 :::
 
-CrateDB’s [horizontally scalable] `shared-nothing architecture` lends itself
+CrateDB’s [horizontally scalable] [shared-nothing architecture] lends itself
 well to [containerization].
 
 [Kubernetes] is an open-source container orchestration system for the
@@ -15,8 +15,8 @@ management, deployment, and scaling of containerized systems. Together,
 Docker and Kubernetes are a fantastic way to deploy and scale CrateDB.
 
 :::{NOTE}
-While Kubernetes works with a variety of container technologies, this
-document only covers its use with Docker.
+While Kubernetes supports multiple container runtimes, this document
+uses Docker-compatible container images.
 :::
 
 :::{SEEALSO}
@@ -41,7 +41,7 @@ You can create a resource like so:
 
 ```console
 sh$ kubectl create -f crate-controller.yaml --namespace crate
-statefulset.apps/crate-controller created
+statefulset.apps/crate created
 ```
 
 Here, we are creating a [StatefulSet] controller in the `crate` namespace
@@ -149,6 +149,12 @@ load balancers.
 For local development, [Minikube] provides a LoadBalancer service.
 :::
 
+:::{WARNING}
+Ensure proper network controls and authentication are in place before
+exposing ports 4200 (HTTP) and 5432 (PostgreSQL) via a LoadBalancer.
+Restrict source IPs and configure users/roles to avoid unauthorized access.
+:::
+
 ### Controller
 
 A Kubernetes [pod] is a group of one or more containers. Pods are designed to
@@ -213,7 +219,7 @@ spec:
         # Use the CrateDB 5.1.1 Docker image.
         image: crate:5.1.1
         # Pass in configuration to CrateDB via command-line options.
-        # We are setting the name of the node's explicitly, which is
+        # We are setting the node name explicitly, which is
         # needed to determine the initial master nodes. These are set to
         # the name of the pod.
         # We are using the SRV records provided by Kubernetes to discover
@@ -228,9 +234,9 @@ spec:
           - -Cgateway.expected_data_nodes=${EXPECTED_NODES}
           - -Cpath.data=/data
         volumeMounts:
-              # Mount the `/data` directory as a volume named `data`.
+              # Mount the volume named `cratedb-data` to the `/data` directory.
             - mountPath: /data
-              name: data
+              name: cratedb-data
         resources:
           limits:
             # How much memory each pod gets.
@@ -247,7 +253,7 @@ spec:
           name: postgres
         # Environment variables passed through to the container.
         env:
-          # This is variable is detected by CrateDB.
+          # This variable is detected by CrateDB.
           - name: CRATE_HEAP_SIZE
             value: "256m"
           # The rest of these variables are used in the command-line
@@ -267,7 +273,7 @@ spec:
   volumeClaimTemplates:
     # Use persistent storage.
     - metadata:
-        name: data
+        name: cratedb-data
       spec:
         accessModes:
         - ReadWriteOnce
@@ -333,7 +339,7 @@ You can then use this in your controller configuration with something like this:
 [...]
   volumeClaimTemplates:
     - metadata:
-        name: persistant-data
+        name: cratedb-data
       spec:
         # This will create one 100GB read-write Azure Managed Disks volume
         # for every CrateDB pod.
@@ -341,26 +347,23 @@ You can then use this in your controller configuration with something like this:
         storageClassName: crate-premium
         resources:
           requests:
-            storage: 100g
+            storage: 100Gi
 ```
 
 [azure managed disks]: https://azure.microsoft.com/en-us/pricing/details/managed-disks/
 [configuration]: https://kubernetes.io/docs/concepts/configuration/overview/
 [containerization]: https://www.docker.com/resources/what-container
 [cratedb docker image]: https://hub.docker.com/_/crate/
-[docker]: https://www.docker.com/
 [horizontally scalable]: https://en.wikipedia.org/wiki/Scalability#Horizontal_(scale_out)_and_vertical_scaling_(scale_up)
 [ingress]: https://kubernetes.io/docs/concepts/services-networking/ingress/
 [kubernetes]: https://kubernetes.io/
 [loadbalancer]: https://kubernetes.io/docs/concepts/services-networking/service/#loadbalancer
 [managed]: https://kubernetes.io/docs/concepts/cluster-administration/manage-deployment/
 [minikube]: https://kubernetes.io/docs/setup/minikube/
-[persistent volume]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/
 [persistent volumes]: https://kubernetes.io/docs/concepts/storage/persistent-volumes/
 [pod]: https://kubernetes.io/docs/concepts/workloads/pods/
 [rolling update strategy]:
https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#rolling-updates
 [service]: https://kubernetes.io/docs/concepts/services-networking/service/
-[services]: https://kubernetes.io/docs/concepts/services-networking/service/
 [setting up your first cratedb cluster on kubernetes]: https://cratedb.com/blog/run-your-first-cratedb-cluster-on-kubernetes-part-one
 [shared-nothing architecture]: https://en.wikipedia.org/wiki/Shared-nothing_architecture
 [statefulset]: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
diff --git a/docs/install/debian-ubuntu.md b/docs/install/debian-ubuntu.md
index 9bd701e3..455c3003 100644
--- a/docs/install/debian-ubuntu.md
+++ b/docs/install/debian-ubuntu.md
@@ -4,13 +4,13 @@
 (install-debian)=
 (install-ubuntu)=
 
-# CrateDB on Debian, Ubuntu, and Derivates
+# CrateDB on Debian, Ubuntu, and Derivatives
 
 :::{div} sd-text-muted
 Install CrateDB [deb] packages using the [apt] package manager.
 :::
 
-This installation method is suitable for Debian systems and derivates
+This installation method is suitable for Debian systems and derivatives
 like Ubuntu.
 
 ## Package repository
@@ -63,7 +63,7 @@ sh$ sudo systemctl start crate
 
 Once the service is up and running, you can access CrateDB by visiting:
 
-```
+```text
 http://localhost:4200/
 ```
 
diff --git a/docs/install/redhat.md b/docs/install/redhat.md
index 61d7c45c..296e08ce 100644
--- a/docs/install/redhat.md
+++ b/docs/install/redhat.md
@@ -3,22 +3,22 @@
 (install-redhat)=
 (install-suse)=
 
-# CrateDB on Red Hat, SUSE, and Derivates
+# CrateDB on Red Hat, SUSE, and Derivatives
 
 :::{div} sd-text-muted
 Install CrateDB [RPM] packages using the [DNF], [YUM], or [ZYpp] package managers.
 :::
 
-This installation method is suitable for RedHat Enterprise Linux (RHEL) and compatible
-systems like Fedora, CentOS, Rocky Linux, AlmaLinux, AWS Linux, Oracle Linux, or
+This installation method is suitable for Red Hat Enterprise Linux (RHEL) and compatible
+systems like Fedora, CentOS, Rocky Linux, AlmaLinux, Amazon Linux, Oracle Linux, or
 Scientific Linux. Installation also works on openSUSE and SUSE Linux Enterprise
 Server (SLES) systems.
 
 ## Package repository
 
 To register with the CrateDB package repository, create a file called `cratedb.repo`
-in the `/etc/yum.repos.d/` directory for RedHat based distributions, or in the
-`/etc/zypp/repos.d/` directory for OpenSuSE based distributions, containing:
+in the `/etc/yum.repos.d/` directory for Red Hat based distributions, or in the
+`/etc/zypp/repos.d/` directory for openSUSE-based distributions, containing:
 
 ```ini
 [cratedb-ce-stable]
@@ -50,17 +50,41 @@ the repository as indicated in the sample installation command below.
 CrateDB provides both *stable release* and *testing release* channels. You
 can read more about the [release workflow].
 
+## Prerequisites
+
+If `sudo` is missing, run this as root:
+```shell
+# Red Hat-compatible systems (RHEL/CentOS 8+)
+dnf install sudo
+```
+or:
+```shell
+# Red Hat-compatible systems (RHEL/CentOS 7)
+yum install sudo
+```
+or:
+```shell
+# SUSE-based systems
+zypper install sudo
+```
+
 ## Install CrateDB
 
 With everything set up, you can install CrateDB:
 
 ```shell
+# Red Hat-compatible systems
 sudo dnf install --enablerepo=cratedb-ce-stable crate
 ```
+or:
+```shell
+# SUSE-based systems
+sudo zypper install --repo=cratedb-ce-stable crate
+```
 
 :::{TIP}
 On older Red Hat and CentOS installations, please use the `yum` command
-instead of `dnf`. On SUSE based installations, please use the `zypper`
+instead of `dnf`. On SUSE-based installations, please use the `zypper`
 command.
 :::
 
@@ -76,9 +100,6 @@ to do it on the first installation of CrateDB, you can also import it into
 your repository keyring, like that:
 
 ```shell
-# Install prerequisites.
-yum install sudo
-
 # Import the public GPG key for verifying the package signatures.
 sudo rpm --import https://cdn.crate.io/downloads/yum/RPM-GPG-KEY-crate
 ```
diff --git a/docs/install/windows.md b/docs/install/windows.md
index 74900809..4f4d2451 100644
--- a/docs/install/windows.md
+++ b/docs/install/windows.md
@@ -16,6 +16,11 @@ us][contact us] so we can work with you on a solution.
    command-line shell or by using a GUI tool like [7-Zip]. We recommend using
    [PowerShell] when using terminal:
 
+   ```powershell
+   # Extract Zip archive
+   Expand-Archive -Path .\crate-*.zip -DestinationPath .
+   ```
+
    ```doscon
    # Extract Zip archive
    unzip -o crate-*.zip