Releases · berops/claudie

26 Nov 10:55

Despire

v0.9.16

56e059b

v0.9.16 Latest

What's Changed

The OpenStack provider now uses image names instead of image IDs, because the provider can replace IDs, which leaves previously referenced IDs invalid. #1902
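
As a rough illustration of the change above, a nodepool backed by an OpenStack provider would reference its image by name. This is only a sketch that reuses the nodepool fields shown in the v0.9.14 notes; the nodepool name, provider name, region, zone, flavor, and image name are all placeholders, not values taken from the Claudie documentation.

```yaml
# Hypothetical nodepool; every value below is a placeholder.
- name: openstack-nodes
  providerSpec:
    name: ovh-1
    region: "<region>"
    zone: "<zone>"
  count: 1
  serverType: "<flavor-name>"
  # An image name, as listed by the provider, rather than an image ID.
  image: "<image-name>"
```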

Bug fixes

Fix Cloudflare account ID propagation when updating to newer Claudie versions #1904

13 Nov 09:41

Despire

v0.9.15

97232b0

v0.9.15

Bug fixes

Fixes an incompatibility with the Docker API in the Ansibler service that resulted in the error from #1885

17 Oct 12:27

Despire

v0.9.14

e0b73c1

v0.9.14

What's Changed

Correctly remove taints, annotations, and labels when they are removed from a NodePool in the InputManifest #1852
In some cases unnecessary tasks were spawned, which prolonged the building of the cluster without having any effect. These have been removed #1856
Expand machine spec to contain the number of GPUs #1854
Inside the NodePool specification it is now possible to specify the number of GPUs the instance has,
which is used when autoscaling based on GPU workload.

```
- name: autoscaled
  providerSpec:
    name: aws
    region: eu-central-1
    zone: eu-central-1a
  autoscaler:
    min: 0
    max: 20
  # GPU machine type name.
  serverType: g4dn.xlarge
  machineSpec:
    # Explicitly specify how many GPUs the instance type provides.
    nvidiaGpu: 1
  image: ami-07eef52105e8a2059
```

Add support for the OpenStack provider, with the main aim of supporting the OpenStack offering from OVH #1857
It is now possible to use an OpenStack provider within the InputManifest.
Support for OpenStack was added in version v0.9.14 of the Claudie templates.

```
- name: ovh-1
  providerType: openstack
  templates:
    repository: "https://github.com/berops/claudie-config"
    tag: v0.9.14
    path: "templates/terraformer/openstack"
  secretRef:
    name: ovh-secret
    namespace: e2e-secrets
```

13 Sep 15:04

Despire

v0.9.13

159b095

v0.9.13

What's Changed

Concurrency limits are now configurable #1838
Autoscaled nodepools are now limited to 256 nodes #1839
Metadata secret will now be updated after node deletion #1841
Builder TTL has been made configurable via the BUILDER_TTL env, with a default value of 2 hours #1850

Bug fixes

Prometheus metric for currently deleted nodes has been fixed #1849

19 Aug 09:12

Despire

v0.9.12

b9efabf

v0.9.12

What's Changed

Retries were added to reading the output from OpenTofu, which could occasionally fail. #1824
Increased concurrency limits to decrease the build time of larger clusters. This change also affects Claudie's memory requirements, which should fit within 8 GB. #1819
For autoscaled events, Terraformer will now skip refreshing the LoadBalancers and DNS infrastructure, if present. #1830

04 Aug 16:31

Despire

v0.9.11

b285ee4

v0.9.11

What's Changed

READ ME: Many core changes were made in this release. Before updating an already deployed Claudie instance, make sure you have working backups of your Kubernetes clusters.

InputManifest was extended to also include a NoProxy list in the proxy settings to bypass the proxy for the listed endpoints, if used. #1745

```
kubernetes:
  clusters:
    - name: proxy-example
      version: "1.30.0"
      network: 192.168.2.0/24
      installationProxy:
        mode: "on"
        noProxy: ".suse.com"
```

Update kubeone to 1.10 #1749
Migrate to OpenTofu v1.6.2 from terraform v1.5.7 #1755

READ ME: OpenTofu 1.6.2 is compatible with the previously used Terraform version 1.5.7. While Claudie will take care of the update, make sure you have working backups before updating an already deployed Claudie instance, in case of a disaster scenario.
Add sprig to all templates used within claudie #1768
Builder will now support faster termination and wait only on the current task being processed instead of the whole workflow #1770
Claudie will now support proper HA DNS Loadbalancing #1777

This feature will be available with the latest claudie templates v0.9.11

READ ME: For already deployed Claudie instances, if you used Cloudflare as a provider, you will need to update your secret to also include the Account ID the token was created for.
NGINX was replaced by Envoy on Loadbalancers. #1735

READ ME: If you update an already deployed Claudie instance, this is a one-time update that will introduce a short downtime of the services while NGINX is being replaced with Envoy.
Upgraded all terraform providers to the latest possible version that still supports the claudie templates version v0.9.8 #1782
Claudie will now perform a rollout restart for the NVIDIA GPU operator daemonset as part of the workflow, which overwrites the /etc/containerd/config.yml. #1790

Bug fixes

Return partially updated state instead of always defaulting to current state after error in deletion #1793
Restarting the SSH session after updating environment variables is now part of the Ansible workflow. Previously, the updated environment variables were not reflected in a re-used SSH connection #1792
Fixed a memory leak in the autoscaler service. #1787

09 Apr 15:04

Despire

v0.9.10

926f566

v0.9.10

What's Changed

Decrease the number of retries for cleanup of static nodes during deletion from 4 to 2 #1729

Bug fixes

Fix panic when deleting clusters with static nodes for which DNS was not built correctly #1724
Fix propagation of desired state from operator to manager service #1726
Fix multiple HTTP proxy environment variables present in /etc/environment #1727
Fix partial DNS apply, which would leave part of the infrastructure untracked #1728

01 Apr 12:25

Despire

v0.9.9

530b7a5

v0.9.9

What's Changed

General maintenance release, updated dependencies used by Claudie #1709
Upgrading Longhorn from version 1.7.0 to version 1.8.1 #1709

After upgrading Longhorn to the newer version, pods of the old and new versions will coexist if your cluster uses a PVC with the Longhorn storage class (which is the default), because existing volumes still reference the old v1.7.0.

To upgrade the volumes to the newer version, it's possible to use the Longhorn UI to set Settings > Concurrent Automatic Engine Upgrade Per Node Limit to a value greater than 0 to upgrade old volumes.
This is a setting that controls how Longhorn automatically upgrades volumes’ engines to the new default engine image after upgrading Longhorn manager. More on: https://longhorn.io/docs/1.8.1/deploy/upgrade/auto-upgrade-engine/
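
For those who prefer not to use the UI, the same setting can likely be changed through Longhorn's `Setting` custom resource. The following is a minimal sketch and not part of the Claudie release itself; it assumes Longhorn runs in the `longhorn-system` namespace and that the setting is exposed under the name `concurrent-automatic-engine-upgrade-per-node-limit`. Verify both against your cluster (for example with `kubectl -n longhorn-system get settings.longhorn.io`) before applying it.

```yaml
# Assumed resource name and namespace; check them in your cluster first.
apiVersion: longhorn.io/v1beta2
kind: Setting
metadata:
  name: concurrent-automatic-engine-upgrade-per-node-limit
  namespace: longhorn-system
# Any value greater than 0 enables automatic engine upgrades;
# "1" upgrades at most one engine per node at a time.
value: "1"
```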

Once the upgrade is complete, the old engine image pods and the instance manager will be terminated after ~60 minutes of non-use (after all volumes have been upgraded to use the latest Longhorn version). You can also follow the official Longhorn post on this: https://longhorn.io/kb/troubleshooting-some-old-instance-manager-pods-are-still-running-after-upgrade/

19 Mar 13:11

Despire

v0.9.8

72c4533

v0.9.8

What's Changed

Added support for alternative names for load balancers #1693
```
   dns:
     dnsZone: example.com
     provider: example
     hostname: main
     alternativeNames:
       - other
```
The templates that Claudie uses by default will be updated separately to make use of the alternative names.

Bug fixes

If the current state was not built and some of the nodes did not have an assigned IP address, Claudie would fail to correctly determine if the nodes were reachable. #1691
Claudie will now increase the limits for fs.inotify to a higher number, as depending on the workload on each node, reaching the limits would result in an error from which Claudie would not recover. #1696
Annotations for static nodepools will now be correctly propagated. #1696

12 Mar 13:18

Despire

v0.9.7

600878b

Claudie v0.9.7

What's Changed

Additional settings were added to roles for LoadBalancers. #1685

It is now possible to turn the proxy protocol and sticky sessions on or off.

stickySessions will always forward traffic to the same node based on the IP hash.

proxyProtocol will turn on the proxy protocol. If used, the application to which the traffic is redirected must support this protocol.
```
loadBalancers:
  roles:
    - name: example-role
      protocol: tcp
      port: 6443
      targetPort: 6443
      targetPools:
        - htz-kube-nodes
      # added
      settings:
        proxyProtocol: off # default is on
        stickySessions: on # default is off
```

Claudie will now ping nodes to check whether they are reachable. If any of the nodes becomes unreachable, Claudie will report the problem and will not work on any changes until the connectivity issue is resolved. #1658

For unreachable nodes within the Kubernetes cluster, Claudie will give you the option of either resolving the issue or removing the node, from the InputManifest or via kubectl. Claudie will report the following issue:

fix the unreachable nodes by either:
 - fixing the connectivity issue
 - if the connectivity issue cannot be resolved, you can:
   - delete the whole nodepool from the kubernetes cluster in the InputManifest
   - delete the selected unreachable node/s manually from the cluster via 'kubectl'
     - if its a static node you will also need to remove it from the InputManifest
     - if its a dynamic node claudie will replace it.
     NOTE: if the unreachable node is the kube-apiserver, claudie will not be able to recover
           after the deletion.

For unreachable nodes within the loadbalancer cluster, Claudie will give you the option of either resolving the issue or removing the nodepool or load balancer from the InputManifest. Claudie will report the following issue:

fix the unreachable nodes by either:
 - fixing the connectivity issue
 - if the connectivity issue cannot be resolved, you can:
   - delete the whole nodepool from the loadbalancer cluster in the InputManifest
   - delete the whole loadbalancer cluster from the InputManifest

Bug fixes

The cluster-autoscaler image may not share the same version as the Kubernetes version specified in the InputManifest. Claudie will now correctly recognize this and pick the latest available cluster-autoscaler image #1680
Claudie will now set the limits of max open file descriptors on each node to 65535 #1679
