wsarecv timeouts on multiple APIs #1121

@chefcook

Description

Timeouts occur across multiple APIs.
Here is an example for SKE:

stackit_ske_cluster.ske_cluster: Still creating... [30s elapsed]
stackit_ske_cluster.ske_cluster: Still creating... [40s elapsed]
stackit_ske_cluster.ske_cluster: Still creating... [50s elapsed]
╷
│ Error: Error creating/updating cluster
│
│   with stackit_ske_cluster.ske_cluster,
│   on kubernetes.tf line 1, in resource "stackit_ske_cluster" "ske_cluster":
│    1: resource "stackit_ske_cluster" "ske_cluster" {
│
│ Calling API: Put "https://ske.api.stackit.cloud/v2/projects/xxxxxx/regions/eu01/clusters/xxxxxxx": read tcp 172.30.1.127:11607->193.148.160.167:443: wsarecv: A
│ connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Steps to reproduce

resource "stackit_ske_cluster" "ske_cluster" {
  project_id             = stackit_resourcemanager_project.main_project.project_id
  name                   = "d-ske"
  kubernetes_version_min = "1.34.3"

  network = {
    id = stackit_network.main_vnet.network_id
  }

  node_pools = [
    {
      name                    = "nplin1"
      machine_type            = "c1a.8d" # 8 CPU / 16 GB
      minimum                 = 1
      maximum                 = 3
      availability_zones      = ["eu01-3"]
      os_name                 = "ubuntu"
      volume_size             = 300
      volume_type             = "storage_premium_perf1"
      allow_system_components = true
    }
  ]

  extensions = {
    acl = {
      enabled       = true
      allowed_cidrs = ["xxxxx/32"]
    }
  }

  maintenance = {
    enable_kubernetes_version_updates    = true
    enable_machine_image_version_updates = true
    start                                = "02:00:00Z"
    end                                  = "03:00:00Z"
  }
}
  1. Run terraform apply
  2. ...

Actual behavior

This timeout is just an example. It also happens in other places, such as SQL (I assume anywhere a deployment takes a bit longer).

The behavior is the same on apply and on destroy.

Expected behavior

I'm not a networking expert, but from my experience with Azure, timeouts were never an issue there.
I know it's not a fair comparison of backends, but I would expect at least much larger timeout windows,
especially since creating an SKE cluster alone takes roughly 10 minutes on average.

I personally worked around this with a retry script, and it works,
but I think it would be better if the API could handle longer requests on its own.
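
For anyone hitting the same problem, a retry wrapper along these lines may help as a stopgap. This is only a minimal sketch, not the script actually used here: the function name `run_with_retry`, the attempt count, and the delay are assumptions, and it retries only when the output contains the `wsarecv` marker seen in the errors in this issue.

```python
import subprocess
import sys
import time

# Substring taken from the error output in this issue; assumed to identify
# the transient read timeout that is worth retrying.
TRANSIENT_MARKER = b"wsarecv"


def run_with_retry(cmd, attempts=5, delay_s=30.0, marker=TRANSIENT_MARKER):
    """Run cmd, retrying only while it fails with the transient marker.

    Returns the exit code of the last run (0 on success).
    """
    last_code = 1
    for attempt in range(1, attempts + 1):
        # Capture raw bytes so Terraform's box-drawing characters cannot
        # trigger decoding errors on Windows consoles.
        result = subprocess.run(cmd, capture_output=True)
        sys.stdout.buffer.write(result.stdout)
        sys.stdout.buffer.write(result.stderr)
        sys.stdout.buffer.flush()

        last_code = result.returncode
        if last_code == 0:
            return 0

        transient = marker in result.stdout or marker in result.stderr
        if not transient or attempt == attempts:
            # Real errors (or exhausted retries) are passed through unchanged.
            return last_code

        print(f"attempt {attempt} hit a transient timeout, retrying in {delay_s}s")
        time.sleep(delay_s)
    return last_code
```

A hypothetical call would be `run_with_retry(["terraform", "apply", "-auto-approve"])`. Because the output is captured, it is only shown after each attempt finishes rather than streamed live, and non-timeout failures are deliberately not retried so that genuine configuration errors still surface immediately.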

From start to finish, it takes about 30 minutes to deploy the following:

  • project
  • sna
  • network setting and ip
  • sfs shares
  • ske
  • sql flex

Environment

  • OS: windows_amd64
  • Terraform version (see terraform --version): v1.11.4
  • Version of the STACKIT Terraform provider: v0.77.0

Additional information

Here are some more findings from other APIs.
I'm placing them here because they are of the same nature.

https://mssql-flex-service.api.stackit.cloud

stackit_public_ip.mqtt_dashboard_ip: Destroying... [id=xxxxxx,eu01,xxxxxx]
stackit_public_ip.mqtt_dashboard_ip: Destruction complete after 4s
stackit_sqlserverflex_instance.sql_instance: Destroying... [id=xxxxxx,eu01,xxxxxx]
stackit_sqlserverflex_instance.sql_instance: Still destroying... [id=xxxxxx,eu01,xxxxxx, 10s elapsed]
stackit_sfs_export_policy.rw_policy: Destroying... [id=xxxxxx,eu01,xxxxxx]
stackit_sfs_export_policy.rw_policy: Destruction complete after 0s
stackit_sfs_resource_pool.main_pool: Destroying... [id=xxxxxx,eu01,xxxxxx]
stackit_sfs_resource_pool.main_pool: Destruction complete after 0s
╷
│ Error: Error deleting instance
│
│ Calling API: Delete "https://mssql-flex-service.api.stackit.cloud/v2/projects/xxxxxx/regions/eu01/instances/xxxxxx": read tcp 172.30.1.127:10383->193.148.160.167:443: wsarecv: A
│ connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
╵

https://iaas.api.stackit.cloud

Error: Error creating network area region
│
│   with stackit_network_area_region.eu01_config,
│   on network.tf line 9, in resource "stackit_network_area_region" "eu01_config":
│    9: resource "stackit_network_area_region" "eu01_config" {
│
│ Calling API: Put "https://iaas.api.stackit.cloud/v2/organizations/xxxxxx/network-areas/xxxxxx/regions/eu01": read tcp
│ 172.30.1.127:10553->193.148.160.167:443: wsarecv: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because
│ connected host has failed to respond.
╵
