โŒ

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

golang: how to gracefully shutdown/interrupt for loop in k8s container

I am new to Go. I have a Go program that is essentially a for loop, like:

for {
  f()
}

f may take anywhere from a few seconds to a minute. The for loop runs forever and can be killed by a KILL signal.

In a Kubernetes container the program/process may be killed quite often, so I want the kill signal to interrupt the for loop only after function f has finished and before the next call starts.

How can I do this? Is there an example?

Thanks
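
For reference, a minimal sketch of one common pattern. It assumes f is just a placeholder for the real work and that the container is stopped with SIGTERM first (Kubernetes sends SIGTERM and only later SIGKILL; SIGKILL itself cannot be caught):

package main

import (
	"os"
	"os/signal"
	"syscall"
	"time"
)

// f stands in for the real work, which may take seconds to a minute.
func f() {
	time.Sleep(2 * time.Second)
}

func main() {
	// Buffered channel so a signal delivered while f is running is not dropped.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM, syscall.SIGINT)

	for {
		select {
		case <-stop:
			// A signal arrived between iterations: exit cleanly
			// without interrupting a running f.
			return
		default:
			f()
		}
	}
}

With this pattern, terminationGracePeriodSeconds in the pod spec should be at least as long as one run of f; otherwise the kubelet follows up the SIGTERM with a SIGKILL before the current iteration can finish.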

Kafka not able to connect to zookeeper with error "Timed out waiting for connection while in state: CONNECTING"

I am trying to run Kafka and ZooKeeper in Kubernetes pods.

Here is my zookeeper-service.yaml:

apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: zookeeper-svc
  name: zookeeper-svc
spec:
  ports:
  - name: "2181"
    port: 2181
    targetPort: 2181
  selector:
    io.kompose.service: zookeeper
status:
  loadBalancer: {}

Below is my zookeeper-deployment.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: zookeeper
  name: zookeeper
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: zookeeper
    spec:
      containers:
      - image: wurstmeister/zookeeper
        name: zookeeper
        ports:
        - containerPort: 2181
        resources: {}
      restartPolicy: Always
status: {}

kafka-deployment.yaml is as below:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert -f docker-compose.yml
    kompose.version: 1.1.0 (36652f6)
  creationTimestamp: null
  labels:
    io.kompose.service: kafka
  name: kafka
spec:
  replicas: 1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        io.kompose.service: kafka
    spec:
      containers:
      - env:
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: kafka
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zookeeper:2181
        - name: KAFKA_PORT
          value: "9092"
        - name: KAFKA_ZOOKEEPER_CONNECT_TIMEOUT_MS
          value: "60000"
        image: wurstmeister/kafka
        name: kafka
        ports:
        - containerPort: 9092
        resources: {}
      restartPolicy: Always
status: {}

I first create the ZooKeeper service and deployment. Once ZooKeeper is started and kubectl get pods shows it in the Running state, I start the Kafka deployment. The Kafka deployment keeps failing and restarting again and again (since its restartPolicy is Always). When I checked the logs from the Kafka container, I found that it is not able to connect to the ZooKeeper service and the connection times out. Here are the logs from the Kafka container:

[2018-09-03 07:06:06,670] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:230)
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
at kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:95)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1580)
at kafka.server.KafkaServer.kafka$server$KafkaServer$$createZkClient$1(KafkaServer.scala:348)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:372)
at kafka.server.KafkaServer.startup(KafkaServer.scala:202)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:75)
at kafka.Kafka.main(Kafka.scala)
[2018-09-03 07:06:06,671] INFO shutting down (kafka.server.KafkaServer)
[2018-09-03 07:06:06,673] WARN  (kafka.utils.CoreUtils$)
java.lang.NullPointerException
at kafka.server.KafkaServer$$anonfun$shutdown$5.apply$mcV$sp(KafkaServer.scala:579)
at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:86)
at kafka.server.KafkaServer.shutdown(KafkaServer.scala:579)
at kafka.server.KafkaServer.startup(KafkaServer.scala:329)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:75)
at kafka.Kafka.main(Kafka.scala)
[2018-09-03 07:06:06,676] INFO shut down completed (kafka.server.KafkaServer)
[2018-09-03 07:06:06,677] ERROR Exiting Kafka. (kafka.server.KafkaServerStartable)
[2018-09-03 07:06:06,678] INFO shutting down (kafka.server.KafkaServer)

What could be the reason for this, and what are possible solutions?

Edit: logs from zookeeper pod:

2018-09-03 10:32:39,562 [myid:] - INFO  [main:ZooKeeperServerMain@96] - Starting server
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:zookeeper.version=3.4.9-1757313, built on 08/23/2016 06:50 GMT
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:host.name=zookeeper-7594d99b-sgm6p
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:java.version=1.7.0_65
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:java.vendor=Oracle Corporation
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:java.home=/usr/lib/jvm/java-7-openjdk-amd64/jre
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:java.class.path=/opt/zookeeper-3.4.9/bin/../build/classes:/opt/zookeeper-3.4.9/bin/../build/lib/*.jar:/opt/zookeeper-3.4.9/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper-3.4.9/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper-3.4.9/bin/../lib/netty-3.10.5.Final.jar:/opt/zookeeper-3.4.9/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper-3.4.9/bin/../lib/jline-0.9.94.jar:/opt/zookeeper-3.4.9/bin/../zookeeper-3.4.9.jar:/opt/zookeeper-3.4.9/bin/../src/java/lib/*.jar:/opt/zookeeper-3.4.9/bin/../conf:
2018-09-03 10:32:39,567 [myid:] - INFO  [main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:java.compiler=<NA>
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:os.name=Linux
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:os.arch=amd64
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:os.version=4.15.0-20-generic
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:user.name=root
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:user.home=/root
2018-09-03 10:32:39,569 [myid:] - INFO  [main:Environment@100] - Server environment:user.dir=/opt/zookeeper-3.4.9
2018-09-03 10:32:39,570 [myid:] - INFO  [main:ZooKeeperServer@815] - tickTime set to 2000
2018-09-03 10:32:39,571 [myid:] - INFO  [main:ZooKeeperServer@824] - minSessionTimeout set to -1
2018-09-03 10:32:39,571 [myid:] - INFO  [main:ZooKeeperServer@833] - maxSessionTimeout set to -1
2018-09-03 10:32:39,578 [myid:] - INFO  [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181

Edit: startup logs from the Kafka container:

Excluding KAFKA_HOME from broker config
[Configuring] 'advertised.host.name' in '/opt/kafka/config/server.properties'
[Configuring] 'port' in '/opt/kafka/config/server.properties'
[Configuring] 'broker.id' in '/opt/kafka/config/server.properties'
Excluding KAFKA_VERSION from broker config
[Configuring] 'zookeeper.connect' in '/opt/kafka/config/server.properties'
[Configuring] 'log.dirs' in '/opt/kafka/config/server.properties'
[Configuring] 'zookeeper.connect.timeout.ms' in '/opt/kafka/config/server.properties'
[2018-09-05 10:47:22,036] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2018-09-05 10:47:23,145] INFO starting (kafka.server.KafkaServer)
[2018-09-05 10:47:23,148] INFO Connecting to zookeeper on zookeeper:2181 (kafka.server.KafkaServer)
[2018-09-05 10:47:23,288] INFO [ZooKeeperClient] Initializing a new session to zookeeper:2181. (kafka.zookeeper.ZooKeeperClient)
[2018-09-05 10:47:23,300] INFO Client environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 00:39 GMT (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,300] INFO Client environment:host.name=kafka-757dc6c47b-zpzfz (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,300] INFO Client environment:java.version=1.8.0_171 (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,301] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,301] INFO Client environment:java.home=/usr/lib/jvm/java-1.8-openjdk/jre (org.apache.zookeeper.ZooKeeper)
[2018-09-05 10:47:23,301] INFO Client environment:java.class.path=/opt/kafka/bin/../libs/activation-1.1.1.jar:/opt/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/opt/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka/bin/../libs/audience-annotations-0.5.0.jar:/opt/kafka/bin/../libs/commons-lang3-3.5.jar:/opt/kafka/bin/../libs/connect-api-2.0.0.jar:/opt/kafka/bin/../libs/connect-basic-auth-extension-2.0.0.jar:/opt/kafka/bin/../libs/connect-file-2.0.0.jar:/opt/kafka/bin/../libs/connect-json-2.0.0.jar:/opt/kafka/bin/../libs/connect-runtime-2.0.0.jar:/opt/kafka/bin/../libs/connect-transforms-2.0.0.jar:/opt/kafka/bin/../libs/guava-20.0.jar:/opt/kafka/bin/../libs/hk2-api-2.5.0-b42.jar:/opt/kafka/bin/../libs/hk2-locator-2.5.0-b42.jar:/opt/kafka/bin/../libs/hk2-utils-2.5.0-b42.jar:/opt/kafka/bin/../libs/jackson-annotations-2.9.6.jar:/opt/kafka/bin/../libs/jackson-core-2.9.6.jar:/opt/kafka/bin/../libs/jackson-databind-2.9.6.jar:/opt/kafka/bin/../libs/jackson-jaxrs-json-provider-2.9.6.jar:/opt/kafka/bin/../libs/jackson-module-jaxb-annotations-CR2.jar:/opt/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/kafka/bin/../libs/javax.inject-1.jar:/opt/kafka/bin/../libs/javax.inject-2.5.0-b42.jar:/opt/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/kafka/bin/../libs/javax.ws.rs-api-2.1.jar:/opt/kafka/bin/../libs/jaxb-api-2.3.0.jar:/opt/kafka/bin/../libs/jersey-client-2.27.jar:/opt/kafka/bin/../libs/jersey-common-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-core-2.27.jar:/opt/kafka/bin/../libs/jersey-hk2-2.27.jar:/opt/kafka/bin/../libs/jersey-media-jaxb-2.27.jar:/opt/kafka/bin/../libs/jersey-server-2.27.jar:/opt/kafka/bin/../libs/jetty-client-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-continuation-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-http-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-io-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-security-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-server-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-servlet-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-servlets-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jetty-util-9.4.11.v20180605.jar:/opt/kafka/bin/../libs/jopt-simple-5.0.4.jar:/opt/kafka/bin/../libs/kafka-clients-2.0.0.jar:/opt/kafka/bin/../libs/kafka-log4j-appender-2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-examples-2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-scala_2.11-2.0.0.jar:/opt/kafka/bin/../libs/kafka-streams-test-utils-2.0.0.jar:/opt/kafka/bin/../libs/kafka-tools-2.0.0.jar:/opt/kafka/bin/../libs/kafka_2.11-2.0.0-sources.jar:/opt/kafka/bin/../libs/kafka_2.11-2.0.0.jar:/opt/kafka/bin/../libs/log4j-1.2.17.jar:/opt/kafka/bin/../libs/lz4-java-1.4.1.jar:/opt/kafka/bin/../libs/maven-artifact-3.5.3.jar:/opt/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/kafka/bin/../libs/plexus-utils-3.1.0.jar:/opt/kafka/bin/../libs/reflections-0.9.11.jar:/opt/kafka/bin/../libs/rocksdbjni-5.7.3.jar:/opt/kafka/bin/../libs/scala-library-2.11.12.jar:/opt/kafka/bin/../libs/scala-logging_2.11-3.9.0.jar:/opt/kafka/bin/../libs/scala-reflect-2.11.12.jar:/opt/kafka/bin/../libs/slf4j-api-1.7.25.jar:/opt/kafka/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/kafka/bin/../libs/snappy-java-1.1.7.1.jar:/opt/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/opt/kafka/bin/../libs/zkclient-0.10.jar:/opt/kafka/bin/../libs/zookeeper-3.4.13.jar (org.apache.zookeeper.ZooKeeper)

The output of kubectl get svc -o wide is as follows:

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE       SELECTOR
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP    50m       <none>
zookeeper    ClusterIP   10.98.180.138   <none>        2181/TCP   48m       io.kompose.service=zookeeper

The output of kubectl get pods -o wide:

NAME                       READY     STATUS             RESTARTS   AGE       IP           NODE
kafka-757dc6c47b-zpzfz     0/1       CrashLoopBackOff   15         1h        10.32.0.17   administrator-thinkpad-l480
zookeeper-7594d99b-784n9   1/1       Running            0          1h        10.32.0.19   administrator-thinkpad-l480

Edit: output from kubectl describe pod kafka-757dc6c47b-zpzfz:

Name:           kafka-757dc6c47b-zpzfz
Namespace:      default
Node:           administrator-thinkpad-l480/10.11.17.86
Start Time:     Wed, 05 Sep 2018 16:17:06 +0530
Labels:         io.kompose.service=kafka
            pod-template-hash=3138727036
Annotations:    <none>
Status:         Running
IP:             10.32.0.17
Controlled By:  ReplicaSet/kafka-757dc6c47b
Containers:
  kafka:
   Container ID:docker://2bdc06d876ae23437c61f4e95539a67903cdb61e88fd9c68377b47c7705293a3
    Image:          wurstmeister/kafka
    Image ID:       docker-pullable://wurstmeister/kafka@sha256:2e3ff64e70ea983530f590282f36991c0a1b105350510f53cc3d1a0279b83c28
    Port:           9092/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 05 Sep 2018 17:29:06 +0530
      Finished:     Wed, 05 Sep 2018 17:29:14 +0530
    Ready:          False
    Restart Count:  18
    Environment:
      KAFKA_ADVERTISED_HOST_NAME:          kafka
      KAFKA_ZOOKEEPER_CONNECT:             zookeeper:2181
      KAFKA_PORT:                          9092
      KAFKA_ZOOKEEPER_CONNECT_TIMEOUT_MS:  160000
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-nhb9z (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-nhb9z:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-nhb9z
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
             node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason   Age                From                                  Message
  ----     ------   ----               ----                                  -------
  Warning  BackOff  3m (x293 over 1h)  kubelet, administrator-thinkpad-l480  Back-off restarting failed container

K8s FQDN EndpointSlice not recognized by ingress?

I need: ingress -> service (no label selectors) -> service (with label selectors) -> pods.

I am setting up a blue/green environment. Due to cross-app restrictions, I need an ingress that sends traffic to a service which forwards traffic to another service name that may change frequently (so I'm trying to avoid hardcoding any IP addresses). It also needs to support services in another namespace and, later, in another cluster altogether. I've read that I may need to manually create my own EndpointSlice for an ExternalName-type service to be usable by the ingress, BUT I am using the AWS ALB controller for ALB creation via the ingress annotations, and apparently it doesn't support ExternalName services. My ingress reports an "endpoints not found" error even after creating an EndpointSlice with the FQDN address type, regardless of whether the live-svc below is of type ExternalName or not.

Important: The service the external ingress sends traffic to also needs to receive traffic from other pods internally when referenced by its internal DNS name live-svc.myNamespace.svc.cluster.local. Traffic can't exit the cluster and come back in for inter-app communication, due to latency requirements. So users could access pods by going to app-live.mydomain.com, OR other apps within the cluster could reference the live-svc DNS name, and that traffic stays internal to the cluster.

Below is what I'm trying. The ingress should reach live-svc, which then directs traffic to secondaryService, which in turn directs traffic to pods based on label selectors. I think the EndpointSlice is what is causing it not to work. Do I have to create an Endpoints resource instead? I thought EndpointSlice was replacing Endpoints and is the recommended approach?

apiVersion: v1
kind: Service
metadata:
  name: live-svc
  namespace: myNamespace
spec:
  #type: ExternalName
  #externalName: secondaryService.myNamespace.svc.cluster.local #this secondary service does select pods successfully
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: live-svc
  namespace: myNamespace
  labels:
    kubernetes.io/service-name: live-svc
addressType: FQDN
ports:
  - name: https
    protocol: TCP
    port: 443
endpoints:
  - addresses:
      - secondaryService.myNamespace.svc.cluster.local
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    ...
  name: app-public
  namespace: myNamespace
spec:
  rules:
  - host: app-live.mydomain.com
    http:
      paths:
      - backend:
          service:
            name: live-svc
            port:
              name: https
        path: /
        pathType: Prefix

kubernetes install error on Ubuntu 20 while using deb http://apt.kubernetes.io/ kubernetes-xenial main

Err:10 https://packages.cloud.google.com/apt kubernetes-xenial Release
  404 Not Found [IP: 142.250.67.46 443]
Reading package lists... Done
E: The repository 'http://apt.kubernetes.io kubernetes-xenial Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

deb http://apt.kubernetes.io/ kubernetes-xenial main

That is the repository line I am using. How can I fix this Kubernetes install error?

Unable to establish connection between Elasticsearch and Curator

Both my Elasticsearch and Curator are deployed in the same Kubernetes cluster and the same namespace. I have set the Elasticsearch hostname, port, and HTTP auth (username and password) in the Curator config file for connecting to Elasticsearch. But when I run the Curator job, I can see it defaulting to localhost. I then tried connecting using the cluster DNS name, but the result is the same.

Note: my Elasticsearch is exposed as a NodePort service, and I have set the hostname accordingly.

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/elastic_transport/_transport.py", line 342, in perform_request
  File "/usr/local/lib/python3.11/site-packages/elastic_transport/_node/_http_urllib3.py", line 202, in perform_request
elastic_transport.ConnectionError: Connection error caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f5c78b74cd0>: Failed to establish a new connection: [Errno 111] Connection refused)
2024-04-21 09:05:18,198 INFO      GET http://127.0.0.1:9200/ [status:N/A duration:0.000s]
2024-04-21 09:05:18,198 WARNING   Node <Urllib3HttpNode(http://127.0.0.1:9200)> has failed for 4 times in a row, putting on 8 second timeout
2024-04-21 09:05:18,198 CRITICAL  Unable to establish client connection to Elasticsearch!
2024-04-21 09:05:18,198 CRITICAL  Exception encountered: Connection error caused by: ConnectionError(Connection error caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x7f5c79554390>: Failed to establish a new connection: [Errno 111] Connection refuse

client:
      hosts: 
        - <my_cluster_master_hostname>
      port: <nodeport>
      url_prefix:
      use_ssl: False
      certificate:
      client_cert:
      client_key:
      ssl_no_validate: False
      http_auth: "username:password"
      timeout: 30
      master_only: False

Note:

Elasticsearch version: 7.17.9
Curator version: curator:8.0.14

I would like to connect from Curator to the Elasticsearch deployed inside my cluster and manage the indices, but every time it defaults to localhost, where Elasticsearch is not deployed, so I am unable to manage the indices.

  1. I have tried explicitly configuring the host and port along with credentials in the curator config file.
  2. I also tried to connect using Kubernetes DNS Service Discovery.

Neither of these methods worked.

Note: I am able to connect to my Elasticsearch from an Elasticsearch viewer using the same URL and authentication, and I can also hit it in the browser and get the expected response.

kubernetes nodeport service not responding

Here is my deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: some-api
  template:
    metadata:
      labels:
        app: some-api
    spec:
      containers:
      - name: some-api-local
        image: "ealen/echo-server:0.9.2"
        ports:
        - containerPort: 80
      imagePullSecrets:
      - name: registrycred

---
apiVersion: v1
kind: Service
metadata:
  name: some-service
spec:
  type: NodePort
  ports:
  - name: http
    port: 5001
    targetPort: 80
    nodePort: 30002
  selector:
    app: some-api

The actual service in the pod is running on port 80, as its logs show (log screenshot omitted).

I am just experimenting with Kubernetes cluster deployments. I am trying to access a simple echo server via the service above, like this:

10.109.89.108:30002

Output of: k get nodes --output wide

(screenshot of the node list omitted)

The IP above is the cluster IP that I got from kubectl get service. What am I doing wrong?

I tried to access it as below:

  1. 192.168.49.2:30002 (using the node IP)
  2. 10.109.89.108:30002 (using the cluster IP)

Kubectl command throws error when executed from a Python script but manual execution works fine

I'm trying to execute kubectl commands from Python:

process = subprocess.run(["kubectl", "get", "pods", "-n", "argocd"])

which throws this error:

__main__ - INFO - 
E0328 08:18:04.126243   25976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0328 08:18:04.126801   25976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0328 08:18:04.129854   25976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0328 08:18:04.130344   25976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0328 08:18:04.131804   25976 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
__main__ - ERROR - Failed to get po.
NoneType: None

But when I execute the same command manually on the master node, I see the output without any issues.

Can someone help me figure out what the issue is here?

I need to migrate On-prem and AWS infra to Azure Cloud [closed]

Please advise whether my plan order and tooling stack are good:

  • I am not sure of the best practice for migrating Kubernetes without downtime.

Existing infrastructure:

  • Fortinet Firewall – in AWS and on-premises
  • MySQL on RDS
  • Two Kubernetes clusters: EKS and Kubernetes on-premises
  • For logging in the on-premises environment, they are using Elasticsearch
  • Spark & Hadoop – ML on-premises

I need it to be cost-optimized and low-latency.

My plan:

  1. Database: Use Azure Database Migration Service to migrate SQL databases from RDS to Azure SQL.
  2. Monitoring: Migrate from Elasticsearch to Azure Monitor and Azure Log Analytics for integrated logging and monitoring. Consider using Azure Data Explorer for high-volume log analytics. Utilize Azure Sentinel for security information and event management (SIEM) to enhance security posture.
  3. Firewall migration: Replace Fortinet firewalls with Azure Firewall, a managed cloud-based network security service. Additionally, use Azure Network Security Groups (NSGs) for micro-segmentation. Consider implementing Azure Firewall Manager to centrally manage multiple Azure Firewall instances.
  4. Spark & Hadoop migration: Migrate ML workloads to Azure Databricks for Spark-based analytics and AI. Azure Databricks provides a fast, easy, and collaborative Apache Spark-based analytics platform. For Hadoop workloads, consider Azure HDInsight, a cloud service that makes it easy, fast, and cost-effective to process massive amounts of data.
  5. Network optimization: Use Azure ExpressRoute for private connections between Azure datacenters and on-premises infrastructure (or infrastructure in other clouds) to reduce latency.
  6. Scalability and high availability: Consider using Azure Front Door or Azure Traffic Manager for global load balancing and to ensure low latency across geographic regions.

I tried to find tools other than Azure-native ones.

image attestation using Kyverno not working

I am writing a simple test to verify that Kyverno is able to block images without attestation from being deployed to a k3s cluster.

https://github.com/whoissqr/cg-test-keyless-sign

I have the following ClusterPolicy:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-image-keyless
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: check-image-keyless
      match:
        any:
        - resources:
            kinds:
              - Pod
      verifyImages:
      - imageReferences:
        - "ghcr.io/whoissqr/cg-test-keyless-sign"
        attestors:
        - entries:
          - keyless:
              subject: "https://github.com/whoissqr/cg-test-keyless-sign/.github/workflows/main.yml@refs/heads/main"
              issuer: "https://token.actions.githubusercontent.com"
              rekor:
                url: https://rekor.sigstore.dev

and the following pod YAML:

apiVersion: v1
kind: Pod
metadata:
  name: cg
  namespace: app
spec:
  containers:
    - image: ghcr.io/whoissqr/cg-test-keyless-sign
      name: cg-test-keyless-sign
      resources: {}

I purposely commented out the cosign signing step in the GitHub Action so that cosign verify fails as expected, but the pod deployment to k3s still succeeds. What am I missing here? Here is the workflow:

name: Publish and Sign Container Image

on:
  schedule:
    - cron: '32 11 * * *'
  push:
    branches: [ main ]
    # Publish semver tags as releases.
    tags: [ 'v*.*.*' ]
  pull_request:
    branches: [ main ]

jobs:
  build:

    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      id-token: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Install cosign
        uses: sigstore/[email protected]
          
      - name: Check install!
        run: cosign version
        
      - name: Setup Docker buildx
        uses: docker/setup-buildx-action@v2

      - name: Log into ghcr.io
        uses: docker/login-action@master
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push container image
        id: push-step
        uses: docker/build-push-action@master
        with:
          push: true
          tags: ghcr.io/${{ github.repository }}:latest

      - name: Sign the images with GitHub OIDC Token
        env:
          DIGEST: ${{ steps.push-step.outputs.digest }}
          TAGS: ghcr.io/${{ github.repository }}
          COSIGN_EXPERIMENTAL: "true"
        run: |
          echo "dont sign image"
          # cosign sign --yes "${TAGS}@${DIGEST}"
        
      - name: Verify the images
        run: |
          cosign verify ghcr.io/whoissqr/cg-test-keyless-sign \
             --certificate-identity https://github.com/whoissqr/cg-test-keyless-sign/.github/workflows/main.yml@refs/heads/main \
             --certificate-oidc-issuer https://token.actions.githubusercontent.com | jq

      - name: Create k3s cluster
        uses: debianmaster/actions-k3s@master
        id: k3s
        with:
          version: 'latest'
          
      - name: Install Kyverno chart
        run: |
          helm repo add kyverno https://kyverno.github.io/kyverno/
          helm repo update
          helm install kyverno kyverno/kyverno -n kyverno --create-namespace

      - name: Apply image attestation policy
        run: |
          kubectl apply -f ./k3s/policy-check-image-keyless.yaml
          
      - name: Deploy pod to k3s
        run: |
          set -x
          # kubectl get nodes
          kubectl create ns app
          sleep 20
          # kubectl get pods -n app
          kubectl apply -f ./k3s/pod.yaml
          kubectl -n app wait --for=condition=Ready pod/cg
          kubectl get pods -n app
          kubectl -n app describe pod cg
          kubectl get polr -o wide

      - name: Install Kyverno CLI
        uses: kyverno/[email protected]
        with:
          release: 'v1.9.5'
          
      - name: Check policy using Kyverno CLI
        run: |
          kyverno version
          kyverno apply ./k3s/policy-check-image-keyless.yaml --cluster -v 10

In the GitHub Actions console:

+ kubectl apply -f ./k3s/pod.yaml
pod/cg created
+ kubectl -n app wait --for=condition=Ready pod/cg
pod/cg condition met
+ kubectl get pods -n app
NAME   READY   STATUS    RESTARTS   AGE
cg     1/1     Running   0          12s

and the Kyverno CLI output contains:

I0225 10:00:31.650505    6794 common.go:424]  "msg"="applying policy on resource" "policy"="check-image-keyless" "resource"="app/Pod/cg"
I0225 10:00:31.652646    6794 context.go:278]  "msg"="updated image info" "images"={"containers":{"cg-test-keyless-sign":{"registry":"ghcr.io","name":"cg-test-keyless-sign","path":"whoissqr/cg-test-keyless-sign","tag":"latest"}}}
I0225 10:00:31.654017    6794 utils.go:29]  "msg"="applied JSON patch" "patch"=[{"op":"replace","path":"/spec/containers/0/image","value":"ghcr.io/whoissqr/cg-test-keyless-sign:latest"}]
I0225 10:00:31.659697    6794 mutation.go:39] EngineMutate "msg"="start mutate policy processing" "kind"="Pod" "name"="cg" "namespace"="app" "policy"="check-image-keyless" "startTime"="2024-02-25T10:00:31.659674165Z"
I0225 10:00:31.659737    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.659815    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.659834    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.659940    6794 mutation.go:379] EngineMutate "msg"="finished processing policy" "kind"="Pod" "mutationRulesApplied"=0 "name"="cg" "namespace"="app" "policy"="check-image-keyless" "processingTime"="249.225ยตs"
I0225 10:00:31.659966    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.660040    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.660059    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.660153    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.660218    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.660236    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.660337    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.660402    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.660421    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.660648    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"pods","singularName":"pod","namespaced":true,"version":"v1","kind":"Pod","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["po"],"categories":["all"]} "kind"="Pod"
I0225 10:00:31.660729    6794 imageVerify.go:121] EngineVerifyImages "msg"="processing image verification rule" "kind"="Pod" "name"="cg" "namespace"="app" "policy"="check-image-keyless" "ruleSelector"="All"
I0225 10:00:31.660889    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"daemonsets","singularName":"daemonset","namespaced":true,"group":"apps","version":"v1","kind":"DaemonSet","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["ds"],"categories":["all"]} "kind"="DaemonSet"
I0225 10:00:31.661037    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"deployments","singularName":"deployment","namespaced":true,"group":"apps","version":"v1","kind":"Deployment","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["deploy"],"categories":["all"]} "kind"="Deployment"
I0225 10:00:31.661184    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"jobs","singularName":"job","namespaced":true,"group":"batch","version":"v1","kind":"Job","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"categories":["all"]} "kind"="Job"
I0225 10:00:31.661327    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"statefulsets","singularName":"statefulset","namespaced":true,"group":"apps","version":"v1","kind":"StatefulSet","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["sts"],"categories":["all"]} "kind"="StatefulSet"
I0225 10:00:31.661465    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"replicasets","singularName":"replicaset","namespaced":true,"group":"apps","version":"v1","kind":"ReplicaSet","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["rs"],"categories":["all"]} "kind"="ReplicaSet"
I0225 10:00:31.661606    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"replicationcontrollers","singularName":"replicationcontroller","namespaced":true,"version":"v1","kind":"ReplicationController","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["rc"],"categories":["all"]} "kind"="ReplicationController"
I0225 10:00:31.661789    6794 validation.go:591] EngineVerifyImages "msg"="resource does not match rule" "kind"="Pod" "name"="cg" "namespace"="app" "policy"="check-image-keyless" "reason"="rule autogen-check-image-keyless not matched:\n 1. no resource matched"
I0225 10:00:31.661938    6794 discovery.go:269] dynamic-client "msg"="matched API resource to kind" "apiResource"={"name":"cronjobs","singularName":"cronjob","namespaced":true,"group":"batch","version":"v1","kind":"CronJob","verbs":["create","delete","deletecollection","get","list","patch","update","watch"],"shortNames":["cj"],"categories":["all"]} "kind"="CronJob"
I0225 10:00:31.662056    6794 validation.go:591] EngineVerifyImages "msg"="resource does not match rule" "kind"="Pod" "name"="cg" "namespace"="app" "policy"="check-image-keyless" "reason"="rule autogen-cronjob-check-image-keyless not matched:\n 1. no resource matched"
I0225 10:00:31.662091    6794 imageVerify.go:83] EngineVerifyImages "msg"="processed image verification rules" "applied"=0 "kind"="Pod" "name"="cg" "namespace"="app" "policy"="check-image-keyless" "successful"=true "time"="1.748335ms"
I0225 10:00:31.662113    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.662189    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.662208    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.662302    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.662368    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.662385    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.662481    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"
I0225 10:00:31.662544    6794 rule.go:286] autogen "msg"="generating rule for cronJob" 
I0225 10:00:31.662577    6794 rule.go:233] autogen "msg"="processing rule" "rulename"="check-image-keyless"

thanks!

Edit max_conns in Kubernetes Nginx ingress?

I'm trying to limit the number of concurrent connections to upstream servers in my Nginx ingress.

Is max_conns supported in the Nginx ingress? How can I edit or add it?

max_conns=number limits the maximum number of simultaneous active connections to the proxied server (1.11.5). Default value is zero, meaning there is no limit. If the server group does not reside in the shared memory, the limitation works per each worker process.

http://nginx.org/en/docs/http/ngx_http_upstream_module.html#upstream

Example of an Nginx conf using max_conns:

upstream backend {
    server backend1.example.com max_conns=3;
    server backend2.example.com;
}

thanks

Get hostname instead of K8 service name

I have a .NET app that has been dockerized and is running in a Kubernetes cluster behind a gateway. From within this .NET app, I'm trying to find out the "domain name" the user made the request to.

All my attempts so far give me back only the internal service name configured in Kubernetes.

Kubernetes service name: testservice-svc; actual domain: test.com

In my .NET application I get the Kubernetes service name when I read HttpContext.Request.Host.Value. How can I get the actual domain?


โŒ
โŒ