โŒ

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

YOLOv8 on Android with Kotlin

I put this code together by combining a few examples, and I expect it to run with the Android/mobile camera. The camera works, but the bounding-box code is the problem: when I run the app, it keeps stopping/shutting down. Can anyone tell me how to fix this code?

val outputs = model.process(inputFeature0)
val outputFeature0 = outputs.outputFeature0AsTensorBuffer.floatArray

val mutable = bitmap.copy(Bitmap.Config.ARGB_8888, true)
val canvas = Canvas(mutable)

val h = bitmap.height
val w = bitmap.width
paint.textSize = h / 15f
paint.strokeWidth = h / 85f

// the error part :
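// Note: the loop below assumes every element of outputFeature0 is a score whose box
// coordinates sit at index*4 .. index*4+3. If the tensor is laid out differently
// (e.g. a raw YOLOv8 export such as [1, 84, 8400]), x + 3 can run past the end of
// the array, and colors[index] / labels[...] can go out of range, which would crash
// the app with an IndexOutOfBoundsException.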
var x = 0
outputFeature0.forEachIndexed { index, fl ->
    x = index * 4
    if (fl > 0.5) {
        paint.color = colors[index]
        paint.style = Paint.Style.STROKE
        canvas.drawRect(
            RectF(
                outputFeature0[x + 1] * w, outputFeature0[x] * h,
                outputFeature0[x + 3] * w, outputFeature0[x + 2] * h
            ),
            paint
        )
        paint.style = Paint.Style.FILL
        canvas.drawText(
            labels[outputFeature0[index].toInt()] + " " + fl.toString(),
            outputFeature0[x + 1] * w, outputFeature0[x] * h, paint
        )
    }
}

imageView.setImageBitmap(mutable)

How to use PTQ/QAT/INT8 in YOLO-NAS for object detection?

I've custom-trained my model and applied PTQ and QAT following the tutorial.

After this, I got some log files, a .pth file, and the PTQ/QAT ONNX files as output, just as in the tutorial. At the bottom, the tutorial says the QAT ONNX file needs to be converted to an INT8 TensorRT engine, so I converted it with the command trtexec --fp16 --int8 --onnx=model.onnx --saveEngine=model.trt

Now I have my TensorRT file (in .trt format).

Here are my questions:

  1. How do I do object detection with a TensorRT file? (See the sketch I pasted after these questions.)
  2. I already have a .pth file from the PTQ & QAT training output. Why can't I use that .pth file directly for object detection, the same way I use the .pth file from a normal non-PTQ/QAT training with the code below?
from super_gradients.common.object_names import Models
from super_gradients.training import models

model = models.get(Models.YOLO_NAS_M,
                   checkpoint_path="yolonas-m/ckpt_best.pth",
                   num_classes=1)
predictions = model.predict("23.jpg")
predictions.show(show_confidence=False)

If I use the .pth file from the PTQ & QAT output for object detection, I get the error message below:

ValueError                                Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/super_gradients/training/utils/checkpoint_utils.py in __call__(self, model_state_dict, checkpoint_state_dict)
    198
    199             if ckpt_val.shape != model_val.shape:
--> 200                 raise ValueError(f"ckpt layer {ckpt_key} with shape {ckpt_val.shape} does not match {model_key}" f" with shape {model_val.shape} in the model")
    201             new_ckpt_dict[model_key] = ckpt_val
    202         return new_ckpt_dict

ValueError: ckpt layer backbone.stem.conv.post_bn.weight with shape torch.Size([48]) does not match backbone.stem.conv.branch_3x3.conv.weight with shape torch.Size([48, 3, 3, 3]) in the model
  3. If there's really no way to use the PTQ & QAT INT8 TensorRT file or the .pth file, can I use the ONNX file that is generated during PTQ & QAT for object detection instead? If so, how?
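
For question 1, this is the rough, untested sketch I've pieced together from the TensorRT Python samples. It assumes the TensorRT 8.x Python bindings and pycuda are installed, that the engine has a static 1x3x640x640 input, and that resizing to 640x640 and dividing by 255 is the right preprocessing; I'm not sure any of that matches what YOLO-NAS actually expects.

import cv2
import numpy as np
import pycuda.autoinit  # creates a CUDA context as a side effect
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine saved by trtexec
with open("model.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate host/device buffers for every binding (assumes a static-shape engine)
inputs, outputs, bindings = [], [], []
for name in engine:
    shape = engine.get_binding_shape(name)
    dtype = trt.nptype(engine.get_binding_dtype(name))
    host_mem = cuda.pagelocked_empty(trt.volume(shape), dtype)
    device_mem = cuda.mem_alloc(host_mem.nbytes)
    bindings.append(int(device_mem))
    (inputs if engine.binding_is_input(name) else outputs).append((host_mem, device_mem))

# Preprocess one image: 640x640, BGR -> RGB, scale to [0, 1], NCHW float32 (all assumptions)
img = cv2.imread("23.jpg")
blob = cv2.resize(img, (640, 640))[:, :, ::-1].astype(np.float32) / 255.0
blob = np.ascontiguousarray(blob.transpose(2, 0, 1)[None])
np.copyto(inputs[0][0], blob.ravel())

# Copy the input to the GPU, run the engine, copy the outputs back
stream = cuda.Stream()
cuda.memcpy_htod_async(inputs[0][1], inputs[0][0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for host_mem, device_mem in outputs:
    cuda.memcpy_dtoh_async(host_mem, device_mem, stream)
stream.synchronize()

# The raw output tensors are now in outputs[i][0]; decoding boxes/scores (and NMS, if it
# was not baked into the exported ONNX) still has to match how the model was exported.
for i, (host_mem, _) in enumerate(outputs):
    print(f"output {i}: {host_mem.size} values, first few: {host_mem[:8]}")

And for question 3, my understanding is that the QAT ONNX file could also be run directly with onnxruntime, along these lines (again untested; the file name and the preprocessing are guesses):

import cv2
import numpy as np
import onnxruntime as ort

# Load the QAT ONNX exported during the PTQ/QAT run (file name is a guess)
session = ort.InferenceSession("model_qat.onnx",
                               providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Same assumed preprocessing: 640x640, RGB, scaled to [0, 1], NCHW float32
img = cv2.imread("23.jpg")
blob = cv2.resize(img, (640, 640))[:, :, ::-1].astype(np.float32) / 255.0
blob = np.ascontiguousarray(blob.transpose(2, 0, 1)[None])

outputs = session.run(None, {input_name: blob})
for i, out in enumerate(outputs):
    print(f"output {i}: shape {out.shape}")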

I'm a newbie discovering YOLO-NAS. Thanks, everyone.

YOLO-NAS can't do detection if I use OpenCV to open the video

I want to use OpenCV for video detection with the YOLO-NAS algorithm on my personal PC (super-gradients 3.1.3, GPU-enabled torch 1.13.1, CUDA 11.7, and cuDNN are already installed), but it doesn't work: YOLO-NAS makes no predictions.

If I use the YOLO-NAS function (model.predict) directly, it does make predictions, but only for images, not video. The weights I use (the ckpt file) were trained for 100 epochs and work well when I use the YOLO-NAS function on images.

But if I run the same code in my Google Colab, the model does make predictions. So what's wrong: my code or my Anaconda virtual environment?

This is my code:

import torch
from super_gradients.training import models
import argparse
import cv2
import numpy as np

def detect_objects(video):
    # if CUDA is available, run the prediction on the GPU; otherwise fall back to the CPU
    device = 0 if torch.cuda.is_available() else "cpu"

    # Define our classes
    dataset_params = {
        'classes': ['fire', 'smoke', 'others']
    }

    # load the model and weights
    model = models.get("yolo_nas_s",
                       num_classes=len(dataset_params['classes']),
                       checkpoint_path="checkpoints/train_3/ckpt_best.pth").to(device)

    cap = cv2.VideoCapture(video)
    
    while cap.isOpened():
        ret, frame = cap.read()

        if ret:
            
            frame = cv2.resize(frame, (1280, 720))
            og_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            frame = og_frame.copy()
            
            result = list(model.predict(frame, conf = 0.3, fuse_model = False)._images_prediction_lst)
            
            bboxes_xyxy = result[0].prediction.bboxes_xyxy.tolist()
            confidence = result[0].prediction.confidence.tolist()
            labels = result[0].prediction.labels.astype(int).tolist()

            # print(result, end ='/r')
            
            for i in range(len(labels)):
                x1, y1, x2, y2 = map(int, bboxes_xyxy[i])

                w = x2 - x1
                h = y2 - y1
                
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                cv2.rectangle(frame, (x1 , y1), (x1 + 120, y1+30), (0,255,0), -1)
                
                conf = np.round(confidence[i], 2)
                text = dataset_params['classes'][labels[i]] + ' : ' + str(conf)
                cv2.putText(frame, text, (x1 + 15, y1 + 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 2)
                
            frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
            cv2.imshow("frame", frame)

            if cv2.waitKey(1) & 0xFF == ord('q'):
                break

        else:
            # ret is False: the video ended or the frame read failed, so stop instead of looping forever
            break

    cap.release()
    cv2.destroyAllWindows()

    del model
    torch.cuda.empty_cache()

if __name__ == "__main__":

    video = 'testing.mp4'
    detect_objects(video)
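
To narrow down whether the problem is the model itself or my video loop, I'm also thinking of a single-frame sanity check like the sketch below (untested; it uses the same checkpoint, classes, conf threshold, and private _images_prediction_lst access as the code above):

import cv2
import torch
from super_gradients.training import models

# Same model and checkpoint as in detect_objects() above
device = 0 if torch.cuda.is_available() else "cpu"
model = models.get("yolo_nas_s",
                   num_classes=3,  # fire, smoke, others
                   checkpoint_path="checkpoints/train_3/ckpt_best.pth").to(device)

# Grab one frame from the same video and run predict() on it directly
cap = cv2.VideoCapture("testing.mp4")
ret, frame = cap.read()
cap.release()
assert ret, "could not read a frame from testing.mp4"

frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
result = list(model.predict(frame_rgb, conf=0.3, fuse_model=False)._images_prediction_lst)[0]

# If this prints zero boxes, the problem is the model/environment rather than the drawing loop
print("boxes found:", len(result.prediction.labels))
print("labels:", result.prediction.labels)
print("confidences:", result.prediction.confidence)

My thinking is: if that single frame does produce boxes, the model is fine and the issue is somewhere in the while loop or the display code; if it doesn't, it points at the checkpoint or my local environment.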
โŒ
โŒ