โŒ

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

The LSTM training model predicts results that consistently form a horizontal line without any amplitude

The code of the LSTM model:

class Net(nn.Module):
    def __init__(self,input_size,hidden_size,num_layers,output_size,batch_size,seq_length) -> None:
        super(Net,self).__init__()
        self.input_size=input_size
        self.hidden_size=hidden_size
        self.num_layers=num_layers
        self.output_size=output_size
        self.batch_size=batch_size
        self.seq_length=seq_length
        self.num_directions=1 
        self.liner1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.liner2 = nn.Linear(num_layers*hidden_size, output_size)
        self.dropout = nn.Dropout(0.2)
 
        self.lstm=nn.LSTM(input_size=hidden_size,hidden_size=hidden_size,num_layers=num_layers,batch_first=True,dropout=0.2) # LSTM
 
    def forward(self,x):
        batchsize = x.shape[0]
        x = self.liner1(x)
        x = self.relu(x)
 
        h_0 = torch.randn(self.num_directions * self.num_layers, x.size(0), self.hidden_size).to('cuda')
        c_0 = torch.randn(self.num_directions * self.num_layers, x.size(0), self.hidden_size).to('cuda')
 
        output, (h_n, c_n) = self.lstm(x, (h_0, c_0)) 
 
        output = h_n.permute(1,0,2).reshape(batchsize, -1) # 64,10,32  -> 64,32*2
        pred = self.dropout(output)
        pred = self.liner2(pred)     
        pred = pred[:, -1]     # (batch_size,)
        return pred         
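For reference, a minimal sketch of an alternative readout that is commonly used with batch_first LSTMs: take the output at the last time step instead of reshaping h_n across layers. This is only a sketch under the assumption that liner2 maps hidden_size to output_size; it is not necessarily the fix for the flat predictions.

# Sketch of an alternative forward(), assuming liner2 = nn.Linear(hidden_size, output_size)
def forward(self, x):
    x = self.relu(self.liner1(x))
    output, (h_n, c_n) = self.lstm(x)       # default zero initial hidden/cell states
    last = output[:, -1, :]                 # (batch_size, hidden_size), last time step
    pred = self.liner2(self.dropout(last))  # (batch_size, output_size)
    return pred.squeeze(-1)                 # (batch_size,)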

The code of the Config:

    # parameter settings
    seq_length = 10  # time step (window length)
    input_size = 3  # was originally 3, now 5; the postcode and time columns were dropped
    num_layers = 2 # 4
    hidden_size = 128 # 512??
    batch_size = 64
    n_iters = 10000 # 50000 5000
    lr = 0.0001
    output_size = 1
    split_ratio = 0.9
    path = 'data/raw_sales.csv'
    moudle = Net(input_size, hidden_size, num_layers, output_size, batch_size, seq_length)
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(moudle.parameters(), lr=lr)
    scaler = MinMaxScaler()

And the result picture: (image)

The total data is about 10 thousand (10,000) rows. So here is the question: I have changed the hidden_size and seq_size (window_size), but the plot barely changed; the line is always horizontal.

1. Changed the hidden_size from 3 to 128, 256, 512

2. Changed the window_size from 3 to 7, 10, 30

3. Changed the number of iterations to a larger value

Expected: the line should have some amplitude; for example, it should slope instead of staying flat.

Update (4/20 23:30): the loss plot:

(loss plot image)

The dataset link:

https://www.kaggle.com/datasets/htagholdings/property-sales/data

I use the raw data instead of the processed data.

I have tried the processed data, which is shorter and flatter than the raw data, and the result seems fine. I guess it's because the raw data fluctuates constantly, which makes it unsuitable for prediction.

The new result plot is here:

(image)

Update (4/22 0:30)

I have resampled the data monthly and quarterly, and the result now looks much better: (image)
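For reference, a minimal resampling sketch with pandas; the column names (datesold, price) are assumptions based on the Kaggle raw_sales.csv file and may need adjusting.

import pandas as pd

# Aggregate the noisy daily sales into a smoother monthly (or quarterly) series
# before building the LSTM training windows.
df = pd.read_csv("data/raw_sales.csv", parse_dates=["datesold"])
monthly = (
    df.set_index("datesold")["price"]
      .resample("M")      # use "Q" for quarterly
      .mean()
      .dropna()
)
print(monthly.head())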

How to add new tokens to an existing Huggingface tokenizer?

How to add new tokens to an existing Huggingface AutoTokenizer?

Canonically, there's this tutorial from Huggingface https://huggingface.co/learn/nlp-course/chapter6/2 but it ends on the note of "quirks when using existing tokenizers". It then points to the train_new_from_iterator() function in Chapter 7, but I can't seem to find a reference to how to use it to extend the tokenizer without re-training it.

I've tried the solution from Training New AutoTokenizer Hugging Face that uses train_new_from_iterator(), but that re-trains a tokenizer rather than extending the existing one; it replaces the existing token indices.

import pandas as pd
from transformers import AutoTokenizer

def batch_iterator(batch_size=3, size=8):
        df = pd.DataFrame({"note_text": ['foobar', 'helloworld']})
        for x in range(0, size, batch_size):
            yield df['note_text'].to_list()

old_tokenizer = AutoTokenizer.from_pretrained('roberta-base')
training_corpus = batch_iterator()
new_tokenizer = old_tokenizer.train_new_from_iterator(training_corpus, 32000)

print(len(old_tokenizer))
print(old_tokenizer( ['foobarzz', 'helloworld'] ))
print(new_tokenizer( ['foobarzz', 'hello world'] ))

[out]:

50265
{'input_ids': [[0, 21466, 22468, 7399, 2], [0, 20030, 1722, 39949, 2]], 'attention_mask': [[1, 1, 1, 1, 1], [1, 1, 1, 1, 1]]}
{'input_ids': [[0, 275, 2], [0, 276, 2]], 'attention_mask': [[1, 1, 1], [1, 1, 1]]}

Note: the new tokens start from 275 and 276 because ids 0-274 are reserved tokens.

The expected behavior of new_tokenizer( ['foo bar', 'hello word'] ) is to have IDs beyond the tokenizer vocab size (i.e. 50265 for the roberta-base model) and it should look like this:

{'input_ids': [[0, 50265, 2], [0, 50266, 2]], 'attention_mask': [[1, 1, 1], [1, 1, 1]]}
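For reference, a minimal sketch of extending an existing tokenizer instead of re-training it: add_tokens() appends new entries after the current vocabulary, and resize_token_embeddings() gives the model embeddings for them. The example tokens are placeholders.

from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# New tokens get ids starting at len(tokenizer), i.e. 50265 for roberta-base.
num_added = tokenizer.add_tokens(["foobarzz", "helloworld"])  # placeholder tokens
print(num_added, len(tokenizer))

# Resize the embedding matrix so the new ids have (randomly initialised) vectors.
model.resize_token_embeddings(len(tokenizer))

print(tokenizer(["foobarzz", "helloworld"])["input_ids"])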

'Chroma' object has no attribute 'persist'

I'm persisting the Chroma Database but it's giving me an error.

I'm basically redoing what's in this link.

https://github.com/hwchase17/chroma-langchain/blob/master/persistent-qa.ipynb

Has there been an update to the chromadb version in which they removed persist? I don't get it.

!pip -q install chromadb openai langchain tiktoken

!pip install -q langchain-chroma

!pip install -q langchain_chroma  langchain_openai langchain_community

from langchain_chroma import Chroma
from langchain_openai import OpenAI
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain_community.document_loaders import DirectoryLoader

persist_directory ='db'

embedding = OpenAIEmbeddings()

vectordb = Chroma.from_documents(documents=texts,
                                 embedding=embedding,
                                 persist_directory=persist_directory)

vectordb.persist()

Then I'm getting the below error:


AttributeError Traceback (most recent call last) Cell In[47], line 1 1 vectordb.persist()

AttributeError: 'Chroma' object has no attribute 'persist'
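For reference, a minimal sketch assuming the newer langchain-chroma package (Chroma >= 0.4), where writes are persisted automatically once persist_directory is set, so no persist() call is needed; the placeholder documents stand in for the texts built earlier in the notebook.

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document

embedding = OpenAIEmbeddings()

# Placeholder documents; in the notebook `texts` comes from the loaders/splitter.
texts = [Document(page_content="hello world"), Document(page_content="goodbye world")]

# Persistence happens automatically because persist_directory is provided.
vectordb = Chroma.from_documents(
    documents=texts,
    embedding=embedding,
    persist_directory="db",
)

# Later (or in a new process): reload the persisted collection from disk.
vectordb2 = Chroma(persist_directory="db", embedding_function=embedding)
print(vectordb2.similarity_search("hello", k=1))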

DeepSpeed install error, help! with torch error

I'm a beginner coder studying AI. I'm studying KoreanLM on GitHub.

Actually, it's my first time using GitHub. This is how I installed the repo:

git clone https://github.com/quantumaikr/KoreanLM.git
cd KoreanLM
pip install -r requirements.txt

I was told to run it like this, but there is a problem at the pip step.

I ran those commands in the Git program.

The required modules are listed in requirements.txt. Among them is deepspeed.

If you run the pip install line, the error below occurs:

Collecting git+https://github.com/huggingface/peft.git (from -r requirements.txt (line 15))
Cloning https://github.com/huggingface/peft.git to c:\...
Running command git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git 'C:\...'
Resolved https://github.com/huggingface/peft.git to commit 02b5aeddf9c1ea11451f10a8a26da7e5df8cca4a
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting numpy (from -r requirements.txt (line 1))
Using cached numpy-1.26.4-cp312-cp312-win_amd64.whl.metadata (61 kB)
Collecting rouge_score (from -r requirements.txt (line 2))
Using cached rouge_score-0.1.2.tar.gz (17 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting fire (from -r requirements.txt (line 3))
Using cached fire-0.6.0-py2.py3-none-any.whl
Collecting transformers>=4.28.1 (from -r requirements.txt (line 4))
Using cached transformers-4.39.2-py3-none-any.whl.metadata (134 kB)
Collecting torch (from -r requirements.txt (line 5))
Using cached torch-2.2.2-cp312-cp312-win_amd64.whl.metadata (26 kB)
Collecting sentencepiece (from -r requirements.txt (line 6))
Using cached sentencepiece-0.2.0-cp312-cp312-win_amd64.whl.metadata (8.3 kB)
Collecting tokenizers>=0.13.3 (from -r requirements.txt (line 7))
Using cached tokenizers-0.15.2-cp312-none-win_amd64.whl.metadata (6.8 kB)
Collecting deepspeed (from -r requirements.txt (line 8))
Using cached deepspeed-0.14.0.tar.gz (1.3 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [23 lines of output]

[WARNING] Unable to import torch, pre-compiling ops will be disabled. Please visit https://pytorch.org/ to see how to properly install torch on your system.
[WARNING] unable to import torch, please install it if you want to pre-compile any deepspeed ops.

I'm really discouraged at getting stuck at the module installation stage. Is there anyone who can help me, please?

Shadows not cast/received with HDR environment texture in Three.js

I am new to Three.js and I've been working on a project where I want to cast a shadow from an imported 3D model, which was downloaded from this link; I also downloaded the HDR file from the free pack link.

Based on videos available on the internet and from googling my problem, I found some issues in my code, which I have changed, like scene.environment = texture;. I have set castShadow = true and receiveShadow = true on the model's meshes and also set receiveShadow = true on the plane. But I am still not able to cast the shadow.

Library version I am using:

three v0.162.0, node v20.11.1, vite v5.2.0

Here is my index.js file code:

import * as THREE from "three";
import "./style.css";
import { OrbitControls } from "three/examples/jsm/controls/OrbitControls";
import { GLTFLoader } from "three/examples/jsm/loaders/GLTFLoader";
import { RGBELoader } from "three/examples/jsm/loaders/RGBELoader";
// Scene
const scene = new THREE.Scene();

// Sizes
const sizes = {
  width: 800,
  height: 500,
};

const gltfLoader = new GLTFLoader();

const rgbeloader = new RGBELoader();

let car3;
rgbeloader.load(
  "./WhiteNeons_NAD.hdr",
  (texture) => {
    texture.mapping = THREE.EquirectangularReflectionMapping;
    scene.environment = texture;
    scene.castShadow = true;

    gltfLoader.load(
      "./models/car3/scene.gltf",
      (gltf) => {
        const model = gltf.scene;
        model.position.set(0, 0, 0);
        model.traverse((node) => {
          node.castShadow = node.isMesh;
          node.receiveShadow = node.isMesh;
        });
        scene.add(model);
        car3 = model;
      },
      undefined,
      (error) => {
        console.error(error);
      }
    );
  },
  undefined,
  (err) => {
    console.error("HDR:: ", err);
  }
);

// Plane
const planeGeometry = new THREE.PlaneGeometry(10, 10);
const planeMat = new THREE.MeshBasicMaterial({ color: 0x5a5a5a });
const plane = new THREE.Mesh(planeGeometry, planeMat);
plane.receiveShadow = true;
plane.rotation.x = -0.5 * Math.PI;
scene.add(plane);

const gridHelper = new THREE.GridHelper();
gridHelper.receiveShadow = true;
scene.add(gridHelper);
// Camera
const camera = new THREE.PerspectiveCamera(
  45,
  sizes.width / sizes.height,
  0.1,
  1000
);
camera.position.set(5, 2, 0);
// camera.position.z = 20;
scene.add(camera);

// Renderer
const canvas = document.querySelector(".webgl");
const renderer = new THREE.WebGLRenderer({ canvas });

renderer.setClearColor(0xa3a3a3);
renderer.shadowMap.enabled = true;
renderer.shadowMap.type = THREE.PCFSoftShadowMap;
renderer.outputColorSpace = THREE.SRGBColorSpace;
renderer.toneMapping = THREE.ACESFilmicToneMapping;
renderer.toneMappingExposure = 6;

renderer.setSize(sizes.width, sizes.height);
renderer.setPixelRatio(window.devicePixelRatio);
renderer.render(scene, camera);

// Controls
const controls = new OrbitControls(camera, canvas);

window.addEventListener("resize", () => {
  camera.updateProjectionMatrix();
});

function animate(time) {
  if (car3) {
    car3.rotation.y = -time / 1000;
  }
  controls.update();
  requestAnimationFrame(animate);
  renderer.render(scene, camera);
}
animate();

How to create a multi-user chatbot with langchain

Hope you are doing good. I've prepared a chatbot based on the langchain documentation below:

Langchain chatbot documentation

In the above langchain documentation, the prompt template has two input variables: history and human_input.

I have variables for UserID and SessionID. I'm storing UserID, SessionID, UserMessage and LLM-Response in a CSV file. I use the Python pandas module to read the CSV, filter the data frame for the given UserID and SessionID, and prepare the chat history for that specific user session. I'm passing this chat history as the 'history' input to the langchain prompt template (discussed in the above link).

Since I set verbose=True, langchain prints the prompt template on the console for every API call. I started the conversation for the first user and first session and sent 3 human inputs one by one. Later I started a second user session (now the session ID and user ID are different). Observing the prompt template printed on the console, I saw that langchain is not only taking the chat history of the second user session, it is also taking some of the chat history from the previous user session, even though I've written correct code to prepare the chat history for the given user session. The code to get the chat history is below:

# get chat_history
def get_chat_history(user_id,session_id,user_query):
    chat_history = "You're a chatbot based on a large language model trained by OpenAI. The text followed by Human: will be user input and your response should be followed by AI: as shown below.\n"
    chat_data = pd.read_csv("DB.csv")
    for index in chat_data.index:
        if ((chat_data['user_id'][index] == user_id) and (chat_data['session_id'][index] == session_id)):
            chat_history += "Human: " + chat_data['user_query'][index] + "\n" + "AI: " + chat_data['gpt_response'][index] + "\n"
    chat_history += "Human: " + user_query + "\n" + "AI: "
    return chat_history

How do I get langchain to consider only the given user session's chat history in its prompt? Please help.
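For reference, a minimal sketch of keeping one isolated LangChain memory per (UserID, SessionID) so histories cannot leak between sessions; the ConversationChain/ConversationBufferMemory setup is an assumption, not the exact chain from the tutorial.

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)

# One chain (and memory) per user session, keyed by (user_id, session_id).
_sessions = {}

def get_chain(user_id, session_id):
    key = (user_id, session_id)
    if key not in _sessions:
        _sessions[key] = ConversationChain(
            llm=llm,
            memory=ConversationBufferMemory(),  # history isolated to this session
            verbose=True,
        )
    return _sessions[key]

# Each call only ever sees the history of its own session.
reply = get_chain("user-1", "session-1").predict(input="Hello!")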

Kotlin combine flow has a slight delay relative to the dependent flows, causing an issue in an Android Compose view

I have a ListScreen which displays list items loaded from the internet, with the option to filter the items.

So I have two states here: future and filter (see the ViewModel code). My list state is a combined flow of these two states, with an initial value of an empty list.

In my Compose screen, I want to show a different view when the list state is empty, otherwise the list items. The issue is that when the future state changes from LOADING to SUCCESS, the list state doesn't change from the empty list to the future result immediately; a slight delay is observed. Due to this delay, the empty-list screen is sometimes shown for a brief moment (see the comment in the Compose code).

ViewModel code:

class ListViewModel<T>(
    val path: String,
    val repository: MyRepository<T>,
): ViewModel() {
    enum class State { LOADING, SUCCESS, FAILURE }

    private val _futureState = MutableStateFlow(State.LOADING)
    val futureState = _futureState.asStateFlow()
    
    private lateinit var _futureError: Throwable
    val futureError get() = _futureError
    
    private lateinit var _futureResult: List<T>
    
    init {
        viewModelScope.launch(Dispatchers.IO) { 
            try {
                _futureResult = repository.get(path) // suspend fun returns List<T>
                _futureState.update { State.SUCCESS }
            } catch (error: Throwable) {
                _futureError = error
                _futureState.update { State.FAILURE }
            }
        }
    }
    
    private val _filterState = MutableStateFlow<String?>(null)
    val filterState = _filterState.asStateFlow()
    
    fun setFilter(value: String?) {
        _filterState.update { value }
    }
    
    val listState = combine(futureState, filterState) { state, filter ->
        if (state == State.SUCCESS) {
            if (!filter.isNullOrBlank()) {
                _futureResult.filter { 
                    // filter predicate
                }  
            } else _futureResult
        } else emptyList()
    }.stateIn(
        viewModelScope,
        SharingStarted.WhileSubscribed(1000),
        emptyList(),
    )
}

Compose code:

@Composable
fun <T> ListScreen(
    path: String,
    repository: MyRepository<T>,
) {
    val vm: ListViewModel<T> = viewModel(
        factory = viewModelFactory {
            initializer {
                ListViewModel(path, repository)
            }
        }
    )

    val state by vm.futureState.collectAsState()
    val filter by vm.filterState.collectAsState()
    val list by vm.listState.collectAsState()

    Scaffold {
        when (state) {
            State.LOADING -> { /* loading screen */ }
            State.FAILURE -> { /* error message screen */ }
            State.SUCCESS -> {
                if (list.isEmpty()) {
                    /* no results found screen */

                    // issue: when state changes from LOADING -> SUCCESS,
                    // this screen is shown for a very short duration

                    // note: the issue is only observed sometimes 
                } else {
                    /* list items screen */
                }
            }
        }
    }
}

Retrieve nodes by their relationship to another node

This is my Neomodel class:

class Sample(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True)


class Annots(StructuredNode):
    uid = UniqueIdProperty()    
    attributedTo = RelationshipTo('Sample', 'attributedTo', OneOrMore)
    assignedBy = RelationshipFrom('User', 'assignedBy', OneOrMore)

The user selects a sample, so I need to retrieve all annotations that are linked to that sample. Using a Cypher query, it is straightforward:

MATCH (a:Annots)-[r:attributedTo]->(s:Sample)
where s.name = 'x'
RETURN a

But in Python neomodel I just can't figure out how to do it other than the way below, and it doesn't seem like the right way.

current_sample = Sample.nodes.first_or_none(name=samples[selected_sample])
if current_sample is not None:
    related_annots = Annots.nodes.all(attributedTo=current_sample)

ValueError: No such property attributedTo on Annots. Note that Neo4j internals like id or element_id are not allowed for use in this operation.

I have thought of loading all Annots nodes and then iterating to check their attribute, but I don't think that's the best solution. I could also move the relationship to the Sample class, but I think it is better located in the Annots class.

Any thoughts here?

--------- Update ---------

I can solve it by changing the Sample class to:

class Sample(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True)
    attributedfrom = RelationshipFrom('Annots', 'attributedTo', OneOrMore)

and making this filter call:

related_annots = current_sample.attributedfrom.all()

I'm still not convinced this is the proper way to get around this problem.
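For reference, a minimal sketch of an alternative that keeps the relationship defined only on Annots: run the Cypher directly through neomodel and inflate the rows back into Annots objects.

from neomodel import db

def annots_for_sample(sample_name):
    query = (
        "MATCH (a:Annots)-[:attributedTo]->(s:Sample {name: $name}) "
        "RETURN a"
    )
    results, _meta = db.cypher_query(query, {"name": sample_name})
    # Each row holds a raw neo4j node; inflate it into the neomodel class.
    return [Annots.inflate(row[0]) for row in results]

related_annots = annots_for_sample(samples[selected_sample])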

When using AutoModelForCausalLM, CogVLM and load_in_8bit I get this error : self and mat2 must have the same dtype, but got Half and Char

Loading in 4-bit works great, but when I use load in 8-bit, a dtype mismatch happens that I can't make sense of.

There are no errors in CMD, but the error message comes back in the response as: self and mat2 must have the same dtype, but got Half and Char

import gradio as gr
import os, sys
from transformers import AutoModelForCausalLM, LlamaTokenizer
from PIL import Image, UnidentifiedImageError
import torch
import argparse
import time
from pathlib import Path
from itertools import chain

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
MODEL_PATH = "THUDM/cogagent-vqa-hf"
tokenizer = LlamaTokenizer.from_pretrained('lmsys/vicuna-7b-v1.5')
torch_type = torch.float16

model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    low_cpu_mem_usage=True,
    #load_in_4bit=bit_4, when this is True and 8 bit false it works
    load_in_8bit=True,
    trust_remote_code=True
).eval()

I believe the cause of the error is one of the unsqueeze, .to(DEVICE) or .to(DEVICE).to(torch_type) calls below, but even though I tried combinations I couldn't fix it.

def post(input_text, temperature, top_p, top_k, image_prompt, do_sample):
    try:
        with torch.no_grad():
            image = Image.open(image_prompt).convert('RGB') if image_prompt is not None else None

            input_by_model = model.build_conversation_input_ids(tokenizer, query=input_text, history=[], images=([image] if image else None), template_version='base')
            inputs = {
                'input_ids': input_by_model['input_ids'].unsqueeze(0).to(DEVICE),
                'token_type_ids': input_by_model['token_type_ids'].unsqueeze(0).to(DEVICE),
                'attention_mask': input_by_model['attention_mask'].unsqueeze(0).to(DEVICE),
                'images': [[input_by_model['images'][0].to(DEVICE).to(torch_type)]],
            }
            if 'cross_images' in input_by_model and input_by_model['cross_images']:
                inputs['cross_images'] = [[input_by_model['cross_images'][0].to(DEVICE).to(torch_type)]]

            gen_kwargs = {
                "max_length": 2048,
                "temperature": temperature,
                "do_sample": do_sample,
                "top_p": top_p,
                "top_k": top_k
            }
            outputs = model.generate(**inputs, **gen_kwargs)
            outputs = outputs[:, inputs['input_ids'].shape[1]:]
            response = tokenizer.decode(outputs[0])
            response = response.split("</s>")[0]
            return response
    except Exception as e:
        return str(e)
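For reference, a minimal sketch of loading with an explicit BitsAndBytesConfig and excluding selected modules from int8 quantization so their matmuls stay in fp16; the module names below are assumptions and would have to match the actual CogAgent implementation, so this is a sketch rather than a confirmed fix.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_PATH = "THUDM/cogagent-vqa-hf"

# llm_int8_skip_modules keeps the listed submodules un-quantized (fp16).
# The names here are hypothetical; inspect the model to find the right ones.
quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["vision_model", "cross_attn"],
)

model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    quantization_config=quant_config,
    trust_remote_code=True,
).eval()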

Yasso20 R implementation

I have been trying to run the Yasso20 model implementation for R (https://github.com/YASSOmodel/Yasso20/blob/main/model/Yasso20.r)

I am testing the function Yasso20 with data from: https://github.com/YASSOmodel/Ryassofortran/tree/master/data

as:

Yasso20(YParameters = sample_parameters,
        SimulationTime = sample_data_run$time, 
        MonthlyTemperature = sample_data_run$temp, 
        Precipitation = sample_data_run$prec, 
        InitialCPool = sample_data_run$init, 
        LitterInput = sample_data_run$litter, 
        WoodySize = sample_data_run$wsize, 
        leac = sample_data_run$leac, 
        SS_pred = TRUE)

but I get:

Error in if (tem <= 1e-16) { : the condition has length > 1

I thought that maybe this function is meant to be used for each time step and run in a loop, so I then tried:

Yasso20(YParameters = sample_parameters,
        SimulationTime = sample_data_run$time[1], 
        MonthlyTemperature = sample_data_run$temp[1], 
        Precipitation = sample_data_run$prec[1], 
        InitialCPool = sample_data_run$init[1], 
        LitterInput = sample_data_run$litter[1], 
        WoodySize = sample_data_run$wsize[1], 
        leac = sample_data_run$leac[1], 
        SS_pred = TRUE)

getting a little farther:

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'as.array': replacement has length zero

However, I am not sure this is the correct approach. It seems to me that the input data should be formatted differently.

I am hoping somebody knows which sample data can be used to run this function, or which format it should have.

Jetpack compose. Visual transformation getting incorrect offset

I'm using composable BasicTextField:

BasicTextField(
   value = viewModel.state.phone,
   onValueChange = { 
       viewModel.onPhoneChanged(if (!it.startsWith("+")) "+$it" else it)
     },
   visualTransformation = PhoneNumberTransformation(),
)

My viewModel:

fun onPhoneChanged(phone: String) {
   state.update {
       it.copy(phone = phone)
   }
}

As you can see, if the user enters a digit as the first character, I force a plus to be added at position zero.

The plus is added; however, in the visual transformation I get an offset equal to 1. For example, I enter 9, in which case +9 should be displayed in the text field, but in originalToTransformed inside the visual transformation the offset is equal to 1, although it should already be 2 (we must take the plus into account). What could be the problem? Please help.

Model is null when sent from controller to view

I'm getting a System.NullReferenceException for Model in my view:

@page
@model List<MyApp.Models.Student>

<div>
  @{
    string sHtml = TestPageView.GetHtml(Model); // <-- NullReferenceException here
  }

  @Html.Raw(sHtml)
</div>

Here's my controller action:

public IActionResult TestPage()
{
  return View(SomeStudents.Students);
}

Here's my Student model:

namespace Models
{
  public class Student
  {
    public int Id;
    public string Name;
    public int Age;
    public int Grade;
  }
}

...and the list comes from here:

namespace Data {
  public class SomeStudents {
    public static List<Student> Students { get; } = new List<Student>() {
      new Student() { Id = 1, Name = "Adam", Age = 20, Grade = 69 },
      new Student() { Id = 2, Name = "Mark", Age = 21, Grade = 80 },
      new Student() { Id = 3, Name = "Bill", Age = 18, Grade = 51 }
    };
  }
}

I've reviewed several slightly similar Q&As reporting the same problem, including these:

...but I'm not finding anything that addresses my exact scenario (which I believe to be quite simple and not at all complex, which is confounding).

I tried explicitly naming the view, as in this answer:

public IActionResult TestPage()
{
  return View("TestPage", SomeStudents.Students);
}

I also tried adjusting the view a bit, using a slightly modified excerpt from this example:

@page
@model List<MyApp.Models.Student>

<div>
  @foreach (var item in Model) { // <-- NullReferenceException here
    @Html.DisplayFor(modelItem => item.Name)
  }
</div>

In all cases, however, attempts to access Model in the view result in the same NullReferenceException.

How do I pass the list of Students to my view?

Additional BERT inference becomes slower and kills RAM

I have a pretrained BERT model for the Russian language, which I downloaded from Hugging Face in Google Colab. I am trying to obtain the vector representation after BERT's forward pass - pooler_output. The code works fine for this purpose, but only for several forward passes. After that my session crashes because of the RAM limit. Why is that the case, and why was everything fine for a few steps? How can I work around this?

I also noticed that the processing time for each batch is longer than for the previous one. Maybe that helps.

from transformers import AutoTokenizer, AutoModel
import pandas as pd
import torch
from tqdm import tqdm

# Pick the device and keep the model on it (device_name is used below).
device_name = 'cuda' if torch.cuda.is_available() else 'cpu'
model = AutoModel.from_pretrained('ai-forever/ruBert-base').to(device_name)
tokenizer = AutoTokenizer.from_pretrained('ai-forever/ruBert-base')


def process_texts(raw_texts):
  texts_raw_list = raw_texts.to_list()
  text_tokens = tokenizer(texts_raw_list, truncation=True, padding=True, return_tensors='pt', max_length = 512)#.input_ids
  inputs = text_tokens['input_ids']
  attention_mask = text_tokens['attention_mask']
  inputs = inputs.to(device_name)
  attention_mask = attention_mask.to(device_name)
  result = model(inputs, attention_mask=attention_mask).pooler_output
  return result

n = 0

for g in range(2734129//48000 + 1):
  total = pd.DataFrame()
  z = pd.read_csv('/drive/MyDrive/data/unparsed.csv', skiprows = 42000 + 48000*g, nrows = 48000)
  z.columns = ['unnamed', 'description', 'vacancy_id']
  raw = z['description']
  for i in tqdm(range(48000)):
    if i%8 == 0:
      res = process_texts(raw.iloc[i*8:i*8+8])
      z = pd.concat([z[['vacancy_id']].reset_index(inplace=False, drop=True), pd.DataFrame(res.tolist()).reset_index(inplace=False, drop=True)], axis = 1)
      total = pd.concat([total, z], axis = 0, ignore_index = True)
    if (i+1)%1200 == 0:
      total.to_csv('/drive/MyDrive/data_embs/embeds/'+str(n)+'.csv')
      total = pd.DataFrame()
      n+=1
  total.to_csv('/drive/MyDrive/data_embs/embeds/'+str(n)+'.csv')
  total = pd.DataFrame()
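For reference, a minimal sketch of the usual fix for growing RAM and per-batch slowdown: run the forward pass under torch.no_grad() so no autograd graph is retained, and detach/move the result off the GPU before storing it. This is a sketch of process_texts assuming the same model, tokenizer and device_name as above.

def process_texts(raw_texts):
    text_tokens = tokenizer(
        raw_texts.to_list(), truncation=True, padding=True,
        return_tensors='pt', max_length=512,
    )
    inputs = text_tokens['input_ids'].to(device_name)
    attention_mask = text_tokens['attention_mask'].to(device_name)

    # No gradient graph is built, so memory does not accumulate across batches.
    with torch.no_grad():
        result = model(inputs, attention_mask=attention_mask).pooler_output

    # Detach and move to CPU before storing, releasing GPU/graph references.
    return result.detach().cpu()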

How to increase original LLaMA2 inference speed?

This question is about the original Meta LLaMA2 model that you download from the Meta website. This question is NOT ABOUT Hugging Face (or any other) quantized models or quantization methods.

I have an advanced use case that works with the original model but does not work with any of the quantized models. The problem is that I can run only one inference at a time. Batching also does not work, because when I use the batching option the model returns incorrect results. When I try to call the model in parallel, all the inferences get blocked and take even longer to return than if executed one by one.

So the question is simple: how can I make the original LLaMA2 model support multiple inferences in any way?

I run on a single RTX 4090 GPU.

I use the standard code:

generator = Llama.build(
    ckpt_dir="C:/AI/codellama/CodeLlama-7b-Instruct",
    tokenizer_path="C:/AI/codellama/CodeLlama-7b-Instruct/tokenizer.model",
    max_seq_len=max_seq_len,
    max_batch_size=max_batch_size,
    model_parallel_size=num_of_worlds,
)

results = generator.chat_completion(
    dialogs,
    max_gen_len=max_gen_len,
    temperature=temperature,
    top_p=top_p,
)
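For reference, a minimal batching sketch for the stock llama codebase: several dialogs are passed to one chat_completion() call, and max_batch_size is set at build time to cover them. This is only a sketch under those assumptions; it does not by itself explain the incorrect results mentioned above.

from llama import Llama

dialogs = [
    [{"role": "user", "content": "Write a haiku about GPUs."}],
    [{"role": "user", "content": "Explain quicksort in one sentence."}],
]

generator = Llama.build(
    ckpt_dir="C:/AI/codellama/CodeLlama-7b-Instruct",
    tokenizer_path="C:/AI/codellama/CodeLlama-7b-Instruct/tokenizer.model",
    max_seq_len=1024,
    max_batch_size=len(dialogs),  # must be at least the number of dialogs
)

# All dialogs in the list are generated together in one batched pass.
results = generator.chat_completion(dialogs, max_gen_len=256, temperature=0.2, top_p=0.9)
for result in results:
    print(result["generation"]["content"])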
โŒ
โŒ