0 votes · 0 answers · 27 views

I’m using SGLang’s OpenAI-compatible server (e.g., --port 30000, /v1/chat/completions) and calling it via the openai SDK with an async client: from openai import AsyncOpenAI client = AsyncOpenAI(...
Erfan Mhi
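A minimal sketch of how such a client is typically pointed at a local OpenAI-compatible endpoint; the base_url, api_key placeholder, and model name below are assumptions, not taken from the question:

    import asyncio
    from openai import AsyncOpenAI

    # Route the SDK to the local SGLang server instead of api.openai.com.
    client = AsyncOpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

    async def main():
        resp = await client.chat.completions.create(
            model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model id
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(resp.choices[0].message.content)

    asyncio.run(main())
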
0 votes · 0 answers · 47 views

I've implemented standard homoskedastic multitask Gaussian process regression using GPyTorch as follows: class MyModel(gpytorch.models.ExactGP): def __init__(self, X, Y, likelihood): super(...
SirAndy3000
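For reference, a minimal homoskedastic multitask ExactGP along the lines the question describes; the number of tasks, kernel choice, and data shapes here are assumptions:

    import torch
    import gpytorch

    class MultitaskGPModel(gpytorch.models.ExactGP):
        def __init__(self, X, Y, likelihood, num_tasks=2):
            super().__init__(X, Y, likelihood)
            self.mean_module = gpytorch.means.MultitaskMean(
                gpytorch.means.ConstantMean(), num_tasks=num_tasks
            )
            self.covar_module = gpytorch.kernels.MultitaskKernel(
                gpytorch.kernels.RBFKernel(), num_tasks=num_tasks, rank=1
            )

        def forward(self, x):
            # Joint multivariate normal over all tasks at the inputs x.
            return gpytorch.distributions.MultitaskMultivariateNormal(
                self.mean_module(x), self.covar_module(x)
            )

    # Y is expected to have shape (n, num_tasks); one shared noise level per task.
    likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=2)
    model = MultitaskGPModel(torch.randn(20, 1), torch.randn(20, 2), likelihood)
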
0 votes · 0 answers · 33 views

I have a PySpark function that reads a reference CSV file inside a larger ETL pipeline. On my personal Databricks cluster, this works fine. On the group cluster, it returns an empty DataFrame, the same ...
Codie
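One way to make such a reference read behave the same on both clusters is to pin the path scheme, header handling, and schema explicitly rather than relying on cluster defaults; the path and columns below are placeholders:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType

    spark = SparkSession.builder.getOrCreate()

    # An explicit schema and an absolute DBFS path avoid silently reading
    # an empty or wrong location on a differently configured cluster.
    schema = StructType([
        StructField("code", StringType(), True),
        StructField("description", StringType(), True),
    ])
    ref_df = spark.read.csv("dbfs:/mnt/reference/lookup.csv", header=True, schema=schema)
    print(ref_df.count())
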
0 votes · 0 answers · 102 views

I use the following command to compile an executable file for Android: cmake \ -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \ -DANDROID_ABI=arm64-v8a \ -...
XUHAO77
0 votes · 0 answers · 36 views

I am familiar with the fitdistrplus package, which offers relevant tools for statistical inference. Meanwhile, I have trouble understanding what needs to be done when facing a sample that is left and/...
yeahman269
0 votes · 0 answers · 26 views

I am currently studying tracking algorithms such as Seqtrack, ARTrack and ODTrack. For the configuration, I have found this .yaml file in the experiments folder: https://github.com/microsoft/VideoX/...
Chloé c
1 vote · 0 answers · 110 views

I need to do inference using vllm for a large dataset; the code structure is as below: ds = ray.data.read_parquet(my_input_path) ds = input_data.map_batches( VLLMPredictor, concurrency=ray_concurrency, ...
cnmdestroyer
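A skeleton of that Ray Data batch-inference pattern; the model id, paths, and tuning values below are assumptions rather than the question's actual settings:

    import ray
    from vllm import LLM, SamplingParams

    class VLLMPredictor:
        def __init__(self):
            # Each actor loads one copy of the model onto its GPU.
            self.llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
            self.params = SamplingParams(max_tokens=256)

        def __call__(self, batch):
            outputs = self.llm.generate(list(batch["prompt"]), self.params)
            batch["generated"] = [o.outputs[0].text for o in outputs]
            return batch

    ds = ray.data.read_parquet("s3://my-bucket/my_input_path")
    ds = ds.map_batches(VLLMPredictor, concurrency=2, num_gpus=1, batch_size=64)
    ds.write_parquet("s3://my-bucket/my_output_path")
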
0 votes · 0 answers · 75 views

I'm trying to build a TrueSkill model with two teams of 5 players each with Infer.Net. However, when inferring the skills, the means of the distributions get way too big or small. Below is the code of my ...
Ranersss
0 votes · 0 answers · 35 views

I need to perform model inference using a deep learning model on a stream of data. However, the challenge I’m facing is that the inputs to the model might not arrive continuously, but rather with some ...
conmeobeo
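One common pattern for intermittent arrivals is to buffer items in a queue and flush them to the model either when a batch fills up or after a short timeout; a minimal sketch with a placeholder model function:

    import queue
    import threading
    import time

    input_queue = queue.Queue()

    def run_model(batch):
        # Placeholder for the actual deep learning model call.
        return [len(x) for x in batch]

    def inference_loop(max_batch=8, max_wait_s=0.05):
        while True:
            batch = [input_queue.get()]  # block until at least one item arrives
            deadline = time.monotonic() + max_wait_s
            while len(batch) < max_batch and time.monotonic() < deadline:
                try:
                    batch.append(input_queue.get(timeout=max(0.0, deadline - time.monotonic())))
                except queue.Empty:
                    break
            print(run_model(batch))

    threading.Thread(target=inference_loop, daemon=True).start()
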
0 votes · 1 answer · 68 views

Problem: I have created my encoder-decoder model to forecast time series. The model trains well, but I struggle with an error in the inference model and I don't know how to troubleshoot it: WARNING:...
Art
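For a classic LSTM sequence-to-sequence setup, separate encoder and decoder inference models are usually rebuilt from the trained layers, with the decoder states passed explicitly; the dimensions and layer layout below are assumptions about the architecture, not the question's actual model:

    from tensorflow import keras
    from tensorflow.keras import layers

    latent_dim, num_enc_feats, num_dec_feats = 64, 10, 12  # assumed sizes

    # Stand-in for the training-time encoder-decoder model.
    encoder_inputs = keras.Input(shape=(None, num_enc_feats))
    _, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(encoder_inputs)
    encoder_states = [state_h, state_c]

    decoder_inputs = keras.Input(shape=(None, num_dec_feats))
    decoder_lstm = layers.LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_dense = layers.Dense(num_dec_feats)
    dec_seq, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
    training_model = keras.Model([encoder_inputs, decoder_inputs], decoder_dense(dec_seq))

    # Inference-time models that reuse the trained layers and pass states explicitly.
    encoder_model = keras.Model(encoder_inputs, encoder_states)
    state_h_in = keras.Input(shape=(latent_dim,))
    state_c_in = keras.Input(shape=(latent_dim,))
    dec_out, h, c = decoder_lstm(decoder_inputs, initial_state=[state_h_in, state_c_in])
    decoder_model = keras.Model([decoder_inputs, state_h_in, state_c_in],
                                [decoder_dense(dec_out), h, c])
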
-2 votes · 1 answer · 624 views

I read on https://github.com/huggingface/smollm/tree/main/smol_tools (mirror 1): All models are quantized to 16-bit floating-point (F16) for efficient inference. Training was done on BF16, but in our ...
Franck Dernoncourt
0 votes · 2 answers · 4k views

I'm trying to run inference using ONNX Runtime on my server GPU. However, I'm getting this error: 2024-08-10 23:53:29.404983674 [E:onnxruntime:Default, provider_bridge_ort.cc:1745 TryGetProviderInfo_CUDA]...
Mhmdfad
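That TryGetProviderInfo_CUDA error typically means the CUDA execution provider's native libraries could not be loaded; a quick way to see what the installed build actually offers and to fall back to CPU, with the model path as a placeholder:

    import onnxruntime as ort

    # Shows which execution providers this onnxruntime build can use.
    print(ort.get_available_providers())

    session = ort.InferenceSession(
        "model.onnx",  # placeholder path
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    print(session.get_providers())  # confirms which provider was actually selected
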
2 votes · 1 answer · 329 views

I'm trying to save my model so it won't need to re-download the base model every time I want to use it, but nothing seems to work for me; I would love your help with it. The following parameters are ...
Lidor Eliyahu Shelef
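A minimal sketch of the usual save/reload round trip with transformers, assuming a causal-LM base model; the model id and local directory are placeholders:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-v0.1"  # placeholder base model
    model = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Write everything to a local directory once...
    model.save_pretrained("./my_local_model")
    tokenizer.save_pretrained("./my_local_model")

    # ...and load from disk afterwards without re-downloading.
    model = AutoModelForCausalLM.from_pretrained("./my_local_model")
    tokenizer = AutoTokenizer.from_pretrained("./my_local_model")
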
1 vote · 0 answers · 96 views

Let's consider the following Java program: import java.util.*; import java.util.stream.Collectors; public class Main { record Foo(String id, List<Bar> bars) {} record Bar(String id) {}...
Robin Dos Anjos
1 vote · 0 answers · 465 views

I'm working on deploying pre-trained Hugging Face Transformer models for inference using KServe, but my Kubernetes environment does not support KServe v0.13. I've researched the topic and found ...
Reehan
0 votes · 0 answers · 28 views

I am trying to implement the Metropolis-Hastings (M-H) algorithm in R to sample from the posterior distribution of a Gumbel Type II distribution. However, I'm encountering issues with my ...
Carlos Souto Dos Santos Filho
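Independent of the language, the core Metropolis-Hastings loop is short; a generic sketch with a placeholder log-posterior (the target density and proposal scale are assumptions, not the question's Gumbel Type II model):

    import numpy as np

    def log_post(theta):
        # Placeholder log-posterior; replace with the actual target density.
        return -0.5 * theta ** 2

    rng = np.random.default_rng(0)
    theta, samples = 1.0, []
    for _ in range(10_000):
        proposal = theta + rng.normal(scale=0.5)  # random-walk proposal
        if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
            theta = proposal  # accept
        samples.append(theta)
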
0 votes · 1 answer · 48 views

I successfully launched a training job in SageMaker. However, when I try to use the model to run inference, SageMaker is unable to find the model. import sagemaker from sagemaker.transformer import ...
Cyrus Mohammadian
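One hedged way to make the model location explicit is to construct a Model from the training job's S3 artifacts before creating the transformer; every name and URI below is a placeholder:

    import sagemaker
    from sagemaker.model import Model

    session = sagemaker.Session()
    role = "arn:aws:iam::123456789012:role/MySageMakerRole"  # placeholder

    model = Model(
        image_uri="<inference-image-uri>",                              # placeholder
        model_data="s3://my-bucket/training-job/output/model.tar.gz",   # placeholder
        role=role,
        sagemaker_session=session,
    )
    transformer = model.transformer(instance_count=1, instance_type="ml.m5.xlarge")
    transformer.transform("s3://my-bucket/input/", content_type="application/json")
    transformer.wait()
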
0 votes · 1 answer · 424 views

I trained a CNN model and saved it as a .keras file. Now I want other people to use it for making predictions. I am planning on deploying it using a Flask server and packaging the whole thing into an exe. ...
CuriousRabbit
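A minimal sketch of the Flask side of such a deployment, assuming the model accepts a batch of flat float vectors; the file name, input format, and port are placeholders:

    import numpy as np
    from flask import Flask, request, jsonify
    from tensorflow import keras

    app = Flask(__name__)
    model = keras.models.load_model("model.keras")  # placeholder file name

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expects JSON like {"inputs": [[...], [...]]}; the shape is an assumption.
        x = np.array(request.get_json()["inputs"], dtype="float32")
        preds = model.predict(x)
        return jsonify(predictions=preds.tolist())

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)
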
0 votes · 0 answers · 313 views

I'm using a GPU server which has 4 A100 chips. I'm studying how to use ViT (in timm). My local GPU is a GTX 1650 Super, but it is faster than an A100. The A100 takes almost 1 hour to finish the ...
Hyeongjun Cho
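When a data-center GPU appears slower than a small desktop card, it is worth ruling out measurement artifacts first; a small timing sketch for a timm ViT that synchronizes the GPU around the measured region (the model name and batch size are assumptions):

    import time
    import timm
    import torch

    device = "cuda"
    model = timm.create_model("vit_base_patch16_224", pretrained=False).to(device).eval()
    x = torch.randn(64, 3, 224, 224, device=device)

    with torch.no_grad():
        for _ in range(3):            # warm-up iterations
            model(x)
        torch.cuda.synchronize()
        t0 = time.perf_counter()
        for _ in range(10):
            model(x)
        torch.cuda.synchronize()      # wait for asynchronous CUDA work before stopping the clock
    print((time.perf_counter() - t0) / 10, "s per forward pass")
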
-3 votes · 2 answers · 1k views

I am doing CS50's Introduction to Artificial Intelligence with Python course and I enjoy it very much. When I run my script, it seems it's all working well, but the CS50 checker finds some kind of edge ...
Maciej Zamojski
1 vote · 1 answer · 216 views

I am trying to invoke SageMaker batch transform. Input file example.jsonl: {"number":"0060540745","brand_name":"XYZ","generic_keywords":"123"} ...
Jeya Kumar
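For JSON Lines input, the transform call usually needs the content type and split type spelled out so each line is sent as one record; a hedged sketch with placeholder names:

    from sagemaker.transformer import Transformer

    transformer = Transformer(
        model_name="my-model",                   # placeholder
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path="s3://my-bucket/output/",    # placeholder
    )
    transformer.transform(
        "s3://my-bucket/input/example.jsonl",    # placeholder input location
        content_type="application/jsonlines",
        split_type="Line",                       # one record per line
    )
    transformer.wait()
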
0 votes · 1 answer · 51 views

The weights work on the server only; when I download the weights and run them on my local PC, I notice that they don't detect any objects at all. Commands used: python train.py --epochs 10 --...
Aditya Kushal
0 votes · 1 answer · 199 views

I am able to run Apache Jena Fuseki 4.6.1 under Windows 10 with no problems when using a config file that includes the following: <#service1> rdf:type fuseki:Service ; # . . . fuseki:dataset &...
Ted
1 vote · 1 answer · 2k views

I use llama-cpp-python to run LLMs locally on Ubuntu. While generating responses, it prints its logs. How can I stop it from printing logs? I found a way to stop log printing for llama.cpp but not for llama-...
San Vik
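In llama-cpp-python, the constructor's verbose flag controls most of that native llama.cpp logging; a minimal sketch with the model path as a placeholder:

    from llama_cpp import Llama

    # verbose=False suppresses llama.cpp's load-time and per-call log output.
    llm = Llama(model_path="./model.gguf", verbose=False)
    out = llm("Q: What is the capital of France? A:", max_tokens=16)
    print(out["choices"][0]["text"])
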
1 vote · 0 answers · 62 views

I'm working with a machine that has four A100 GPUs, and I'm using them for inference on the Mixtral 8x7B model with text-generation-inference. Strangely, I've noticed that using all 4 GPUs increases ...
doNothing
