Newest 'cluster-computing' Questions

Advice

2 votes

4 replies

69 views

determine cpu after c++ compilation with gcc?

Does anyone know if there is, in c++, any way to determine at runtime the cpu characteristics of the machine that compiled the code? For example, in gcc (which I'm using) the preprocessor variable ...

user3195869

95

asked 6 hours ago

0 votes

0 answers

43 views

Snakemake slurm cluster status in version 9.6.3

I am currently using Snakemake version 9.6.3 on a cluster managed by an SLURM scheduler. In previous workflows, I relied on version 6, which supported the --cluster, --cluster-status, and --parsable ...

jeje

11

asked Oct 20 at 13:39

0 votes

0 answers

185 views

CRC Status Shows 'OpenShift: Unreachable' Even After Multiple Restarts and Setup

I am trying to run Red Hat CodeReady Containers (CRC) with OpenShift 4.19.8 on an Ubuntu VM (running on VMware). No matter what I do, crc status always shows: crc status output (https://i.sstatic.net/...

cyrine maamer

1

asked Sep 22 at 21:48

0 votes

0 answers

110 views

Check if a server is part of an Azure cluster using C# and PowerShell

I am testing some code to check if a named server is part of an Azure cluster. I currently have a simple console application where the user enters the name of the server to check and the code then ...

Nigel Tunnicliffe

141

asked Sep 3 at 9:16

2 votes

1 answer

71 views

How to identify price regimes / trends in Pandas

I have created the following pandas dataframe, which is an example of 26 stock prices (Open, High, Low, Close): import pandas as pd import numpy as np ds = { 'Date' : ['15/06/2025','16/06/2025','17/...

Giampaolo Levorato

1,762

asked Jul 15 at 13:27

0 votes

0 answers

13 views

Proxmox migration and autoboot

I migrated my VMs from one node to another over the cluster using the migrate function, without any downtime. These VMs were set to auto boot. Is this setting kept through the migration on a ...

btc4cash

325

asked Jun 12 at 23:39

0 votes

1 answer

105 views

Issue while deploying container in GKE

I am getting this error while deploying in GKE : Error from server: Get "https://10.x.x.x:10200/containerLogs/server-center/server-center-dev-86f67jkilo-rwrnm/server-center-dev": No agent ...

SecureTech

259

asked May 21 at 14:33

-1 votes

1 answer

72 views

Best practices for running high-granularity benchmark [closed]

I am trying to run a benchmark on some family of algorithms. I have multiple algorithms, each of them with one hyperparameter, and I want to test them with multiple data sizes. Each run takes ~60 ...

David Davó

812

asked Apr 2 at 15:43

1 vote

0 answers

112 views

"invalid_grant" "Code not valid" in Keycloak with multiple containers using same client

Sorry if this matter was discussed before. I looked for something like that, but found nothing. We have a scenario where we have a Keycloak, an NGINX proxy, four containers having a monolithic legacy ...

Walter do Valle

11

asked Mar 25 at 8:45

2 votes

1 answer

36 views

Is --nodes 2 (without '=') accepted way of requesting nodes in slurm?

I just realized that I have been always using a slurm script, where in the first line I specify number of nodes in a wrong way. I see two options are either #SBATCH N 2 or #SBATCH --nodes=2. Instead I ...

fahd

183

asked Mar 17 at 14:03

0 votes

0 answers

42 views

AWS PCS cluster creation failed with cloud formation

I'm creating a complete HPC architecture on AWS using the AWS PCS service. In my CloudFormation template, literally all resource creation is successful except AWS PCS. Cluster: Type: AWS::PCS::Cluster ...

parthraj panchal

121

asked Mar 7 at 16:40

0 votes

1 answer

148 views

How to determine buddy nodes dependencies in Vertica

How can I find in Vertica (Enterprise mode, K-safety 1) node dependencies so that I could build a node graph like this? The following query: select n.name, d.dependency_id from v_internal....

GriGrim

2,941

asked Feb 24 at 17:27

0 votes

0 answers

61 views

Pacemaker dynamic location constraint with expression

I am experimenting with storage clusters using RHEL9.3 and GFS2 with DRBD replication. So far I found a stable solution by using 3 nodes for main (one is DRBD Primary and mounts the DRBD disk, while ...

Fegendet

11

asked Feb 19 at 16:42

0 votes

1 answer

159 views

How to utilize multiple CPUs for training of YOLO?

I have access to a large CPU cluster that does not have GPUs. Is it possible to speed up YOLO training by parallelizing between multiple CPU nodes? The docs say that device parameter specifies the ...

Artem Lebedev

163

asked Jan 17 at 16:05

0 votes

0 answers

104 views

Problem when running custom flink jar application on cluster

I have a little problem when running a custom jar application on a cluster. First, I ran my custom jar application in a local flink installation: /bin/flink run /home/osboxes/WordCount.jar --input ...

ricksant

137

asked Jan 16 at 13:59

0 votes

0 answers

33 views

Galera Cluster (on GMD GUI, need a reference cmd to Recover Cluster)

Under the cluster dropdown in the Galera Manager Daemon (gmd) GUI there is a Recover Cluster option, as shown in the image. This works fine but requires me to press it manually: Recover Cluster Button What is ...

user3600775

1

asked Jan 13 at 7:05

0 votes

1 answer

63 views

How to add shapefiles and raster files as covariates in spatstat kppm function

I encountered various errors while running the spatstat package. Here is a summary of my data: a shapefile of landslide point events, a watershed shapefile as the observation window, and several ...

alipin ng sahod

1

asked Dec 28, 2024 at 10:01

0 votes

0 answers

31 views

Pre-staging large data files for parallel job execution

Apologies in advance if this is a mundane or unclear question. I want to scale up a workflow on a cluster to run a program concurrently on several nodes. The program in question references a large, ...

gladshire

81

asked Dec 17, 2024 at 16:36

0 votes

0 answers

25 views

Cluster stability measurement for fuzzy clustering

I'm a biologist working in the data science field. I've successfully done clustering for a heterogenic disease with K-means. But I shifted to Fanny to get membership value and to be able to handle the ...

Mary

221

asked Dec 5, 2024 at 8:05

0 votes

0 answers

28 views

Why is my receive_processed_video function not receiving the full amount of data

I'm trying to make a cluster processing system with a client, broker and nodes. When executing the receive_processed_video function, it stops receiving data after a random time. Is there anything that ...

DavidBarragann

1

asked Nov 23, 2024 at 1:25

0 votes

1 answer

207 views

How can I ensure that my Python logic runs exclusively on the Apache Ray Worker Nodes?

I am using Apache Ray to create a customized cluster for running my logic. However, when I submit my tasks with ray.remote, they are executing on the driver node rather than on the worker nodes I ...

question.it

3,018

asked Nov 11, 2024 at 5:14

1 vote

0 answers

103 views

Prioritize specific nodes in slurm job submission

Is there a way to prioritize certain nodes over others in a job submission without admin privileges? I know about the --nodelist, --constraint or --exclude directives, but if set, the job runs only if ...

Oskar

1,488

asked Oct 25, 2024 at 11:25

1 vote

1 answer

80 views

Snakemake remote rule stalling before executing script in PBS cluster

I have a Snakemake (7.22.0) workflow whose rules stall after they start. I have rules that run on a cluster (through PBS) and execute an external Python script. I noticed that now some of the rules stall for ...

Yotam Feldman

43

asked Oct 16, 2024 at 16:43

2 votes

1 answer

130 views

Spring boot Tomcat session replication with Traefik

I'm trying to setup session replication using Spring boot with Traefik. I've found how it can be achieved with Tomcat and its server.xml file in the following link: Tomcat session replication in ...

Marian Smarik

53

asked Sep 26, 2024 at 8:46

1 vote

0 answers

17 views

Why am I getting "TXN_REQUEST_IGNORED ERROR 10906" in GridDB due to an unknown event during cluster operations?

I’m using GridDB for a distributed database setup and recently encountered the following error while performing operations across nodes in the cluster: from griddb_python import StoreFactory, ...

Samar Mohamed

71

asked Sep 25, 2024 at 21:39
