95 questions
1
vote
0
answers
68
views
How to pass P_map: dict[str, torch.Tensor] to PEFT (LoRA)?
My proxy goal is to change LoRA from h = (W + BA)x to h = (W + BAP)x. Preliminary code is attached for reference.
My actual goal is to train a model with the following loss: $\tilde{\Theta} = \arg\min_{\hat{\Delta}} \| f_{...}$
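For the proxy goal, here is a minimal PyTorch sketch of a wrapper layer implementing h = Wx + BAPx. This is not PEFT's own tuner API; the class name, the fixed per-module P tensor, and the scaling are assumptions for illustration.

```python
import torch
import torch.nn as nn

class LoRAWithP(nn.Module):
    """Illustrative wrapper (not PEFT's API): h = W x + (alpha/r) * B A P x."""
    def __init__(self, base_linear: nn.Linear, r: int, P: torch.Tensor, alpha: float = 1.0):
        super().__init__()
        self.base = base_linear                         # frozen W (and bias)
        for p in self.base.parameters():
            p.requires_grad_(False)
        in_f, out_f = base_linear.in_features, base_linear.out_features
        self.lora_A = nn.Linear(in_f, r, bias=False)    # A: in_f -> r
        self.lora_B = nn.Linear(r, out_f, bias=False)   # B: r -> out_f
        nn.init.zeros_(self.lora_B.weight)              # so the wrapper starts out as plain W
        self.register_buffer("P", P)                    # fixed (in_f, in_f) projection from P_map
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x @ P.T applies P to each row vector, i.e. computes P x per sample
        return self.base(x) + self.lora_B(self.lora_A(x @ self.P.T)) * self.scaling
```

With PEFT itself this would likely require a custom LoRA layer subclass, plus looking up the right entry of P_map by module name when each layer is wrapped.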
1
vote
0
answers
53
views
BLIP Fine-Tuning: Special Token Always Biased to One Class in Generated Caption
I'm trying to fine-tune Hugging Face BLIP (Bootstrapping Language-Image Pre-training) to classify pizza boxes as either recyclable (clean) or non-recyclable (contaminated) by generating captions that ...
0
votes
0
answers
75
views
Trainer is failing to load optimizer save state when resuming training
Intro to the problem
I am trying to train Llama-3.1 8B on an H100 but I keep running into the following error when trying to resume training
...
File "/home/jovyan/folder/training/.venv/lib/...
-2
votes
1
answer
56
views
Fine-tuning a model with the Trainer API | TypeError: object of type 'NoneType' has no len()
I am using the Hugging Face Trainer API.
transformers version==4.31.0
torch==2.0.1
accelerate==0.27.0
I'm trying to fine-tune a TimeSformer model for video classification using the Hugging Face ...
2
votes
2
answers
4k
views
TypeError in SFTTrainer: Unexpected Keyword Arguments (packing, dataset_text_field, max_seq_length)
I'm trying to fine-tune a model using SFTTrainer from trl, but I'm facing multiple TypeError issues related to unexpected keyword arguments.
from transformers import TrainingArguments
from trl import ...
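For reference, in recent trl releases these keyword arguments were moved off SFTTrainer and onto SFTConfig. A hedged sketch, assuming a recent trl and a dataset with a "text" column (in the newest versions the length field may be named max_length rather than max_seq_length):

```python
from trl import SFTConfig, SFTTrainer

# packing / dataset_text_field / max_seq_length now belong to SFTConfig, not SFTTrainer
config = SFTConfig(
    output_dir="out",
    dataset_text_field="text",    # column holding the raw training text
    max_seq_length=512,           # may be called max_length in the newest trl
    packing=False,
)

trainer = SFTTrainer(
    model="gpt2",                 # illustrative model id; use your own
    train_dataset=train_dataset,  # your dataset with a "text" column
    args=config,
)
trainer.train()
```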
0
votes
1
answer
715
views
How to fix `Index put requires the source and destination dtypes match` with `google/gemma-2-2b` in Transformers?
I’m trying to train a language model using google/gemma-2-2b with the Hugging Face Transformers Trainer. The same training script works fine for other models like gpt2 and meta-llama/Meta-Llama-3-8B, ...
2
votes
1
answer
101
views
How to get a custom column in the model's forward() function when training with the Hugging Face Trainer?
I am using the Hugging Face Trainer to train a custom model subclassing a Llama LLM. After tokenization, my dataset has the fields 'input_ids', 'labels' and so on, and I additionally add 2 ...
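One common pattern, sketched below with illustrative class and field names: keep the extra columns with remove_unused_columns=False and accept them as keyword arguments in forward().

```python
from transformers import LlamaForCausalLM, TrainingArguments

class MyLlama(LlamaForCausalLM):                        # illustrative subclass name
    def forward(self, input_ids=None, labels=None, my_extra_field=None, **kwargs):
        # my_extra_field only shows up in the batch if the Trainer keeps it (see below)
        return super().forward(input_ids=input_ids, labels=labels, **kwargs)

args = TrainingArguments(
    output_dir="out",
    remove_unused_columns=False,   # stop the Trainer from dropping columns forward() "doesn't know"
)
```

The extra columns also have to survive the data collator, so a custom collator may be needed if they are not fixed-size tensors.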
0
votes
1
answer
630
views
Hugging Face Trainer is not showing any progress for fine-tuning
I have a dataset I want to fine-tune a Hugging Face LLM with.
This dataset is quite simple. It has two columns: one column has DNA sequences (each in the form of a string 5000 letters long). Another ...
0
votes
0
answers
811
views
SSL Certificate Verification Error with Hugging Face Transformers CLI
I'm trying to download the TheBloke/falcon-40b-instruct-GPTQ model using the Hugging Face Transformers CLI in PowerShell on Windows 10, but I consistently encounter an SSL certificate error. It ...
0
votes
0
answers
64
views
Wrong padding tokens in HF model prediction
Please consider the following code:
from datasets import load_dataset_builder, load_dataset
import numpy as np
import os
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, ...
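A frequent cause of "wrong padding" tokens in decoded predictions is the -100 ignore index used to pad labels. A sketch of the usual fix inside compute_metrics, assuming a seq2seq setup with a tokenizer already in scope:

```python
import numpy as np

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # -100 is the ignore index used to pad labels; it is not a valid token id,
    # so replace it with the pad token before decoding.
    preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    return {"n_examples": len(decoded_preds)}   # plug real metrics in here
```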
0
votes
1
answer
587
views
How to Log Training Loss at Step Zero in Hugging Face Trainer or SFT Trainer?
I'm using the Hugging Face Trainer (or SFTTrainer) for fine-tuning, and I want to log the training loss at step 0 (before any training steps are executed). I know there's an eval_on_start option for ...
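There is no built-in train_on_start counterpart, but a callback can run one forward pass before training and report the loss at step 0. A sketch (the callback name is illustrative, and it assumes the batch tensors fit on the model's device):

```python
import torch
from transformers import TrainerCallback

class LogInitialLossCallback(TrainerCallback):
    """Log the training loss of one batch before any optimizer step (step 0)."""
    def on_train_begin(self, args, state, control, model=None, train_dataloader=None, **kwargs):
        if model is None or train_dataloader is None:
            return
        batch = next(iter(train_dataloader))
        batch = {k: v.to(model.device) for k, v in batch.items() if isinstance(v, torch.Tensor)}
        model.eval()
        with torch.no_grad():
            loss = model(**batch).loss
        model.train()
        print({"step": 0, "train_loss": float(loss)})
```

Register it with callbacks=[LogInitialLossCallback()] when constructing the Trainer or SFTTrainer.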
3
votes
0
answers
213
views
How to Log Custom Metrics with Metadata in Hugging Face Trainer during Evaluation?
I'm working on a sentence regression task using Hugging Face’s Trainer. Each sample consists of:
input_ids: The tokenized sentence.
labels: A numerical scalar target (for regression).
metadata: A ...
1
vote
1
answer
312
views
How to add EOS when training T5?
I'm a little puzzled about where (and whether) EOS tokens are being added when using Hugging Face's trainer classes to train a T5 (LongT5, actually) model.
The data set contains pairs of text like this:
from
to
...
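For context, the T5/LongT5 tokenizer appends </s> automatically when add_special_tokens is left at its default. A sketch of a preprocessing function that verifies this and appends eos_token_id if it is missing (the checkpoint and column names are illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")  # illustrative checkpoint

def preprocess(example):
    model_inputs = tokenizer(example["from"], truncation=True)
    labels = tokenizer(text_target=example["to"], truncation=True)
    # The T5 tokenizer appends </s> by default; double-check and append it if it is missing.
    if labels["input_ids"][-1] != tokenizer.eos_token_id:
        labels["input_ids"].append(tokenizer.eos_token_id)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```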
0
votes
1
answer
62
views
Seq2Seq trainer.train() keeps giving indexing error
I am trying to do machine translation from Hindi to Sanskrit using the NLLB model, but I keep getting the error:
IndexError: Invalid key: 39463 is out of bounds for size 0.
The error is coming when ...
1
vote
1
answer
534
views
How to add accelerator launch to VS Code Debugger?
Keep getting the error in my terminal:
ConnectionRefusedError: [Errno 111] Connection refused
I got the above error by trying to add in this command:
accelerate launch --num_processes=1 --...
0
votes
1
answer
237
views
PanicException: AddedVocabulary bad split AFTER adding tokens to BertTokenizer
I use a BertTokenizer and add my custom tokens using the add_tokens() function.
Minimal sample code here:
checkpoint = 'fnlp/bart-base-chinese'
tokenizer = BertTokenizer.from_pretrained(checkpoint)
...
0
votes
0
answers
91
views
Transformers Trainer: Tried to track the number of tokens seen, however the current model is not configured properly to know what item is the input
I'm receiving this error from HuggingFace's Trainer:
Tried to track the number of tokens seen, however the current model is not configured properly to know what item is the input. To fix this, add a `...
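The error is typically raised by the token-counting option, which reads model.main_input_name to learn which batch key is the input. A sketch of the likely fix for a custom model (an assumption based on the error text; `model` is your custom model instance):

```python
from transformers import TrainingArguments

# The token counter reads model.main_input_name; a custom model that does not
# inherit it from PreTrainedModel can set the attribute explicitly.
model.main_input_name = "input_ids"          # or "pixel_values", "input_values", ...

args = TrainingArguments(
    output_dir="out",
    include_num_input_tokens_seen=True,      # the option that triggers this check
)
```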
2
votes
0
answers
306
views
GliNER finetuning - no validation loss is logging
I am trying to fine-tune using the GLiNER/examples/finetune.ipynb notebook from the urchade/GLiNER repository on GitHub.
However, the logs only show 'loss', which I assume is the training dataset loss, ...
1
vote
1
answer
367
views
How to pass a PyTorch DataLoader to the Hugging Face Trainer? Is that even possible?
The usual steps to use the Trainer from Hugging Face require that you:
Load the data
Tokenize the data
Pass tokenized data to Trainer
MWE:
data = generate_random_data(10000) # Generate 10,000 samples
...
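The Trainer builds its own DataLoader from train_dataset, but a subclass can hand back a pre-built one instead. A sketch (simplest in single-process training, since distributed runs expect the Trainer/Accelerate to prepare the loader):

```python
from torch.utils.data import DataLoader
from transformers import Trainer

class DataLoaderTrainer(Trainer):
    """Illustrative subclass: hand the Trainer a pre-built PyTorch DataLoader."""
    def __init__(self, *args, custom_train_dataloader: DataLoader = None, **kwargs):
        super().__init__(*args, **kwargs)
        self.custom_train_dataloader = custom_train_dataloader

    def get_train_dataloader(self) -> DataLoader:
        # The Trainer normally builds this from train_dataset; returning our own bypasses that.
        if self.custom_train_dataloader is not None:
            return self.custom_train_dataloader
        return super().get_train_dataloader()
```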
2
votes
0
answers
1k
views
Implementing a weighted loss function in SFTTrainer
Currently you can let SFTTrainer teach your models to learn to predict every token in your dataset, or you can let it train on "completions only", using the DataCollatorForCompletionOnlyLM ...
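A weighted loss generally means subclassing and overriding compute_loss. A sketch assuming a hypothetical per-token token_weights field carried through the batch (which in turn requires remove_unused_columns=False and a collator that keeps and pads it):

```python
import torch.nn.functional as F
from trl import SFTTrainer

class WeightedLossSFTTrainer(SFTTrainer):
    """Illustrative: per-token weights via an overridden compute_loss."""
    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        token_weights = inputs.pop("token_weights", None)   # hypothetical extra batch field
        outputs = model(**inputs)
        logits = outputs.logits[:, :-1, :]                  # predict token t+1 from position t
        labels = inputs["labels"][:, 1:]
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            labels.reshape(-1),
            ignore_index=-100,
            reduction="none",
        )
        if token_weights is not None:
            loss = loss * token_weights[:, 1:].reshape(-1).to(loss.dtype)
        mask = (labels.reshape(-1) != -100).float()
        loss = (loss * mask).sum() / mask.sum().clamp(min=1.0)
        return (loss, outputs) if return_outputs else loss
```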
0
votes
0
answers
40
views
RuntimeError: stack expects each tensor to be equal size, but got [91] at entry 0 and [23] at entry 1
I tried to use the following code to train my model, but I get the following issue.
Here's the code:
import torch
import pyarrow as pa
import pandas as pd
from transformers import ...
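This error usually means the default collator tried to torch.stack variable-length examples. A sketch using DataCollatorWithPadding so every batch is padded to its longest sequence before stacking (checkpoint and variable names are illustrative):

```python
from transformers import AutoTokenizer, DataCollatorWithPadding, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")   # illustrative checkpoint

# Pad every batch to its longest example so the collator can stack tensors of one size.
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

trainer = Trainer(
    model=model,                       # your model
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,       # tokenized examples of varying length
    data_collator=data_collator,
)
```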
1
vote
1
answer
1k
views
`AcceleratorState` object has no attribute `distributed_type`
I am trying to use an Accelerator with a Trainer using the code below:
tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path)
config = AutoConfig.from_pretrained(model_args....
1
vote
1
answer
256
views
Progress bar when launching training jobs in SageMaker does not match with the number of steps for the full training
Background
I am fine-tuning a mistral-7B-instruct-v01 model using the same workflow as is outlined in these two blog posts (using SageMaker):
How to Fine-Tune LLMs in 2024 with Hugging Face
Train and ...
1
vote
1
answer
2k
views
Batch and Epoch training metrics for transformers Trainer
There are several ways to get metrics for transformers.Trainer, but only for evaluation and not for training. I found answers scattered across different posts, such as this post.
But ...
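One approach is a callback that captures the training 'loss' entries the Trainer already emits at each logging step. A sketch:

```python
from transformers import TrainerCallback

class TrainingLossLogger(TrainerCallback):
    """Collect the training loss the Trainer reports at each logging step."""
    def __init__(self):
        self.history = []

    def on_log(self, args, state, control, logs=None, **kwargs):
        if logs and "loss" in logs:
            self.history.append(
                {"step": state.global_step, "epoch": state.epoch, "loss": logs["loss"]}
            )
```

Register it with callbacks=[TrainingLossLogger()] and set logging_steps (or logging_strategy="epoch") in TrainingArguments to control the granularity of the collected entries.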
0
votes
1
answer
310
views
Upgrading accelerate while using Trainer class
I am facing an issue while using the Trainer class with PyTorch on Google Colab: it demands accelerate>=0.21.0 even though I have updated all the requirements. Is there any alternative to it?
"...