| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| Transformers Repo Install Error | 9 | 34 | June 6, 2025 |
| Stopiteration error | 3 | 97 | June 6, 2025 |
| How many GPU resources do I need for full-fine tuning of the 7b model? | 2 | 5048 | June 5, 2025 |
| Generate: using k-v cache is faster but no difference to memory usage | 5 | 15642 | June 3, 2025 |
| Distributed Training w/ Trainer | 11 | 8799 | June 3, 2025 |
| Grouping by length makes training loss oscillate and makes evaluation loss worse | 2 | 223 | June 3, 2025 |
| How can LLMs be fine-tuned for specialized domain knowledge? | 2 | 205 | June 3, 2025 |
| Implementing Triplet loss in Vit | 1 | 15 | June 3, 2025 |
| Using Huggingface for computer vision (Tensorflow)? | 3 | 404 | June 2, 2025 |
| valueError: Supplied state dict for layers does not contain `bitsandbytes__*` and possibly other `quantized_stats` (when load saved quantized model) | 4 | 696 | May 30, 2025 |
| RGBA -> RGB default background color vs padding color | 1 | 7 | May 30, 2025 |
| Why is Static Cache latency high? | 2 | 12 | May 29, 2025 |
| Error using Trainer with Colab notebook, anyone have a solution? | 1 | 35 | May 29, 2025 |
| LoRA training with accelerate / deepspeed | 3 | 2241 | May 28, 2025 |
| How does Q, K, V differ in LLM? | 1 | 19 | May 28, 2025 |
| The effect of padding_side | 13 | 14289 | May 27, 2025 |
| Prompt caching in pipelines | 1 | 31 | May 27, 2025 |
| GETTING ERROR >> AttributeError: 'InferenceClient' object has no attribute 'post' | 5 | 243 | May 27, 2025 |
| How does Llama For Sequence Classification determine what class corresponds to what label? | 10 | 4796 | May 25, 2025 |
| Best practice for usage of Data Collator For CompletionOnlyLM in multi-turn chat | 2 | 602 | May 25, 2025 |
| How to merge fine-tuned LLaMA-3.1-8B (via LLaMA-Factory) into a single GGUF for LM Studio? | 1 | 27 | May 25, 2025 |
| Generate keeps increasing memory usage on ubuntu | 6 | 33 | May 25, 2025 |
| How does Transformers Library work under the hood? | 1 | 15 | May 22, 2025 |
| Identical Evaluation Metrics for SFT & DPO–Fine-Tuned LoRA Adapter on SeaLLMs-v3-7B | 1 | 14 | May 22, 2025 |
| Create a weighted loss function to handle imbalance? | 3 | 1005 | May 21, 2025 |
| Incorrect total train batch size when using tp_size > 1 and deepspeed | 1 | 33 | May 20, 2025 |
| How do I load a trained checkpoint model? | 1 | 31 | May 20, 2025 |
| Fine tuning on qwen3 | 2 | 308 | May 19, 2025 |
| TokenClassificationPipeline produce entities with "##" characters | 6 | 24 | May 19, 2025 |
| PPO Training does not improve SFT model outputs (Metrics identical before and after PPO) | 1 | 34 | May 19, 2025 |