I fine-tuned a BART model (facebook-cnn) for summarization and compared its summaries with those of the pretrained model:
```python
model_before_tuning_1 = AutoModelForSeq2SeqLM.from_pretrained(model_name)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_data,
    eval_dataset=validation_data,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
```
The summaries produced by model and model_before_tuning_1 are different, but when I compare the model configs and/or print(model), the output is exactly the same for both.

How can I find out which parameters this training changed?
You can compare the state_dicts of the two models, i.e. model.state_dict() and model_before_tuning_1.state_dict().

The state_dict contains the learnable parameters, which are what changes during training. For further details see: https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html

Printing the models or their configs gives you identical results because the architecture does not change during training; only the parameter values do.
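As a minimal sketch of that comparison, the helper below walks both state_dicts and reports which parameter tensors differ. It is demonstrated on a tiny stand-in nn.Linear module (so it runs without downloading weights), but the same call works unchanged on the two BART models; the function name changed_parameters is just an illustrative choice, not a library API.

```python
import copy

import torch
from torch import nn


def changed_parameters(model_a, model_b):
    """Return the names of parameters whose values differ between two models."""
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    assert sd_a.keys() == sd_b.keys(), "models must share the same architecture"
    return [name for name in sd_a if not torch.allclose(sd_a[name], sd_b[name])]


# Demo on a tiny model; for the question above you would pass
# `model` and `model_before_tuning_1` instead.
torch.manual_seed(0)
model_before = nn.Linear(4, 2)
model = copy.deepcopy(model_before)

# Simulate training nudging the weight matrix (bias left untouched).
with torch.no_grad():
    model.weight += 0.1

print(changed_parameters(model, model_before))  # → ['weight']
```

For large models you may also want to print the maximum absolute difference per tensor (e.g. `(sd_a[name] - sd_b[name]).abs().max()`) to see how much each parameter moved, not just whether it moved.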
Answered By – Ray