How to view the changes in a huggingface model after training?

Issue: I trained a BART model (facebook-cnn) for summarization and compared its summaries with those of a pretrained model:

    model_before_tuning_1 = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    trainer = Seq2SeqTrainer(
        model=model,
        args=training_args,
        data_collator=data_collator,
        train_dataset=train_data,
        eval_dataset=validation_data,
        tokenizer=tokenizer,
        compute_metrics=compute_metrics,
    )
    trainer.train()

Summaries from model() and model_before_tuning_1() are different but when

Continue reading
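One possible way to see what training changed in the question above is to compare the parameter tensors of the two models by name. The following is only a minimal sketch, not the answer from the linked post: the helper `changed_parameters` and its `atol` argument are hypothetical names, and two small `torch.nn.Linear` modules stand in for the BART checkpoints so that the example runs without downloading anything. It assumes the "before" model is a separate object (loaded separately or deep-copied before training), since `trainer.train()` updates the model it is given in place.

```python
import copy

import torch
from torch import nn


def changed_parameters(before: nn.Module, after: nn.Module, atol: float = 0.0) -> dict:
    """Return {parameter name: max absolute difference} for parameters that differ.

    Hypothetical helper for illustration; assumes both modules have the same
    architecture, so every parameter name in `before` also exists in `after`.
    """
    after_params = dict(after.named_parameters())
    changes = {}
    with torch.no_grad():
        for name, p_before in before.named_parameters():
            diff = (after_params[name] - p_before).abs().max().item()
            if diff > atol:
                changes[name] = diff
    return changes


# Toy stand-ins for the pretrained and fine-tuned models (assumption:
# in practice these would be the two seq2seq models from the question).
before = nn.Linear(4, 2)
after = copy.deepcopy(before)

# Simulate a training update that only touches the bias.
with torch.no_grad():
    after.bias.add_(0.5)

for name, diff in changed_parameters(before, after).items():
    print(f"{name}: max abs change {diff:.4f}")
```

With real models the same call would be `changed_parameters(model_before_tuning_1, model)`. An empty result would suggest that the weights did not change at all, for example because both names refer to the same object.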

Tokenizing an HTML document

Issue: I have an HTML document and I’d like to tokenize it using spaCy while keeping HTML tags as a single token. Here’s my code:

    import spacy
    from spacy.symbols import ORTH
    nlp = spacy.load('en', vectors=False, parser=False, entity=False)
    nlp.tokenizer.add_special_case(u'<i>', [{ORTH: u'<i>'}])

Continue reading
