{"id":1033,"hash":"2fad2926fc6ec9667c4320a094041296cbe81cc48f226abd051a4f7c42a5bc39","pattern":"How can I make sentence-BERT throw an exception if the text exceeds max_seq_length, and what is the max possible max_seq_length for all-MiniLM-L6-v2?","full_message":"I'm using sentence-BERT from Huggingface in the following way:\n\nfrom sentence_transformers import SentenceTransformer\nmodel = SentenceTransformer('all-MiniLM-L6-v2')\nmodel.max_seq_length = 512\nmodel.encode(text)\n\nWhen text is long and contains more than 512 tokens, it does not throw an exception. I assume it automatically truncates the input to 512 tokens.\n\nHow can I make it throw an exception when the input length is larger than max_seq_length?\n\nFurther, what is the maximum possible max_seq_length for all-MiniLM-L6-v2?","ecosystem":"pypi","package_name":"nlp","package_version":null,"solution":"First of all, it should be noted that the sentence transformer supports a different sequence length than the underlying transformer. You can check both values with:\n\n# that's the sentence transformer\nprint(model.max_seq_length)\n# that's the underlying transformer\nprint(model[0].auto_model.config.max_position_embeddings)\n\nOutput:\n\n256\n512\n\nThat means the position embedding layer of the underlying transformer has 512 positions, but the sentence transformer was trained with, and will only use, the first 256 of them. Therefore, you should be careful about increasing the value above 256. It will work from a technical perspective, but the position embedding weights beyond 256 were never properly trained and can therefore distort your results. Please also check this StackOverflow post.\n\nRegarding throwing an exception: that is not offered by the library, so you have to write a workaround yourself. Tokenize the text first and compare the token count against max_seq_length before encoding:\n\nfrom sentence_transformers import SentenceTransformer\n\nmodel = SentenceTransformer('all-MiniLM-L6-v2')\n\nmy_text = \"this is a test \" * 1000\n\n# Tokenize without truncation and check the length before encoding\ntokens = model[0].tokenizer(my_text, return_attention_mask=False, return_token_type_ids=False)\nif len(tokens.input_ids) > model.max_seq_length:\n    raise ValueError(f\"Input has {len(tokens.input_ids)} tokens, exceeding max_seq_length of {model.max_seq_length}\")\n\nmodel.encode(my_text)","confidence":0.95,"source":"stackoverflow","source_url":"https://stackoverflow.com/questions/75901231/how-can-i-make-sentence-bert-throw-an-exception-if-the-text-exceeds-max-seq-leng","votes":12,"created_at":"2026-04-19T04:52:12.295467+00:00","updated_at":"2026-04-19T04:52:12.295467+00:00"}