When a vLLM serving error occurs, such as a request exceeding the maximum context length of a tiny LLM, the output comes back empty; the error then traces back to the agent client:
🖇 AgentOps: [OPENAI WRAPPER] Error in chat_completion_stream_wrapper: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 4608 tokens. However, you requested 6859 tokens (6347 in the messages, 512 in the completion). Please reduce the length of the messages or completion. None", 'type': 'BadRequestError', 'param': None, 'code': 400}
For example, when using the custom agent in the customized cal_x example, this issue happens at the end of the first batch of the multiprocess task.
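One client-side workaround is to check the prompt length before sending the request. Below is a minimal sketch, assuming the 4608-token window and 512-token completion budget from the error above; the checkpoint name is a placeholder for whatever model the server is actually running:

```python
# Pre-flight length check before calling the vLLM OpenAI-compatible endpoint.
# The 4608/512 budgets come from the BadRequestError above; the checkpoint
# name below is a placeholder, not the one used in this repro.
from transformers import AutoTokenizer

MAX_CONTEXT = 4608      # model's maximum context length (from the 400 error)
MAX_COMPLETION = 512    # max_tokens requested for the completion

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")  # placeholder

def fits_context(messages: list[dict]) -> bool:
    """Return True if the chat prompt plus the completion budget fits the window."""
    prompt_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True
    )
    return len(prompt_ids) + MAX_COMPLETION <= MAX_CONTEXT
```

Oversize prompts can then be truncated or the rollout skipped, instead of receiving an empty completion. If the oversize request does go through, the empty completion later surfaces on the training side as a zero-length sequence, which crashes the Qwen2 forward pass: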
File "/workspace/workspace/agent-lightning/.venv/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 154, in forward
query_states = self.q_proj(hidden_states).view(hidden_shape).transpose(1, 2)
RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1, 128] because the unspecified dimension size -1 can be any value and is ambiguous
...
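Until the root cause is fixed, one possible mitigation is to drop empty rollouts before they reach the trainer. This is a sketch with hypothetical field names, not the actual agent-lightning API:

```python
# Hedged workaround: filter out rollouts whose completion came back empty
# (e.g. after a 400 from the server) so a zero-element tensor never reaches
# modeling_qwen2's q_proj reshape. `response_ids` is a hypothetical field name.
def drop_empty_rollouts(rollouts: list[dict]) -> list[dict]:
    kept = [r for r in rollouts if len(r.get("response_ids", [])) > 0]
    dropped = len(rollouts) - len(kept)
    if dropped:
        print(f"Dropped {dropped} empty rollout(s) caused by serving errors")
    return kept
```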