Grok 4.1 Fast is not respecting the "Thinking" Toggle

Description:

Hi support team,

I've noticed a regression in the "Thinking" vs "Instant" toggle specifically for the Grok 4.1 Fast model in t3-chat. Previously, this worked as expected across models:

  • Instant mode: Immediate response, no visible thinking buffer/streaming.

  • Thinking mode: Thinking steps shown first, then final response.

Now, even with "Instant" toggled on, there's always a thinking buffer delay before the response appears, making the toggle ineffective.

Steps to Reproduce:

  1. Log into t3-chat.

  2. Select the Grok 4.1 Fast model.

  3. Toggle to "Instant" mode (confirm it's active).

  4. Send any simple query (e.g., "Hello").

  5. Observe: a thinking buffer/stream appears before the final response.

Actual Behavior:

  • Both modes show the thinking buffer, which defeats the purpose of Instant mode.

Environment:

  • Browser: Zen (Firefox-based, latest version)

  • OS: Fedora Linux (latest)

  • Model: Grok 4.1 Fast only (other models work fine)

This started recently (after an update?). Not a huge blocker, but the toggle feels broken. Happy to provide screenshots or video, or to test further.

Thanks!

~ Sam

Status: Completed
Board: Feature Request
