Microsoft plans to relax the Bing Chat limits and chat caps for Balanced mode first, before relaxing those limits in the other modes, said Mikhail Parakhin, CEO of Bing.
He wrote on Twitter, "we want to keep relaxing constraints in every mode." "Right now focusing on getting the balance of Balanced right, then you should expect some further relaxation," he added.
He also said Microsoft is seeing "weird spikes in time-to-first-token we don't understand," and that he wants the team to "stabilize Balanced and get the latency spikes under control first" before turning to the Creative and Precise chat modes.
Here are those tweets:
As I stated previously, we want to keep relaxing constraints in every mode. Right now focusing on getting the balance of Balanced right, then you should expect some further relaxation.
— Mikhail Parakhin (@MParakhin) March 20, 2023
Honestly, I want the team to stabilize Balanced and get the latency spikes under control first. We get these weird spikes in time-to-first-token we don't understand (token generation speed seems fine...).
— Mikhail Parakhin (@MParakhin) March 21, 2023
I did ask Bing Chat about this, and it is going with the PR spin. :)
Forum discussion at Twitter.