While everyone is amazed by the capabilities of ChatGPT, I dug into its API requests to learn more about the request structure. Here is a glimpse of the functionality it might gain in the future - a 🧵 #ChatGPT #OpenAIChat
The entities in a conversation can take one of several roles:
1. unknown
2. user (used for input)
3. assistant (used for replies)
4. system
5. critic
6. tool
These have wild implications for future use cases.
The system will be able to act not only as a companion but also as a critic of the input, replacing the likes of @Grammarly by giving better suggestions. Will it be able to spot mistakes in proofs? What implications does the tool role have? There is much to think about.
According to the API, it can support multiple types of input:
1. text
2. code
3. tether_browsing_code
4. tether_browsing_display
5. tether_quote
6. error
7. stdout
8. stderr
9. image
10. execution_output
11. masked_code
12. masked_text
13. unit_test_result
14. system_error
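To make the two lists concrete, here is a minimal Python sketch that stores the roles and content types as plain sets and checks a message against them. The `build_message` and `tether_types` helpers and the dict layout are assumptions made up for illustration. They are not the actual request schema, which the error message does not reveal.

```python
# Values copied from the two lists in this thread; everything else is hypothetical.
ROLES = {"unknown", "user", "assistant", "system", "critic", "tool"}
CONTENT_TYPES = {
    "text", "code", "tether_browsing_code", "tether_browsing_display",
    "tether_quote", "error", "stdout", "stderr", "image",
    "execution_output", "masked_code", "masked_text",
    "unit_test_result", "system_error",
}


def build_message(role: str, content_type: str, body: str) -> dict:
    """Return a message dict after checking the role and content type.

    The dict layout is an assumed shape for illustration only.
    """
    if role not in ROLES:
        raise ValueError(f"unsupported role: {role!r}")
    if content_type not in CONTENT_TYPES:
        raise ValueError(f"unsupported content type: {content_type!r}")
    return {"role": role, "content": {"content_type": content_type, "body": body}}


def tether_types() -> list[str]:
    """List the content types that share the 'tether_' prefix."""
    return sorted(t for t in CONTENT_TYPES if t.startswith("tether_"))
```

Calling `tether_types()` picks out the three `tether_*` entries, which is a handy way to see how much of the list is devoted to that one family.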
I think the following points are worth noting: 1. Text and code are expected, but image comes as a surprise.
2. What does the tether suite of options mean? There seems to be an effort to compete with the likes of @AdeptAILabs
4. There is a strong focus on coding. It supports a suite of options ranging from error resolution to output/results for explaining the code. Is there a separate model/prompt for these?
In the near future, ChatGPT will be able to both execute actions and critique input!
Here is a screenshot of the error message that reveals these attributes. The same can be reproduced by tinkering with the API.
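Why would an error message reveal this much? A common pattern is that a server validates a field against an enumeration and echoes the permitted values when validation fails. The sketch below simulates that behaviour locally. The `validate_role` function and the wording of its error are assumptions for illustration, not the real backend or its actual message.

```python
from enum import Enum


class Role(str, Enum):
    # Values as listed from the error message in this thread.
    UNKNOWN = "unknown"
    USER = "user"
    ASSISTANT = "assistant"
    SYSTEM = "system"
    CRITIC = "critic"
    TOOL = "tool"


def validate_role(value: str) -> Role:
    """Hypothetical server-side check that lists the allowed values on failure."""
    try:
        return Role(value)
    except ValueError:
        permitted = ", ".join(repr(r.value) for r in Role)
        raise ValueError(
            f"value is not a valid enumeration member; permitted: {permitted}"
        ) from None


# Sending a deliberately invalid value surfaces every allowed role.
try:
    validate_role("not-a-role")
except ValueError as err:
    leaked = str(err)
```

After this runs, `leaked` holds the error text, including roles such as `critic` and `tool` that a normal client would never send.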
I tried to send a base64 image payload, but it seems the backend does not support it yet 😝
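For anyone who wants to repeat the experiment, this is roughly how such a payload can be put together in Python. The field names (`content_type`, `data`) and the `make_image_payload` helper are guesses for illustration. The thread does not establish what shape the backend would expect for an image.

```python
import base64


def make_image_payload(raw: bytes) -> dict:
    """Wrap raw image bytes in a hypothetical 'image' content payload."""
    encoded = base64.b64encode(raw).decode("ascii")
    # Field names are assumptions for illustration, not a known schema.
    return {"content_type": "image", "data": encoded}


# The 8-byte PNG file signature stands in for a real image file here.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
payload = make_image_payload(PNG_SIGNATURE)
```

In a real attempt you would read the bytes from an image file instead of using the signature alone.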
How do OpenAI plugins/Browsing work? A thread with a detailed analysis of the server interaction. S/o to @CrisGiardina for helping with access.
TL;DR: Browsing (and possibly plugins) uses a different model with 8k sequence length support and a Toolformer-like operation structure.
If you look at the model names, this one is called text-davinci-002-browse and supports 8197 max_tokens.
So those of you who do not have access to the GPT-4 8k API but do have access to plugins can get a similar experience with longer context.
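As a quick illustration of what that limit means in practice, here is a small Python sketch that computes how many tokens remain under the reported `max_tokens` value. The dict layout and the `remaining_budget` helper are hypothetical. How the backend actually splits the limit between prompt and reply is not something this thread establishes.

```python
# Figures from the thread; the dict layout is a hypothetical stand-in.
BROWSE_MODEL = {"slug": "text-davinci-002-browse", "max_tokens": 8197}


def remaining_budget(model: dict, used_tokens: int) -> int:
    """Return how many tokens are left under the model's max_tokens limit."""
    if used_tokens < 0:
        raise ValueError("used_tokens must be non-negative")
    # Clamp at zero so an over-long conversation reports no remaining budget.
    return max(model["max_tokens"] - used_tokens, 0)
```

For example, a conversation that has already used 197 tokens would leave 8000 under this limit.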
The process of querying is simple: you chat as in any other ChatGPT session, and the model is smart enough to make web requests. But how? First, let's look at the client-side view of the chat session: