Claude already 'works' over Excel, but in a naive way - it writes raw Python/openpyxl code to analyze a sheet cell by cell, and it generally lacks a semantic understanding of the content. The coding abstractions are simply too low-level for the agent to do more sophisticated analysis accurately.
Our new LlamaSheets API automatically segments and structures complex Excel sheets into well-formatted 2D tables. This both gives Claude Code immediate semantic awareness of the sheet and lets it run Pandas/SQL over well-structured dataframes.
We've written a guide showing exactly how to use LlamaSheets with coding agents!
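As a rough sketch of the workflow (the client class, method names, and response fields below are hypothetical placeholders, not the actual LlamaSheets API - the guide has the real calls):

```python
import pandas as pd

# Hypothetical client/import - the real LlamaSheets API may differ.
from llama_cloud import LlamaSheets  # placeholder name

client = LlamaSheets(api_key="...")

# Segment a messy workbook into well-formatted 2D tables (placeholder call).
tables = client.extract_tables("quarterly_report.xlsx")

# Each table loads straight into a dataframe, so the coding agent can run
# Pandas/SQL instead of low-level cell-by-cell openpyxl code.
for table in tables:
    df = pd.DataFrame(table.rows, columns=table.columns)  # placeholder fields
    print(df.head())
```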
I’m excited for @OpenAI’s new support for function calling fine-tuning! (@stevenheidel)
Help gpt-3.5 better structure outputs + reason/plan 🤖
Dropping a day-0 release supporting fn fine-tuning + distilling GPT-4 w/ Pydantic in @llama_index ⚡️👇: github.com/run-llama/llam…
Our default way of using @OpenAI function calling is through our pydantic programs: simply specify the pydantic schema, and we use the endpoint to extract a structured output matching that schema.
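A minimal pydantic program looks roughly like this (exact import paths vary across @llama_index versions):

```python
from typing import List
from pydantic import BaseModel
from llama_index.program import OpenAIPydanticProgram

# Target schema for the extraction.
class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]

# Under the hood this hits the OpenAI function calling endpoint and
# parses the response directly into the pydantic object.
program = OpenAIPydanticProgram.from_defaults(
    output_cls=Album,
    prompt_template_str="Generate an example album from the movie {movie_name}.",
    verbose=True,
)
album = program(movie_name="The Shining")
print(album.name, [song.title for song in album.songs])
```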
We can now log these results and collect them as a dataset.
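Concretely, attaching a fine-tuning callback handler captures every prompt/completion pair in OpenAI's chat-messages format (API names as of the 0.8.x-era guide; they may have moved since):

```python
from llama_index import ServiceContext
from llama_index.callbacks import CallbackManager, OpenAIFineTuningHandler
from llama_index.llms import OpenAI

# Log every call made through this LLM.
finetuning_handler = OpenAIFineTuningHandler()
callback_manager = CallbackManager([finetuning_handler])

gpt4_context = ServiceContext.from_defaults(
    llm=OpenAI(model="gpt-4", temperature=0.3),
    callback_manager=callback_manager,
)

# ... run the pydantic programs / agent queries with gpt4_context ...

# Dump the collected calls as a JSONL fine-tuning dataset.
finetuning_handler.save_finetuning_events("finetuning_events.jsonl")
```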
This is *very* WIP - we’re excited to use function calling fine-tuning to explore better agentic reasoning capabilities as well as better RAG systems (we recently added support for structured outputs!)
The fine-tuned model does better than base gpt-3.5 at CoT reasoning.
Example Q: “What is the total fair value of Uber's financial assets as of March 31, 2022?”
gpt-3.5 (left) returns an inaccurate response. The fine-tuned model (right) keeps the CoT going until it finds the actual answer.
Our comprehensive guide (linked above) shows you how to do this.
At a high level, we auto-generate a set of questions over Uber 10-Q filings.
We then log the prompt inputs + outputs for each LLM call made by a GPT-4 agent.
We use this data to finetune gpt-3.5.
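Sketching those three steps with the 0.8.x-era APIs from the guide (names may have shifted since; `uber_10q_docs` is a stand-in for your loaded filings):

```python
from llama_index.evaluation import DatasetGenerator
from llama_index.finetuning import OpenAIFinetuneEngine

# 1. Auto-generate questions over the Uber 10-Q documents
#    (uber_10q_docs is a placeholder for your loaded filings).
dataset_generator = DatasetGenerator.from_documents(uber_10q_docs)
questions = dataset_generator.generate_questions_from_nodes(num=40)

# 2. Answer the questions with a GPT-4 agent that has the
#    OpenAIFineTuningHandler attached (see above), logging each call.

# 3. Fine-tune gpt-3.5 on the logged GPT-4 traces.
finetune_engine = OpenAIFinetuneEngine(
    "gpt-3.5-turbo",
    "finetuning_events.jsonl",
)
finetune_engine.finetune()
ft_llm = finetune_engine.get_finetuned_model(temperature=0.3)
```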