🚨BREAKING: New Python library for agentic data processing and ETL with AI
Introducing DocETL.
Here's what you need to know:
1. What is DocETL?
It's a tool for creating and executing data processing pipelines, especially suited for complex document processing tasks.
It offers:
- An interactive UI playground
- A Python package for running production pipelines
2. DocWrangler
DocWrangler is DocETL's interactive playground. It helps you iteratively develop your pipeline:
- Experiment with different prompts and see results in real-time
- Build your pipeline step by step
- Export your finalized pipeline configuration for production use
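Once exported, a DocETL pipeline is just a YAML config that the `docetl` CLI executes. A rough sketch of the shape such a config takes (the keys below are an approximation for illustration; check DocETL's docs for the real schema):

```yaml
# Hypothetical sketch of a DocETL pipeline config.
# Verify key names against the official DocETL documentation before use.
datasets:
  reviews:
    type: file
    path: reviews.json

operations:
  - name: extract_complaints
    type: map          # LLM-powered map over each document
    prompt: |
      Extract the main complaint from this review: {{ input.text }}
    output:
      schema:
        complaint: string

pipeline:
  steps:
    - name: analyze
      input: reviews
      operations:
        - extract_complaints
  output:
    type: file
    path: complaints.json
```

The point of DocWrangler is that you iterate on the prompts and operations interactively, then export a config like this for production runs.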
These 7 statistical analysis concepts have helped me as an AI Data Scientist.
Let's go: 🧵
1. Learn Descriptive Statistics
Mean, median, mode, variance, standard deviation. These summarize a dataset and reveal its variability. Every data scientist needs them to understand what's actually in the data in front of them.
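All five descriptive statistics above ship with Python's standard library, so you can compute them without any extra dependencies (sample data is made up for illustration):

```python
import statistics

data = [12, 15, 15, 18, 20, 22, 22, 22, 35]

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequent value
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(data)      # sample standard deviation

print(f"mean={mean:.2f} median={median} mode={mode} "
      f"variance={variance:.2f} std={std_dev:.2f}")
```

Note these are the *sample* variance and standard deviation; use `statistics.pvariance` / `statistics.pstdev` when your data is the whole population.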
2. Learn Probability
Know your distributions (Normal, Binomial) and Bayes' Theorem: the backbone of modeling and reasoning under uncertainty. The Central Limit Theorem is a must too.
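Bayes' Theorem in action: P(A|B) = P(B|A) · P(A) / P(B). The classic diagnostic-test example below (the prevalence and error rates are illustrative numbers, not real data) shows why a positive result from an accurate test can still mean a low posterior probability:

```python
# Illustrative numbers: a condition with 1% prevalence, tested with
# 95% sensitivity and a 5% false-positive rate.
p_condition = 0.01
p_pos_given_condition = 0.95       # sensitivity: P(positive | condition)
p_pos_given_no_condition = 0.05    # false-positive rate

# Law of total probability: P(positive)
p_pos = (p_pos_given_condition * p_condition
         + p_pos_given_no_condition * (1 - p_condition))

# Bayes' Theorem: P(condition | positive)
p_condition_given_pos = p_pos_given_condition * p_condition / p_pos
print(f"P(condition | positive) = {p_condition_given_pos:.3f}")  # ~0.161
```

Despite the test being 95% sensitive, the posterior is only about 16%, because true positives are swamped by false positives from the 99% of people without the condition.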
🚨Introducing Agent Development Kit (ADK) by Google
A simple framework from Google for building, evaluating, and deploying AI agents.
Here's what you need to know (a thread): 🧵
1. What is ADK?
Agent Development Kit (ADK) is a framework from Google for building, evaluating, and deploying AI agents.
It is “model-agnostic” and “deployment-agnostic”: although it’s optimized to work well with Google’s models and infrastructure (like Gemini, Vertex AI), you can use it with other models and deploy your agents in other environments.
2. Key Features
- Structured agent definitions
- Modular design: compose agents from reusable parts
- Tooling & integrations
- Support for multi-agent systems
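ADK's real API lives in the `google-adk` package; independent of that, the structured agent + tools + sub-agents pattern the features above describe can be sketched in plain Python. Every name below is illustrative only, NOT ADK's actual API:

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative sketch of the agent/tool/sub-agent pattern.
# These class and function names are made up; they are not ADK's API.

@dataclass
class Agent:
    name: str
    instruction: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    sub_agents: list["Agent"] = field(default_factory=list)

    def handle(self, task: str) -> str:
        # A real framework would have an LLM choose the tool or
        # sub-agent; here we route by simple keyword matching.
        for keyword, tool in self.tools.items():
            if keyword in task:
                return tool(task)
        for sub in self.sub_agents:
            if sub.name in task:
                return sub.handle(task)
        return f"{self.name}: no handler for {task!r}"

def word_count(task: str) -> str:
    return f"{len(task.split())} words"

# Multi-agent setup: a root agent delegating to a specialized sub-agent.
helper = Agent("counter", "Count words", tools={"count": word_count})
root = Agent("root", "Delegate work", sub_agents=[helper])

print(root.handle("counter: count the words in this sentence"))
```

The structure is what matters: agents are declared as data (name, instruction, tools, sub-agents), which is what makes them modular and composable into multi-agent systems.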