Tools like ChatGPT and GitHub Copilot can help debug complex code and replace Googling and searching Stack Overflow for common scripting tasks.
Key skill: ChatGPT prompting (more on this in my free ChatGPT for Data Scientists)
2. Code Quality & Documentation
Great products have great documentation. AI can help produce documentation, comment code, and replace time-consuming manual documentation with automated AI docs.
Outliers have led me to 100s of business insights. But first I had to find them.
In 3 minutes let me kill your confusion. Let's dive into outliers:
1. Outliers
Outliers, or anomalies, are data points that differ significantly from the other observations in a dataset. They often signal key events, which is why they can lead to important insights.
2. Methods
There are 1000s of outlier detection methods. The ones I use fall into 4 categories: statistical, clustering, time series, and machine learning.
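Here's a quick sketch of the statistical category, using the common 1.5 x IQR rule. This is just one illustrative method, not necessarily the one I'd pick for every dataset. The function name `iqr_outliers`, the multiplier `k`, and the `daily_orders` numbers are all made up for the example.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, median, Q3)
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical data: one day with an unusually large order count
daily_orders = [102, 98, 105, 110, 97, 101, 99, 480, 103, 100]
print(iqr_outliers(daily_orders))  # [480]
```

The rule only looks at the spread of the middle 50% of the data, so a single extreme value does not distort the thresholds the way it would with a mean and standard deviation.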
Tableau and PowerBI are getting killed by free AI tools.
Case in Point: Microsoft's AI Data Formulator.
100% free in Python. Let's dive in:
1. Data Formulator: Create Rich Visualizations with AI
Data Formulator is an AI-powered tool for data analysts to iteratively create rich visualizations.
Data Formulator is an application from Microsoft Research that uses large language models to transform data, expediting the practice of data visualization.
2. A Novel Approach to Business Intelligence
Unlike most chat-based AI tools, where users need to describe everything in natural language, Data Formulator combines user interface (UI) interactions with natural language (NL) inputs. This blended approach makes it easier for users to describe their chart designs while delegating data transformation to AI.
Google just dropped a new Generative AI Python library for SQL Databases.
Introducing Google GenAI Toolbox.
This is what you need to know:
1. Meet the Google GenAI Toolbox
An open-source server designed to simplify building Gen AI tools for your databases. It streamlines development, letting you integrate powerful data tools with just a few lines of code.
2. The Toolbox handles the heavy lifting
Managing connection pooling, authentication, and more, so you can focus on creating innovative Gen AI applications without reinventing the wheel.
There's a new Python library that automates Business Intelligence with AI using Text2SQL.
Let me introduce you to WrenAI:
1. Meet WrenAI
WrenAI is the future of Generative Business Intelligence (GenBI). It transforms complex data into intuitive insights through a conversational, no-code interface.
2. Text2SQL Engine
With its advanced Text-to-SQL engine, WrenAI lets you ask questions in plain language and instantly translates them into actionable queries, democratizing data access for everyone.
Forecasting time series is what made me stand out as a data scientist.
But it took me 1 year to master ARIMA.
In 1 minute, I'll evaporate your confusion. Let's go.
1. Autoregressive Forecast Models
ARIMA and SARIMA are both statistical models used for forecasting time series data, where the goal is to predict future points in the series. They implement a concept called autoregression.
2. ARIMA Decomposed:
AR-I-MA stands for Autoregressive (AR), Integrated (I), Moving Average (MA).
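To make the AR and I pieces concrete, here's a minimal pure-Python sketch. It assumes a toy AR(1) model fit by ordinary least squares, which is a simplification: real ARIMA implementations estimate the AR and MA terms jointly, and the MA part is left out here entirely. The names `difference` and `fit_ar1` and the toy series are hypothetical.

```python
def difference(series):
    """The 'I' step: first-order differencing, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """The 'AR' step: least-squares fit of x[t] = c + phi * x[t-1]."""
    x, y = series[:-1], series[1:]  # lagged values vs. current values
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi


# Hypothetical noise-free series generated by x[t] = 2 + 0.5 * x[t-1]
series = [10.0]
for _ in range(20):
    series.append(2 + 0.5 * series[-1])

c, phi = fit_ar1(series)
next_value = c + phi * series[-1]  # one-step-ahead forecast

print(difference([1, 3, 6, 10]))  # [2, 3, 4]
```

Because the toy series has no noise, the fit recovers the generating coefficients (c close to 2, phi close to 0.5). On real data you would difference first if the series trends, then fit on the differenced values.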
In 3 minutes, I'll demolish your confusion. Let's go:
1. Time Series Analysis:
Time series analysis is a statistical technique that deals with time-ordered data points. It's commonly used to analyze and interpret trends, patterns, and relationships within data that is recorded over time (e.g. with timestamps).
2. Uses:
Understanding and applying time series analysis concepts is critical for forecasting, detecting anomalies, and drawing insights on data that varies over time.
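As a small illustration of pulling a trend out of time-ordered data, here's a trailing moving average in plain Python. It's a sketch of one basic technique, assuming evenly spaced observations. The `moving_average` helper and the `readings` values are invented for the example.

```python
def moving_average(values, window):
    """Trailing mean over the last `window` points (output is shorter than input)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical evenly spaced readings with an upward drift and some wobble
readings = [10, 12, 11, 13, 15, 14, 16]
trend = moving_average(readings, window=3)
print(trend)  # [11.0, 12.0, 13.0, 14.0, 15.0]
```

Smoothing hides the point-to-point wobble and leaves the steady upward trend. A larger window smooths more but reacts more slowly to real changes.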