Data science and engineering manager at Apple | #rstats & #Julialang | 📦 dev | ❤️ time-series analysis & forecasting | Author. Opinions are my own
Mar 21 • 6 tweets • 3 min read
NeuralForecast is a new #Python library for time series forecasting with deep learning by @nixtlainc.
The package provides tools and applications for preprocessing time series data, statistical tests, modeling, #forecasting, and performance benchmarking. 🧵 👇🏼 #machinelearning
The package uses PyTorch and PyTorch Lightning on the backend.
The package is at a fairly early stage; the first release, version 0.0.7, came out two days ago.
Deepchecks is a #Python library for #MLOps applications. It provides functions and applications for data integrity, machine learning model validation, and performance evaluation.
V0.5.0 key functionality thread 👇🏼
#MachineLearning #DataScience
Version 0.5.0 main feature: integration with the Weights & Biases #MLOps platform. This new functionality enables exporting tests and other validation outputs created with the package to the W&B platform with the .to_wandb function 👇🏼
Mar 16 • 6 tweets • 3 min read
New release for PyCaret time series module!
PyCaret is a low-code #Python package for supervised and unsupervised machine learning and time series forecasting. New features: 🧵👇🏼
#TimeSeries #DataScience #MachineLearning
The recent pycaret-ts-alpha release includes (1/3):
✅ Support for univariate forecasting with exogenous variables ❤️
✅ Croston model added for intermittent demand
✅ Support for multiple seasonal periods in models that support it (e.g. TBATS)
✅ Difference plots with diagnostics
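For readers unfamiliar with the Croston model mentioned in the list above, here is a minimal from-scratch sketch in plain Python. It is not PyCaret's implementation or API; the function name `croston_forecast` and the smoothing parameter `alpha` are assumptions made for illustration. The method smooths the non-zero demand sizes and the intervals between them separately, then forecasts their ratio as the expected demand per period.

```python
def croston_forecast(demand, alpha=0.1):
    """Croston's method: forecast the per-period demand rate of an intermittent series.

    Hypothetical helper for illustration only (not part of PyCaret).
    """
    size = None       # smoothed size of non-zero demands
    interval = None   # smoothed number of periods between non-zero demands
    periods_since_demand = 1

    for y in demand:
        if y > 0:
            if size is None:
                # Initialise with the first observed demand and its position.
                size = float(y)
                interval = float(periods_since_demand)
            else:
                # Update both estimates with simple exponential smoothing.
                size = alpha * y + (1 - alpha) * size
                interval = alpha * periods_since_demand + (1 - alpha) * interval
            periods_since_demand = 1
        else:
            periods_since_demand += 1

    if size is None:
        return 0.0  # no demand was ever observed
    return size / interval
```

For example, `croston_forecast([0, 0, 5, 0, 0, 5, 0, 0, 5])` gives about 1.67 units per period, since a demand of 5 arrives every 3 periods.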
AstraZeneca recently released ChemicalX - an open-source Python library for deep learning applications for drug pair scoring. 🧵 👇🏼
#python #PyTorch #pharma #DeepLearning #datascience
The package uses PyTorch and TorchDrug on the backend (see links 👇🏼) for applications such as drug-drug interaction, polypharmacy side effects, and synergy prediction.
License: Apache 2.0
Feb 9 • 8 tweets • 5 min read
Geographic Data Science with R!
R for Geographic Data Science, by @maps4thought, provides an introduction to data science with applications for geographic data. 🧵
Image credit: from the book. #rstats #DataScientists #MachineLearning #Stats
The book follows the syllabus of the "R for Data Science" course at the School of Geography, @uniofleicester. The book is still a work in progress, and a draft version is available online.
Feb 8 • 6 tweets • 4 min read
Bayesian Generalized Linear Models with Julia!
TuringGLM is a new Julia package for GLMs with a Bayesian flavor ❤️. As its name implies, the package uses the Turing package on the backend as the regression engine. 🧵 👇🏼
#JuliaLang #TuringLang #DataScience #Stats
It lets you specify Bayesian Generalized Linear Models using the formula syntax and returns an instantiated Turing model.
The package is inspired by R's brms and Python's bambi packages (see links 👇🏼).
The 3rd edition of Speech and Language Processing, by Prof. Dan Jurafsky (@jurafsky) and Prof. James H. Martin, is now available online (draft version). 🧵 👇🏼
#NLP #DataScience #DeepLearning #MachineLearning
Image credit: from the book
The book covers core NLP topics, including (1/2):
✅ Text preprocessing - regular expressions, text normalization, N-gram models
✅ Sentiment analysis and classification methods
✅ Constituency grammars and parsing
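To make the text preprocessing topics in the list above concrete, here is a small illustrative sketch using only the Python standard library. It is not taken from the book; the function names `normalize`, `bigram_counts`, and `bigram_probability` are hypothetical. It normalizes text with a regular expression and then estimates a bigram (2-gram) probability by maximum likelihood, with no smoothing.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase the text and keep only runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_counts(tokens):
    """Count each pair of adjacent tokens."""
    return Counter(zip(tokens, tokens[1:]))


def bigram_probability(tokens, first, second):
    """Maximum-likelihood estimate of P(second | first), without smoothing."""
    bigrams = bigram_counts(tokens)
    # Number of bigrams that start with `first`.
    first_total = sum(c for (w1, _), c in bigrams.items() if w1 == first)
    if first_total == 0:
        return 0.0
    return bigrams[(first, second)] / first_total


tokens = normalize("The cat sat. The cat ran! The dog sat.")
p_cat_given_the = bigram_probability(tokens, "the", "cat")
```

In this toy corpus "the" is followed by "cat" twice and by "dog" once, so the estimate of P(cat | the) is 2/3.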
Feb 6 • 6 tweets • 4 min read
Grammar of graphics with Julia (part 2)!
Gadfly is another Julia package that follows the grammar of graphics. Similar to the Algebra of Graphics package, Gadfly is also inspired by R's #ggplot2 package and The Grammar of Graphics book 🧵👇🏼
#julialang #dataviz #RStats
Like ggplot2, Gadfly uses geometries (or geoms) to draw the input data with a given representation (e.g., point, line, bar, etc.).
In addition to the default static plot, the package supports an interactive mode with #JavaScript code.
Feb 5 • 5 tweets • 4 min read
Grammar of graphics with Julia!
Algebra of Graphics is an extension of Makie, a Julia package for data visualization. This library supports the grammar-of-graphics plotting style inspired by R's ggplot2 package. 🧵 👇🏼
If you are new to #R or planning to learn R, I highly recommend checking out Prof. @ajay_kolii's course "R For Beginners". The course was organized by Vishwakarma University, Pune, India. 🧵 👇🏼
#RStats #OpenSource #DataScientists #Datavisualization
The course covers the following topics:
✅ Basics of R & RStudio
✅ Dynamic Documents using R Markdown
✅ Data Visualisation using ggplot2
✅ Data Wrangling using dplyr
✅ Slide Crafting using xaringan
Jan 30 • 6 tweets • 4 min read
Data visualization with Julia! ❤️❤️❤️
This week I learned about Makie, a data visualization ecosystem for the Julia programming language. 🧵
License: MIT
Animation credit: @LazarusAlon #julialang #dataviz #DataScience
This ecosystem includes multiple packages providing a variety of 2D and 3D plotting tools, with GPU support for both interactive and non-interactive plots, animation, and other data visualization applications.
Jan 22 • 8 tweets • 5 min read
Amazon Deep Learning Book!
If you are looking for a resource to learn Deep Learning, I recommend checking out the Dive into Deep Learning book created by @Amazon scientists - Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola (main authors). 🧵 👇🏼 [1/n] #DeepLearning
What?
It focuses on the foundations of DL, from basic linear neural networks to complex modeling. It covers the math behind them while illustrating the functionality with interactive examples implemented with Python libraries such as @ApacheMXNet, @PyTorch, and @TensorFlow.
Jan 20 • 6 tweets • 4 min read
Responsible Machine Learning book!
It's not every day that you get to see such a creative and artistic #DataScience book 🤯. The Hitchhiker’s Guide to Responsible Machine Learning is an educational comic on Responsible Machine Learning with #R 👇🏼🧵
#ML #rstats
This beautiful book was created by Przemyslaw Biecek, Anna Kozak, and Aleksander Zawada.
The code in the book is written in #RStats; the code snippets are also available as R Markdown (links below 👇🏼)
Jan 20 • 5 tweets • 4 min read
Are you looking to start with Kubeflow? Check this step-by-step installation guide for Windows by Ashish Patel 🧵👇🏼
#MLOps #ML #Kubernetes
Kubeflow is an open-source #MLOps tool that provides toolkits for deploying machine learning workflows on #Kubernetes. It supports core data science tools such as Jupyter notebooks, model training with frameworks such as #TensorFlow, #PyTorch, and #XGBoost, pipeline setup, etc.
The new release, version 1.1, includes the following new features and changes: 👇🏼
Jan 16 • 5 tweets • 6 min read
H2O new release!
This week, H2O had a major release of their open-source ML library for #R and #Python, introducing two new algorithms, improvements, and bug fixes. ❤️👇🏼 🧵
Did you know that the An Introduction to Statistical Learning (ISLR) book has an online course? ❤️ @edXOnline is offering an online course by @Stanford University, following the book curriculum: 🧵 👇🏼
New book for Bayesian statistics with #Python!
Bayesian Modeling and Computation in Python, by @aloctavodia, @canyon289, and @junpenglao, provides an introduction to Bayesian statistics using core Python Bayesian libraries. 🧵 👇🏼
#bayesian #MachineLearning #stats
The book covers the following topics:
- Bayesian inference concepts
- Bayesian regression methods for linear regression and splines
- Time series #forecasting
- Bayesian additive regression trees
- Approximate Bayesian computation
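As a rough illustration of the last topic in the list above, here is a minimal sketch of approximate Bayesian computation (ABC) by rejection sampling, written with only the Python standard library. It is not code from the book; the function name `abc_rejection`, its parameters, and the coin-flip setup are assumptions chosen for illustration. The idea is to draw a parameter from the prior, simulate data with it, and keep the draw only when the simulated data is close enough to the observed data.

```python
import random


def abc_rejection(observed_heads, n_flips, n_draws=20000, tolerance=0, seed=0):
    """Rejection ABC for a coin's bias with a Uniform(0, 1) prior.

    Hypothetical helper for illustration only.
    Returns the accepted draws, which approximate the posterior of the bias.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw a candidate bias from the prior
        # Simulate n_flips coin flips with this bias and count the heads.
        simulated_heads = sum(rng.random() < theta for _ in range(n_flips))
        # Keep the candidate if the simulation matches the observation closely.
        if abs(simulated_heads - observed_heads) <= tolerance:
            accepted.append(theta)
    return accepted


samples = abc_rejection(observed_heads=7, n_flips=10)
posterior_mean = sum(samples) / len(samples)
```

With 7 heads in 10 flips and a uniform prior, the exact posterior is Beta(8, 4) with mean 2/3, so `posterior_mean` should land close to that value. A larger `tolerance` accepts more draws at the cost of a coarser approximation.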