Mar 21 • 6 tweets • 3 min read

The package provides tools and applications for preprocessing time series data, statistical tests, modeling, #forecasting, and performance benchmarking. 🧵

#machinelearning

The package uses PyTorch and PyTorch Lightning on the backend.

The package is at a fairly early stage: the first release, version 0.0.7, came out two days ago.

License: GPL-3

#PyTorch #opensource

Mar 17 • 5 tweets • 4 min read

Deepchecks is a #Python library for #MLOps applications. It provides functions and applications for data integrity checks, machine learning model validation, and performance evaluation.

V0.5.0 key functionality thread

#MachineLearning #DataScience

Version 0.5.0 main feature: integration with the Weights & Biases #MLOps platform. This new functionality exports test results and other validation outputs created with the package to the W&B platform through the .to_wandb function.

Mar 16 • 6 tweets • 3 min read

PyCaret is a #Python package for low-code ML applications, supporting supervised and unsupervised machine learning and time series forecasting models. New features: 🧵

#TimeSeries #DataScience #MachineLearning

The recent pycaret-ts-alpha release includes (1/3):

- Support for univariate forecasting with exogenous variables ❤️

- Croston model added for intermittent demand

- Support for multiple seasonal periods in models that support it (e.g. TBATS)

- Difference plots with diagnostics
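For readers new to intermittent demand, here is a minimal plain-Python sketch of the idea behind Croston's method. It is not PyCaret's implementation: the function name `croston_forecast` and the default `alpha` are assumptions made for this example. The method smooths the size of non-zero demands and the gap between them separately, then forecasts their ratio as the expected demand per period.

```python
def croston_forecast(demand, alpha=0.1):
    """Return a Croston-style per-period demand forecast (illustrative sketch)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    size = None        # smoothed size of the non-zero demands
    interval = None    # smoothed number of periods between non-zero demands
    periods_since = 1  # periods elapsed since the last non-zero demand

    for d in demand:
        if d > 0:
            if size is None:
                # Initialise both estimates at the first non-zero demand.
                size, interval = float(d), float(periods_since)
            else:
                # Exponential smoothing, updated only when demand occurs.
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1

    if size is None:
        return 0.0  # no demand observed at all
    return size / interval


# A demand of 6 units every third period averages 2 units per period.
print(croston_forecast([0, 0, 6, 0, 0, 6, 0, 0, 6]))  # 2.0
```

Because the estimates only change in periods with demand, long runs of zeros do not drag the forecast toward zero the way simple exponential smoothing would.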

Mar 15 • 4 tweets • 3 min read

AstraZeneca recently released ChemicalX, an open-source Python library for deep learning applications for drug pair scoring. 🧵

#python #PyTorch #pharma #DeepLearning #datascience

The package uses PyTorch and TorchDrug on the backend (see links) for applications such as drug-drug interaction, polypharmacy side effects, and synergy prediction.

License: Apache 2.0

Feb 9 • 8 tweets • 5 min read

The book R for Geographic Data Science, by @maps4thought, provides an introduction to data science with applications for geographic data. 🧵

Images credit: from the book

#rstats #DataScientists #MachineLearning #Stats

The book follows the syllabus of the "R for Data Science" course at the School of Geography, @uniofleicester. The book is still a work in progress, and a draft version is available online.

Feb 8 • 6 tweets • 4 min read

TuringGLM is a new Julia package for GLMs with a Bayesian flavor ❤️. As its name implies, the package uses the Turing package on the backend as the regression engine. 🧵

#JuliaLang #TuringLang #DataScience #Stats

It lets you specify Bayesian generalized linear models using formula syntax and returns an instantiated Turing model.

The package is inspired by R's brms and Python's bambi packages (see links).

#rstats #Python

Feb 7 • 5 tweets • 3 min read

The 3rd edition of Speech and Language Processing, by Prof. Dan Jurafsky (@jurafsky) and Prof. James H. Martin, is now available online (draft version). 🧵

#NLP #DataScience #DeepLearning #MachineLearning

Images credit: from the book

The book covers core NLP topics, including (1/2):

- Text preprocessing: regular expressions, text normalization, N-gram models

- Sentiment analysis and classification methods

- Constituency grammars and parsing

Feb 6 • 6 tweets • 4 min read

Gadfly is another Julia package that follows the grammar of graphics. Similar to the AlgebraOfGraphics package, Gadfly is inspired by R's #ggplot2 package and The Grammar of Graphics book. 🧵

#julialang #dataviz #RStats

Like ggplot2, Gadfly uses geometries (geoms) to draw the input data with a chosen representation (e.g., point, line, bar).

In addition to the default static plot, the package supports interactive mode with #JavaScript code.

Feb 5 • 5 tweets • 4 min read

AlgebraOfGraphics is an extension of Makie, a Julia package for data visualization. The library supports a grammar-of-graphics plotting style inspired by R's ggplot2 package. 🧵

#julialang #dataviz #rstats #ggplot2 #DataScience

Similar to how ggplot2 adds layers with '+', AlgebraOfGraphics uses the '+' and '*' operators to combine layers in a plot.

Package philosophy ➡️ juliaplots.org/AlgebraOfGraph…
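To give a feel for how '+' and '*' can combine layers, here is a toy Python model of such an algebra. It is not the AlgebraOfGraphics API (which is Julia): the `Layers` class, the `layer` helper, and the column names are hypothetical, made up for this sketch. It assumes the semantics described in the package documentation, where '*' merges the settings of two layers and '+' stacks layers on top of each other, so '*' distributes over '+'.

```python
class Layers:
    """An ordered collection of layer specifications (plain dicts)."""

    def __init__(self, specs):
        self.specs = [dict(spec) for spec in specs]

    def __add__(self, other):
        # '+' stacks layers: the result holds both sets of layers.
        return Layers(self.specs + other.specs)

    def __mul__(self, other):
        # '*' merges every layer on the left with every layer on the right.
        return Layers([{**a, **b} for a in self.specs for b in other.specs])


def layer(**spec):
    """Build a single-layer collection from keyword settings."""
    return Layers([spec])


# Shared data and mapping, multiplied by two stacked visuals.
plot = (
    layer(data="penguins")
    * layer(x="bill_length", y="bill_depth")
    * (layer(visual="scatter") + layer(visual="line"))
)

for spec in plot.specs:
    print(spec)
```

The loop prints two complete layer specifications, one per visual, each carrying the shared data and axis mapping. Writing the shared parts once and multiplying them into several visuals is the convenience the two operators buy.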

Jan 31 • 4 tweets • 3 min read

If you are new to #R or planning to learn R, I highly recommend checking out Prof. @ajay_kolii's course "R For Beginners". The course was organized by Vishwakarma University in Pune, India. 🧵

#RStats #OpenSource #DataScientists #Datavisualization

The course covers the following topics:

- Basics of R & RStudio

- Dynamic Documents using R Markdown

- Data Visualisation using ggplot2

- Data Wrangling using dplyr

- Slide Crafting using xaringan

Jan 30 • 6 tweets • 4 min read

This week I learned about Makie, a data visualization ecosystem for the Julia programming language. 🧵

License: MIT

Animation credit: @LazarusAlon

#julialang #dataviz #DataScience

This ecosystem includes multiple packages that provide a variety of 2D and 3D plotting tools, with GPU support for both interactive and non-interactive plots, animations, and other data visualization applications.

Jan 22 • 8 tweets • 5 min read

If you are looking for a resource to learn Deep Learning, I recommend checking out the Dive into Deep Learning book created by @Amazon scientists Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola (main authors). 🧵 [1/n]

#DeepLearning

What?

It focuses on the foundations of DL, from basic linear neural networks to complex models. It covers the math behind them while illustrating the functionality with interactive examples implemented with Python libraries such as @ApacheMXNet, @PyTorch, and @TensorFlow.

Jan 20 • 6 tweets • 4 min read

It's not every day you get to see such a creative and artistic #DataScience book 🤯. The Hitchhiker's Guide to Responsible Machine Learning is an educational comic on Responsible Machine Learning with #R 🧵

#ML #rstats

This beautiful book was created by Przemyslaw Biecek, Anna Kozak, and Aleksander Zawada.

The code in the book is written in #RStats, and the code snippets are also available as R Markdown (links below).

Jan 20 • 5 tweets • 4 min read

github.com/ashishpatel26/…

#MLOps #ML #Kubernetes

Kubeflow is an open-source #MLOps tool that provides toolkits for deploying machine learning workflows on #Kubernetes. It supports core data science tools such as Jupyter notebooks, training ML models with frameworks such as #TensorFlow, #PyTorch, and #XGBoost, setting up pipelines, etc.

Jan 17 • 8 tweets • 8 min read

Orbit is an open-source #Python library for Bayesian time series forecasting and inference applications developed by @UberEng 🧵

#timeseries #forecast #MachineLearning #PyTorch #Bayesian #Bayes

Under the hood, the library uses probabilistic programming languages with #Python libraries such as @mcmc_stan, Pyro, and #PyTorch to build the forecast estimators.

The new release, version 1.1, includes the following new features and changes:

Jan 16 • 5 tweets • 6 min read

This week, H2O had a major release of their open-source ML library for #R and #Python, introducing two new algorithms, improvements, and bug fixes. ❤️ 🧵

#MachineLearning #ML #DeepLearning #rstats #DataScience #DataScientists

New algorithm (1/2):

✨ Distributed Uplift Random Forest (Uplift DRF) - The Uplift DRF is a tree-based algorithm that uses a Random Forest classifier to estimate a treatment's incremental impact. See the demo in the notebook ⬇️

github.com/h2oai/h2o-3/bl…

#randomforest #ML #UpLift
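To make "incremental impact" concrete, here is a minimal plain-Python sketch of the quantity an uplift model tries to estimate: the response rate of the treated group minus the response rate of the control group. This is not H2O's Uplift DRF algorithm, which learns such differences with trees; the `uplift` function and the toy records are assumptions made for this example.

```python
def uplift(records):
    """Return the treated response rate minus the control response rate.

    `records` is an iterable of (treated, converted) boolean pairs.
    """
    treated = [converted for was_treated, converted in records if was_treated]
    control = [converted for was_treated, converted in records if not was_treated]
    if not treated or not control:
        raise ValueError("need at least one treated and one control record")
    return sum(treated) / len(treated) - sum(control) / len(control)


# Toy data: 2 of 4 treated customers convert, 1 of 4 control customers convert.
records = [
    (True, True), (True, True), (True, False), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
print(uplift(records))  # 0.25
```

A tree-based uplift model computes this kind of difference within segments of the data, looking for the groups where the treatment changes behavior the most.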

Jan 13 • 9 tweets • 6 min read

@edXOnline is offering an online course by @Stanford University, following the book curriculum: 🧵

#rstats #Statistics #ML #datascience

The course instructors are two of the book's authors, Prof. Trevor Hastie and Prof. @robtibshirani. While the book is based on #R, some awesome people have translated it to #python, #julialang, and other #Rstats flavors (see links in the comments below).

Jan 11 • 6 tweets • 5 min read

Bayesian Modeling and Computation in Python, by @aloctavodia, @canyon289, and @junpenglao, provides an introduction to Bayesian statistics using core Python Bayesian libraries 🧵

#bayesian #MachineLearning #stats

The book covers the following topics:

- Bayesian inference concepts

- Bayesian regression methods: linear regression, splines

- Time series #forecasting

- Bayesian additive regression trees

- Approximate Bayesian computation