Rami Krispin
Data science and engineering manager at Apple | #rstats & #Julialang | 📦 dev | ❤️ time-series analysis & forecasting | Author. Opinions are my own
Mar 21 • 6 tweets • 3 min read
NeuralForecast - a new #Python library for time series forecasting with deep learning by @nixtlainc.
The package provides tools and applications for preprocessing time series data, statistical tests, modeling, #forecasting, and performance benchmarking. 🧵 👇🏼
#machinelearning

The package uses PyTorch and PyTorch Lightning on the backend.

The package is at a fairly early stage; the first release, version 0.0.7, was two days ago. 🌈

License: GPL-3
#PyTorch #opensource
Mar 17 • 5 tweets • 4 min read
New release of Deepchecks! 🚀🚀

Deepchecks is a #Python library for #MLOps applications. It provides functions and applications for data integrity, machine learning model validation, and performance evaluation.
V0.5.0 key functionality thread 👇🏼

#MachineLearning #DataScience

Version 0.5.0 main feature - integration with the Weights & Biases #MLOps platform. This new functionality enables exporting tests and other validation outputs created with the package to the W&B platform with the .to_wandb function 👇🏼
Mar 16 • 6 tweets • 3 min read
New release of the PyCaret time series module! 🚀

PyCaret is a #Python package for low-code ML applications, supporting supervised and unsupervised machine learning applications and time series forecasting models. New features: 🧵👇🏼

#TimeSeries #DataScience #MachineLearning

The recent pycaret-ts-alpha release includes (1/3):
✅ Support for univariate forecasting with exogenous variables ❤️
✅ Croston model added for intermittent demand
✅ Support for multiple seasonal periods in models that support it (e.g., TBATS)
✅ Difference plots with diagnostics
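
For readers new to the Croston model mentioned in the list above, here is a minimal sketch of the textbook method in plain Python. It is not PyCaret's implementation: the function name `croston_forecast`, the default `alpha`, and the choice to initialise from the first non-zero demand are all assumptions made for illustration.

```python
def croston_forecast(demand, alpha=0.1):
    """Return a per-period demand-rate forecast for an intermittent series.

    Hypothetical helper for illustration only (not a PyCaret function).
    Croston's idea: smooth the non-zero demand sizes and the intervals
    between them separately, then forecast size / interval.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    size = None               # smoothed size of non-zero demands
    interval = None           # smoothed number of periods between demands
    periods_since_demand = 1  # counter reset every time demand occurs

    for value in demand:
        if value > 0:
            if size is None:
                # Assumption: initialise both components from the first demand.
                size = float(value)
                interval = float(periods_since_demand)
            else:
                size = alpha * value + (1 - alpha) * size
                interval = alpha * periods_since_demand + (1 - alpha) * interval
            periods_since_demand = 1
        else:
            periods_since_demand += 1

    if size is None:
        return 0.0  # no demand was ever observed
    return size / interval


if __name__ == "__main__":
    print(croston_forecast([0, 0, 4, 0, 0, 4], alpha=0.5))
```

In the example series, demand of 4 units arrives every third period, so the smoothed size stays at 4, the smoothed interval stays at 3, and the forecast is a rate of 4/3 units per period.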
Mar 15 • 4 tweets • 3 min read
Deep Learning in Pharma! 🚀🚀🚀

AstraZeneca recently released ChemicalX - an open-source Python library for deep learning applications for drug pair scoring. 🧵 👇🏼

#python #PyTorch #pharma #DeepLearning #datascience

The package uses PyTorch and TorchDrug on the backend (see links 👇🏼) for applications such as drug-drug interaction, polypharmacy side effects, and synergy prediction.

License: Apache 2.0 🌈
Feb 9 • 8 tweets • 5 min read
Geographic Data Science with R! 🚀🚀🚀

The R for Geographic Data Science book, by @maps4thought, provides an introduction to data science with applications for geographic data. 🧵👇

Images credit: from the book
#rstats #DataScientists #MachineLearning #Stats

The book follows the syllabus of the "R for Data Science" course at the School of Geography, @uniofleicester. The book is still a work in progress, and a draft version is available online.
Feb 8 • 6 tweets • 4 min read
Bayesian Generalized Linear Models with Julia! 🚀🚀🚀

TuringGLM is a new Julia package for GLMs with a Bayesian flavor ❤️. As its name implies, the package uses the Turing package on the backend as the regression engine. 🧵 👇🏼

#JuliaLang #TuringLang #DataScience #Stats

It lets you specify Bayesian Generalized Linear Models using the formula syntax and returns an instantiated Turing model.
The package is inspired by R's brms and Python's bambi packages (see links 👇🏼).

#rstats #Python
Feb 7 • 5 tweets • 3 min read
New book for NLP! 🚀🚀🚀

The 3rd edition of Speech and Language Processing, by Prof. Dan Jurafsky (@jurafsky) and Prof. James H. Martin, is now available online (draft version). 🧵 👇🏼

#NLP #DataScience #DeepLearning #MachineLearning
Images credit: from the book

The book covers core NLP topics, including the following (1/2):
✅ Text preprocessing - regular expressions, text normalization, N-gram models
✅ Sentiment analysis and classification methods
✅ Constituency grammars and parsing
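
As a quick illustration of the N-gram models listed above, here is a minimal bigram sketch in plain Python. It is not taken from the book: the function names, the whitespace tokenizer, and the `<s>` / `</s>` boundary markers are assumptions, and the probabilities are plain maximum-likelihood estimates with no smoothing.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each token follows another (hypothetical helper)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts


def bigram_probability(counts, prev, curr):
    """Maximum-likelihood estimate of P(curr | prev); 0.0 if prev is unseen."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[curr] / total if total else 0.0


if __name__ == "__main__":
    model = train_bigram_model(["I like tea", "I like coffee"])
    print(bigram_probability(model, "i", "like"))
    print(bigram_probability(model, "like", "tea"))
```

With the two toy sentences, "like" always follows "i", so that probability is 1.0, while "tea" follows "like" in one of two cases, giving 0.5.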
Feb 6 • 6 tweets • 4 min read
Grammar of graphics with Julia (part 2)! 🌈

Gadfly is another Julia package that follows the grammar of graphics. Similar to the AlgebraOfGraphics package, Gadfly is also inspired by R's #ggplot2 package and The Grammar of Graphics book 🧵👇🏼

#julialang #dataviz #RStats

Like ggplot2, Gadfly uses geometries (geoms) to draw the input data with a given representation (e.g., point, line, bar).

In addition to the default static plot, the package supports an interactive mode with #JavaScript code.
Feb 5 • 5 tweets • 4 min read
Grammar of graphics with Julia!

AlgebraOfGraphics is an extension of Makie - a Julia package for data visualization. This library supports the grammar of graphics plotting style inspired by R's ggplot2 package. 🧵 👇🏼

#julialang #dataviz #rstats #ggplot2 #DataScience

Similar to how ggplot2 uses '+', AlgebraOfGraphics uses the '+' and '*' operators to combine different layers in a plot.

Package philosophy ➡️ juliaplots.org/AlgebraOfGraph…
Jan 31 • 4 tweets • 3 min read
R For Beginners! 🚀🚀🚀

If you are new to #R or planning to learn R, I highly recommend checking out Prof. @ajay_kolii's course "R For Beginners". The course was organized by Vishwakarma University - Pune, India. 🧵 👇🏼

#RStats #OpenSource #DataScientists #Datavisualization

The course covers the following topics:
✅ Basics of R & RStudio
✅ Dynamic Documents using R Markdown
✅ Data Visualisation using ggplot2
✅ Data Wrangling using dplyr
✅ Slide Crafting using xaringan
Jan 30 • 6 tweets • 4 min read
Data visualization with Julia! ❤️❤️❤️

This week I learned about Makie, a data visualization ecosystem for the Julia programming language. 🧵👇

License: MIT 🌈

Animation credit: @LazarusAlon
#julialang #dataviz #DataScience

This ecosystem includes multiple packages providing a variety of 2D and 3D plotting tools 🌈, with GPU support for both interactive and non-interactive plots, animation, and other data visualization applications 🚀.
Jan 22 • 8 tweets • 5 min read
Amazon Deep Learning Book! 📚📊🚀

If you are looking for a resource to learn Deep Learning, I recommend checking out the Dive into Deep Learning book created by @Amazon scientists - Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola (main authors). 🧵 👇🏼 [1/n]
#DeepLearning

What?
It focuses on the foundations of DL, from the basic linear neural network to complex modeling. It covers the math behind it while illustrating the functionality with interactive examples implemented with Python libraries such as @ApacheMXNet, @PyTorch, and @TensorFlow
Jan 20 • 6 tweets • 4 min read
Responsible Machine Learning book! 🦄📚

It's not every day you get to see such a creative and artistic #DataScience book 🤯. The Hitchhiker’s Guide to Responsible Machine Learning is an educational comic in the area of Responsible Machine Learning with #R 👇🏼🧵

#ML #rstats

This beautiful book was created by Przemyslaw Biecek, Anna Kozak, and Aleksander Zawada.

The code in the book is in #RStats; code snippets are also available in R Markdown (links below 👇🏼)
Jan 20 • 5 tweets • 4 min read
Are you looking to start with Kubeflow 🌈? Check this step-by-step installation guide for Windows by Ashish Patel 🧵👇🏼

github.com/ashishpatel26/…

#MLOps #ML #Kubernetes

Kubeflow is an open-source #MLOps tool that provides toolkits for the deployment of machine learning workflows on #Kubernetes 🚀. It supports core data science tools such as Jupyter notebooks, training ML models with frameworks such as #TensorFlow, #PyTorch, and #XGBoost, setting up pipelines, etc.
Jan 17 • 8 tweets • 8 min read
New release of @Uber's forecasting library 🌈

Orbit is an open-source #Python library for Bayesian time series forecasting and inference applications developed by @UberEng 🧵 👇🏼

#timeseries #forecast #MachineLearning #PyTorch #Bayesian #Bayes

Under the hood, the library uses probabilistic programming languages with #Python libraries such as @mcmc_stan, Pyro, and #PyTorch to build the forecast estimators.

The new release, version 1.1, includes the following new features and changes: 👇🏼
Jan 16 • 5 tweets • 6 min read
H2O new release! 🚀🚀🚀

This week, H2O had a major release of their open-source ML library for #R and #Python, introducing two new algorithms, improvements, and bug fixes. ❤️👇🏼 🧵

#MachineLearning #ML #DeepLearning #rstats #DataScience #DataScientists

New algorithm (1/2):
✨ Distributed Uplift Random Forest (Uplift DRF) - The Uplift DRF is a tree-based algorithm that uses a Random Forest classifier to estimate a treatment's incremental impact. See the demo in the notebook ⬇️
github.com/h2oai/h2o-3/bl…
#randomforest #ML #UpLift
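
To make "a treatment's incremental impact" concrete, here is a minimal sketch of the simplest uplift estimate: the difference in response rate between a treated group and a control group. This is not H2O's Uplift DRF algorithm, which estimates this kind of quantity with trees; the function name `average_uplift` and the `(treated, converted)` record layout are assumptions made for illustration.

```python
def average_uplift(records):
    """Estimate uplift as treated response rate minus control response rate.

    Hypothetical helper for illustration only (not part of the H2O API).
    `records` is a sequence of (treated, converted) boolean pairs.
    """
    records = list(records)
    treated = [converted for was_treated, converted in records if was_treated]
    control = [converted for was_treated, converted in records if not was_treated]
    if not treated or not control:
        raise ValueError("need at least one treated and one control record")

    treated_rate = sum(treated) / len(treated)
    control_rate = sum(control) / len(control)
    return treated_rate - control_rate


if __name__ == "__main__":
    data = [
        (True, True), (True, True), (True, False), (True, False),
        (False, True), (False, False), (False, False), (False, False),
    ]
    print(average_uplift(data))
```

In the toy data, 2 of 4 treated customers convert versus 1 of 4 in the control group, so the estimated uplift is 0.25. An uplift model aims to estimate this difference per individual or per segment rather than as a single average.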
Jan 13 • 9 tweets • 6 min read
Did you know that the An Introduction to Statistical Learning (ISLR) book has an online course? 🎥 🌈❤️
@edXOnline is offering an online course by @Stanford University, following the book curriculum: 🧵 👇🏼

#rstats #Statistics #ML #datascience

The course instructors are two of the book authors - Prof. Trevor Hastie and Prof. @robtibshirani. While the book is based on #R, some awesome people have translated it to #python, #julialang, and other #Rstats flavors (see links in the comments below 👇🏼).
Jan 11 • 6 tweets • 5 min read
New book for Bayesian statistics with #Python! 📚📊🚀
The Bayesian Modeling and Computation in Python book by @aloctavodia, @canyon289, and @junpenglao provides an introduction to Bayesian statistics using core Python libraries for Bayesian modeling 🧵 👇🏼

#bayesian #MachineLearning #stats

The book covers the following topics:
- Bayesian inference concepts
- Bayesian regression methods for linear regression and splines
- Time series #forecasting
- Bayesian additive regression trees
- Approximate Bayesian computation
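
As a small taste of the Bayesian inference concepts in the list above, here is a minimal sketch of a conjugate Beta-Binomial update in plain Python. It is not taken from the book and uses none of its libraries; the function names and the uniform Beta(1, 1) prior in the example are assumptions made for illustration.

```python
def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Return the Beta posterior parameters after observing binomial data.

    Hypothetical helper for illustration only. With a Beta(a, b) prior on a
    success probability and k successes in n trials, the posterior is
    Beta(a + k, b + n - k).
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


if __name__ == "__main__":
    # Assumption: uniform Beta(1, 1) prior, then 7 successes in 10 trials.
    post_a, post_b = beta_binomial_update(1, 1, successes=7, trials=10)
    print(post_a, post_b, beta_mean(post_a, post_b))
```

Starting from the uniform prior, 7 successes in 10 trials gives a Beta(8, 4) posterior with mean 8/12, slightly below the raw rate of 0.7 because the prior pulls the estimate toward 0.5.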