Discover and read the best of Twitter Threads about #MachineLearning

Most recent (24)

"As the world prepares to invest trillions of dollars in quelling climate change, we must focus on a just transition by ensuring that climate solutions also uplift those who are most impacted." -- Yusuf Jameel, Project Drawdown…
Forging Venezianita: La Serenissima’s Tortured Relationship with its Byzantine DNA…
#VenetianHistory, #AcademicStudy, #ByzantineBackground
Shifts in regional water availability due to global tree restoration | Nature Geoscience…
#TreeRestoration, #WaterAvailability, #RegionalShifts, #HydrologicalEffects
Read 13 tweets
Can computational methods assess the #sentiment of complex texts? In an article with my fabulous colleagues from the @WZB_Berlin, @unipotsdam @LMU_Muenchen @JungeAkademie, we answer this question by applying #dictionary and #scaling methods on a sample of #literature reviews🧵
The article is also a practical #guide to help researchers select an appropriate #method and degree of preprocessing for their own data. Our #corpus consisted of 6,041 summaries of reviews of contemporary German #literature acquired from @perlentaucher00 (2/9)
#Literature reviews differ from other texts in their linguistic #complexity: features of their #language such as ambiguity, irony, and metaphor are comparatively difficult to capture with #computational approaches. So how did our different methods fare? (3/9)
Read 9 tweets
"Poetry...cannot be copied. It’s an artifact of introspection that can only be mastered by our species. There is no superhuman way to write poems because we write them by virtue of being what a computer isn’t: human." -- Carmine Starnino @cstarnino…
Origin of life theory involving RNA–protein hybrid gets new support…
#LifeOrigins, #RNAProtein, #AminoAcids
Poetry & digital personhood by Carmine Starnino | The New Criterion…
#ArtificialIntelligence, #MachineLearning, #DigitalPersonhood, #ArtFuture
Read 13 tweets
What is Feature Engineering for Machine Learning?

A thread 🧵

Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work.
If feature engineering is done correctly, it increases the predictive power of machine learning algorithms by creating features from raw data that help facilitate the machine learning process.
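As a rough illustration of the idea above, here is a minimal sketch that derives a few features from raw records using domain knowledge. The record fields (`timestamp`, `amount`), the helper `engineer_features`, and the chosen features are all hypothetical and not taken from the thread.

```python
import math
from datetime import datetime


def engineer_features(record):
    """Turn one raw record into model-ready features (hypothetical example)."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        # Time of day often matters more than the raw timestamp itself.
        "hour": ts.hour,
        # weekday() returns 5 for Saturday and 6 for Sunday.
        "is_weekend": ts.weekday() >= 5,
        # log1p compresses a skewed monetary amount.
        "log_amount": math.log1p(record["amount"]),
    }


# Made-up raw data, for illustration only.
raw = [
    {"timestamp": "2022-04-09T23:15:00", "amount": 120.0},
    {"timestamp": "2022-04-11T09:30:00", "amount": 15.5},
]
features = [engineer_features(r) for r in raw]
```

Which features help depends entirely on the problem; the point is only that the model sees `hour` and `is_weekend` instead of an opaque timestamp string.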
Read 15 tweets
- A molecular solar-energy storage system that can release the stored energy as electricity ... great potential
- #PaLM, #Google's new #AI model: explaining jokes, fixing computer code… nothing is beyond it. Behind this performance: 540 billion parameters and $9M for training…
Read 7 tweets
New article on #websites #classification discussing possible #taxonomy that can be used (IAB, Google, Facebook, etc.) as well as #machinelearning models:…

List of useful resources:
A new Telegram channel where I will post about #explainableai (#XAI for short):…
There are now many useful libraries and techniques available for #explainability of #AI models: SHAP, LIME, and partial dependence plots (PDP), as well as "classical" feature importance.
Our German blog on the topic of website #categorizations:
Read 6 tweets
Day-4 🧵: Avoid overfitting with Early Stopping callback! 🚀

@PyTorchLightnin Early stopping callback automatically stops training once it detects that there is no improvement in the monitored metrics (for example validation accuracy). ⚡️

1/3
It provides additional parameters that stop training at extreme points:

1️⃣ stopping_threshold
Stop training if the monitored metric has reached this threshold.

2️⃣ divergence_threshold
Stop training if the monitored metric is worse than this threshold.

3️⃣ check_finite
This will check if your monitored metric is NaN or infinite.

4️⃣ check_on_train_epoch_end
It checks the metric at the end of the training epoch instead of the validation epoch.

#MachineLearning #45DaysOfLightning #DataScience #Python #PyTorchLightning
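
To make the stopping rules above concrete, here is a plain-Python sketch of the decision logic. It is not PyTorch Lightning's implementation: the function `early_stop_reason` is hypothetical, `patience`, `min_delta` and `mode` are added for illustration, and "reached" is assumed to mean "at least as good as" the threshold. `check_on_train_epoch_end` is not modelled because it only changes when the check runs.

```python
import math


def early_stop_reason(history, mode="min", patience=3, min_delta=0.0,
                      stopping_threshold=None, divergence_threshold=None,
                      check_finite=True):
    """Return (epoch, reason) for the first stop, or (None, "completed").

    `history` is the monitored metric, one value per check.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    # Flip the sign for "min" so that a larger score is always better.
    sign = 1.0 if mode == "max" else -1.0
    best = -math.inf
    stale = 0  # consecutive checks without improvement
    for epoch, value in enumerate(history):
        if check_finite and not math.isfinite(value):
            return epoch, "non-finite"
        score = sign * value
        if stopping_threshold is not None and score >= sign * stopping_threshold:
            return epoch, "stopping_threshold"
        if divergence_threshold is not None and score < sign * divergence_threshold:
            return epoch, "divergence_threshold"
        if score > best + min_delta:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, "patience"
    return None, "completed"


# Validation loss stops improving after the second check.
result = early_stop_reason([0.9, 0.8, 0.85, 0.86, 0.87], mode="min", patience=3)
```

In real training code the callback itself would be used; this sketch only shows how the thresholds and the finiteness check can short-circuit the usual patience rule.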

Read 4 tweets
If a Data Scientist has a GitHub with 3 Python projects, you don't need to give them a technical interview. If they've been working as a Data Scientist for 3+ years, they don't need a take-home project.
#DataScience #MachineLearning #Hiring
Do they have a blog with 1 or 2 years worth of posts on Machine Learning Engineering? Published research? A YouTube channel with tons of Data Science educational content? Significant open source contributions?
#DataScience #MachineLearning #Hiring
I get a better sense of a candidate's capabilities from those sources. In my experience, the generic methods have lower predictive value for employee performance.
#DataScience #MachineLearning #Hiring
Read 9 tweets
What is the difference between a predictive problem and a causal inference problem? This is an essential differentiation for data scientists, and even very smart people botch the answer. 2 very smart authors did just that. Let me explain.
#DataScience #MachineLearning
They proposed 2 questions:

1. Should I hire more college graduates?
2. Should I subsidize college degrees for my employees?
#DataScience #MachineLearning
In the article, they said question 1 is a prediction problem and question 2 is a causal inference problem. They are both causal inference problems because they ask the data scientist to prescribe a policy.
#DataScience #MachineLearning
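A small simulation can illustrate why a policy question needs causal reasoning. It is a sketch under invented assumptions, not the authors' data or model: a hidden `ability` variable drives both holding a degree and job performance, and the degree has no causal effect by construction. All names and numbers are hypothetical.

```python
import random
from statistics import mean


def simulate(n, randomize_degree, seed):
    """Return mean performance of degree holders minus non-holders."""
    rng = random.Random(seed)
    with_degree, without_degree = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)  # hidden confounder
        if randomize_degree:
            # Intervention: the degree is assigned by coin flip.
            has_degree = rng.random() < 0.5
        else:
            # Observation: higher ability makes a degree more likely.
            has_degree = ability + rng.gauss(0, 1) > 0
        # Performance depends on ability only; the degree adds nothing.
        performance = ability + rng.gauss(0, 1)
        (with_degree if has_degree else without_degree).append(performance)
    return mean(with_degree) - mean(without_degree)


observed_gap = simulate(20000, randomize_degree=False, seed=0)
intervention_gap = simulate(20000, randomize_degree=True, seed=0)
```

In this toy world `observed_gap` is clearly positive, so a degree predicts performance, while `intervention_gap` stays close to zero, so handing out degrees changes nothing. Any question that prescribes an action depends on the second quantity, not the first.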
Read 13 tweets
Data Scientist Job Openings On LinkedIn:
March - 138K
Now - 134K

Hiring is slowing for mid to junior-level roles. That's the first sign of tightening budgets and more changes will come quickly. Let me explain what comes next.
#DataScience #MachineLearning #Leadership
Higher costs are compressing margins for businesses across industries. Revenue growth has stagnated. Both factors mean businesses must find ways to cut costs or they are in danger.
#DataScience #MachineLearning #Leadership
Missing on revenue projections or lowering guidance for the rest of the year is a death sentence for share prices. The C Suite is measured by share price so they're moving quickly to cut costs.
#DataScience #MachineLearning #Leadership
Read 14 tweets
New hiring rules. Any test given to a candidate has to be taken by the existing team, and 80% of them have to pass it.

#DataScience #MachineLearning #Hiring
If the job description asks for a minimum of 5 years of experience, it needs to include an explanation of why 4 years isn’t enough.
#DataScience #MachineLearning #Hiring
After 2 rounds of interviews, the company needs to explain what additional information they expect to get from this round and why they didn’t get it during the last round.
#DataScience #MachineLearning #Hiring
Read 11 tweets
"No matter how outrageous a claim may initially seem, we all need to consider the possibility that it might actually be true – just consider how many things that you never thought would be possible are now reality." -- Rebecca Sherry Eshraghi…
Sustaining Antimicrobial Stewardship in a High–Antibiotic Resistance Setting | Clinical Decision Support…
#AntibioticResistance, #AntimicrobialStewardship, #ClinicalDecisionSupport, #CohortStudy
Read 13 tweets
The popularity of Python comes from its rich set of libraries & frameworks for numerous use cases.

Today we look at some real-world applications & the Python frameworks and libraries available for each

#100DaysOfCode #Python #MachineLearning #DataScience
{ Web Development }

1⃣ @djangoproject
2⃣ Flask
3⃣ @FastAPI
4⃣ Sanic
{ Machine Learning }
1⃣ Pandas
2⃣ Numpy
3⃣ Keras
4⃣ PyTorch
5⃣ Scikit-Learn
6⃣ Matplotlib
7⃣ TensorFlow
8⃣ Seaborn
Read 8 tweets
[Data Analysis] 🧵
Exploratory data analysis is a fundamental step in any analysis work. You don't have to be a data scientist and be proficient at modeling to be a useful asset to your client if you can do great EDA.

Here's a template of a basic yet powerful EDA workflow👇
EDA is incredibly useful. Proper modeling CANNOT happen without it.

The truth:
Stakeholders NEED it far more than modeling.

EDA empowers the analyst with knowledge about the data, which then moderates the #machinelearning pipeline
While #pandas and #matplotlib are key to good EDA in #python, the real difference is the QUESTIONS you ask of your dataset.

As in all things, these tools are just tools. The real weapon is the analyst. You are in control, not the dataset.
Read 10 tweets
Supervised deep learning is limited by label quality. An ontology must be built before labeling begins. That's a graph defining concepts and their connections. Ontologies guide labeling to ensure consistency and completeness.
#DataScience #MachineLearning #DeepLearning
Any problem space including people introduces multiple, often conflicting ontologies. Datasets ideally have multiple labels and require multiple models to be trained.
#DataScience #MachineLearning #DeepLearning
Most projects have a single, majority consensus labeling methodology. Where ontologies diverge from or conflict with it, inference will be inaccurate no matter how incredible the models we use become.
#DataScience #MachineLearning #DeepLearning
Read 5 tweets
Approach your Data Science learning path strategically. Start by asking, ‘why do people build models?’ I'm going to explain a more effective approach to learning our field that focuses on applications over theory.
#DataScience #MachineLearning #CareerAdvice
Most use cases in the business world don’t use complex machine learning or deep learning. It’s mostly analytics and simple models.

Why do people build simple models? Models are mathematical tools to extract knowledge from data.
#DataScience #MachineLearning #CareerAdvice
Why do people build datasets? Datasets introduce new knowledge into the business. Having data is not enough. The dataset must contain new knowledge.
#DataScience #MachineLearning #CareerAdvice
Read 10 tweets
✨Very happy to be able to share my new article titled "The ethics and politics of data sets in the age of machine learning: deleting traces and encountering remains" which is out in @MCSjournal open access… #datasets #MachineLearning #aireuse
The article explores the emergence of what I call ‘critical dataset studies’ and draws on critical archival theory to articulate the ethico-political issues surfaced by these studies.
Specifically my article argues that critical dataset studies shows the need for an expanded ethical and conceptual approach to datasets that not only relies on linear notions of deletion and accountability but also on iterative frameworks of remains and response-ability.
Read 8 tweets
[ML tools & tips] 🧵

Have you ever used sklearn's pipeline class to enhance your analysis?

While not mandatory, pipelines bring important benefits if implemented in our code base.

Here is a short thread of why you should use pipelines to improve your #machinelearning work 👇
In #datascience and machine learning, a pipeline is a set of sequential steps that allows us to control the flow of data.
They are very useful as they make our code cleaner, more scalable and readable. They are used to organize the various phases of a project.
Implementing pipelines is not mandatory but has significant advantages, such as

- cleaner code
- less room for error
- implemented like a typical model with .fit()
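
A minimal sketch of the points above, assuming scikit-learn is installed. The synthetic data, the step names `scale` and `model`, and the choice of estimator are illustrative and not taken from the thread.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Each step is a (name, estimator) pair; data flows through them in order.
pipe = Pipeline(steps=[
    ("scale", StandardScaler()),
    ("model", LogisticRegression()),
])

# The whole pipeline is fitted like a single model. The scaler is fitted
# on the training split only, which leaves less room for leakage errors.
pipe.fit(X_train, y_train)
accuracy = pipe.score(X_test, y_test)
predictions = pipe.predict(X_test)
```

Because the preprocessing lives inside the pipeline, the same object can be passed to cross-validation or grid search without repeating the scaling code.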
Read 10 tweets
Recently, a Princeton postdoc posted a thread about a paper he had published with his PI and group in PNAS, which raised serious methodological and ethical concerns. With many others, I tweeted my views of these problems, and did so with strong language.

In doing so, I contributed to a massive Twitter pile-on against this work, which this *junior* researcher could only have felt was directed against him personally. He has now deleted his Twitter account. I cannot believe that this is a coincidence.
I deeply regret my part in the pile-on. Even if criticism is not aimed at the researchers (and much of it was), massive numbers of often senior researchers directing harsh words at one's research, calling it unethical and comparing it to blatant racism, can only be a traumatic
Read 8 tweets
Open-sourcing Twitter’s algorithm isn’t what most people think it is. I don’t think even Elon Musk or most people at Twitter really understand where this process goes.
#DataScience #MachineLearning #Twitter
The code is not very insightful. The model itself is too complex for people to understand and interact with. So, what does open-sourcing the algorithm look like?
#DataScience #MachineLearning #Twitter
It’s the ability to click on a Tweet in your timeline and get a detailed explanation of why it was served to you. There are levels of model explainability.
#DataScience #MachineLearning #Twitter
Read 10 tweets
Top 5 eCommerce Trends for 2022 - Parag Pallav Talks…
1. Social commerce
#Socialcommerce is not a new trend. Businesses can benefit greatly from ecommerce growth by using social media platforms

#ecommerce #business
2. Augmented experience
Many eCommerce companies are working hard toward developing AR-powered apps that can help customers try out products without actually visiting a store in person.

#AugmentedReality #MetaverseNFT
Read 6 tweets
If you're a Data Scientist who wants to be a better developer or builder, here's a thread on how to do it. There's so much bad advice out there, and I hope this helps clear things up.
#DataScience #MachineLearning #Programming
1. Spend a year coding as part of a team. Have people review your code and participate in code reviews. This will help you unlearn many bad habits. You'll also get exposure to different styles and best practices.
#DataScience #MachineLearning #Programming
2. Build traditional software engineering type projects. Services and Web Apps are great because you'll learn fundamental coding skills.

You'll have to Google a lot, which is a software engineering superpower.
#DataScience #MachineLearning #Programming
Read 8 tweets
Most Data Strategies are missing a critical component. It's a Data Monetization Catalog, and it is not difficult to build. Here's my process:
#DataScience #MachineLearning #Data #Strategy
The process starts with the question, what use cases is this data used for? Use cases have business value, and it's a straight-line connection.
#DataScience #MachineLearning #Data #Strategy
I walk clients through this exercise, and it reveals excellent insights because data catalogs and dictionaries are connected to technical use cases but rarely to business use cases.

Here's what we most frequently find:
#DataScience #MachineLearning #Data #Strategy
Read 8 tweets
Data Science introduces a new model or architecture weekly, and it can be tough to keep up. Here are some of the basics and recent releases with resources to help you quickly understand each one.
#DataScience #MachineLearning #DeepLearning
Let's start with DALL·E 2. Here's a Python implementation. Sometimes the easiest way to learn about a model is to use it.…

Here's a YT video with a simple explanation.

#DataScience #MachineLearning #DeepLearning
Google recently released an overview of PaLM. It's one of a growing list of large scale language models improving on the capabilities of earlier models like GPT-3. Deep learning is going big.…
#DataScience #MachineLearning #DeepLearning
Read 15 tweets
