Getting started on the Open #Bioinformatics Research Project initiative
👀🧵👇 See thread below
1. Watch the introductory video on the Open Bioinformatics Research Project initiative for:
- Intro to the initiative
- High-level overview of the dataset
- Ideas for which types of analysis to perform
2. Accessing the dataset
👉 GitHub github.com/dataprofessor/…
👉 Kaggle kaggle.com/thedataprof/be…
3. Getting started notebook
👉 Kaggle kaggle.com/thedataprof/ge…
4. Complementary tools
To perform EDA and ML model building it may be helpful to use @RDKit_org and PaDEL (as well as PaDELPy).
Install them via:
👉 pip install rdkit-pypi (legacy package name; recent RDKit releases are published on PyPI as rdkit)
👉 pip install padelpy
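As a rough sketch of what the EDA step might look like once RDKit is installed, the snippet below computes four Lipinski-style descriptors for one molecule from its SMILES string. The function name lipinski_descriptors and the aspirin SMILES are illustrative assumptions, not taken from the project's notebook. The function returns None when RDKit is not importable or the SMILES cannot be parsed.

```python
import importlib.util


def lipinski_descriptors(smiles):
    """Return a dict of Lipinski-style descriptors for one SMILES string.

    Returns None if RDKit is not installed or the SMILES cannot be parsed.
    (Hypothetical helper for illustration only.)
    """
    if importlib.util.find_spec("rdkit") is None:
        return None

    from rdkit import Chem
    from rdkit.Chem import Descriptors

    mol = Chem.MolFromSmiles(smiles)  # None when parsing fails
    if mol is None:
        return None

    return {
        "MW": Descriptors.MolWt(mol),                 # molecular weight
        "LogP": Descriptors.MolLogP(mol),             # Crippen logP estimate
        "NumHDonors": Descriptors.NumHDonors(mol),    # H-bond donors
        "NumHAcceptors": Descriptors.NumHAcceptors(mol),  # H-bond acceptors
    }


# Example molecule (aspirin), chosen only for illustration
print(lipinski_descriptors("CC(=O)Oc1ccccc1C(=O)O"))
```

Applying the same function to every SMILES string in a dataset column gives a small descriptor table that can feed EDA plots or a first ML model.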
5. Watch related project tutorial videos
- 6 Part #Bioinformatics from Scratch

- 2 Part #DataScience for #DrugDiscovery
Pt 1
Pt 2
6. Additional supplemental tutorial videos
- How to use PaDELPy to calculate molecular descriptors and fingerprints
- 2 minute overview of using #machinelearning for #drugdiscovery

- Build a @Streamlit app
7. Perhaps some background knowledge? Here are hour-long lecture and podcast videos
- Computational #DrugDiscovery 101
- How to Build #Bioinformatics Tools

- Podcast with @wpwalters on #cheminformatics

More from @thedataprof

15 Sep
How to get started in #datascience?

👀🧵👇 See thread below
2/ 1. Craft your own personal learning plan
Earlier this year I made a video that details the steps you can take to craft your own personal learning plan for your data journey. Everyone's plan is different, make your own! Here's how...
3/ 2. Work on data projects using datasets that are interesting to you
When starting out, I found that working on datasets that are interesting to you will help you engage in the process. Be persistent and work on the project to completion (end-to-end).
How? Data → Model → Deployment
Read 10 tweets
31 Aug
Here's a cartoon illustration I drew a while back:
The #machinelearning learning curve

👀🧵👇 See thread below
2/ Starting the learning journey
The hardest part of learning data science is taking that first step to actually start the journey.
3/ Consistency and Accountability
After taking that first step, it may be challenging to maintain the consistency needed to push through with the learning process. And that's where accountability steps in.
Read 9 tweets
27 Aug
Hi friends, here's my new hand-drawn cartoon illustration ✏️
Quickly deploy #machinelearning models
👀🧵👇 See thread below
2/ Deployment of machine learning models is often overlooked, especially in academia
- We spend countless hours compiling the dataset, processing the data, fine tuning the model and perhaps interpreting and making sense of the model
- Many times we stop at that
- Why not deploy?
3/ Perhaps model deployment is difficult?
- Django? Flask? API?
- We now have access to powerful libraries:
1. Dash (@plotlygraphs) plotly.com/dash/
2. @Gradio gradio.app
3. @streamlit streamlit.io
4. Shiny (@rstudio) shiny.rstudio.com
Read 9 tweets
17 Aug
Cheat sheet that summarizes #DataScience in 10 pages
(Links in the comments below 👇)
2/ Link to the cheatsheet by Maverick Lin
github.com/ml874/Data-Sci…
3/ Topics include:
- Overview of Data science
- Probability and Statistics
- Data cleaning
- Feature engineering
- Modeling
- Classical Machine learning
- Deep learning
- SQL
- Python data structures
Read 4 tweets
26 Jul
1/ #Pandas is the go-to library that you need for #datawrangling for your #datascience projects when coding in #Python.
👀🧵👇 See thread below
2/ Why Do We Need Pandas?
The Pandas library has a large set of features that let you take raw data from first intake, through cleaning and transformation, to the final curated form used for hypothesis testing and machine learning model building.
3/ Basics of Pandas - 1. Pandas Objects
Pandas allows us to work with tabular datasets. It has 3 basic object types: Series, DataFrame and Index. The first 2 are data structures while the last serves as a point of reference.
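A minimal sketch of how these objects relate, using a made-up two-row table (the compound names and values are illustrative, not from the thread):

```python
import pandas as pd

# DataFrame: a 2-D table with labelled rows and columns (toy data)
df = pd.DataFrame(
    {"MW": [180.16, 46.07], "heavy_atoms": [13, 3]},
    index=["aspirin", "ethanol"],
)

# Series: a single labelled column pulled out of the DataFrame
mw = df["MW"]

# Index: the row labels shared by the DataFrame and its columns
labels = df.index

print(type(df).__name__, type(mw).__name__, type(labels).__name__)
print(df.loc["ethanol", "heavy_atoms"])  # look up a value by row and column label
```

The Index is what lets a lookup such as df.loc["ethanol", "heavy_atoms"] find a row by label rather than by position.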
Read 10 tweets
25 Jul
1/ #MachineLearning Crash Course by Google
- Free course
- Learn and apply fundamental machine learning concepts
- 30+ exercises
- 25 lessons
- 15 hours to complete
- Real-world case studies
- Explainers of ML algorithms

👀🧵👇 See thread below
2/ Machine Learning Crash Course by Google
developers.google.com/machine-learni…
3/ Free machine learning crash course from Google
Read 5 tweets