David Regalado
👨‍💼 Founder @DataEngiLATAM, the biggest and coolest community ever! 📊 Data Scientist | Data Engineer | Data Architect
Aug 3 6 tweets 3 min read
What is the difference between a Data Engineer and a Data Architect?

🧵[1/x]

A data engineer looks at the immediate set of requirements and works toward them. In other words, data engineers build, rebuild, and tear down. ⚒

Need a new field in the report? Let's just build the whole thing. ⚒

🧵[2/x]
Jul 20 10 tweets 3 min read
"Blessing and misfortune are two sides of the same coin. One extreme can transform into another, and there is no right or wrong to this."

🧵[1/x]

During the Han Dynasty, an old man (Sai Weng) living on China’s border one day lost his horse. His neighbors all said what terrible luck that was, and sympathized with the old man. But Sai Weng said: “Maybe losing my horse is not a bad thing after all.”

🧵[2/x]
Jul 18 7 tweets 4 min read
Did you notice this?

Some #GoogleCloud professional certificates on Coursera have off-platform certification exams. For a limited time, you can get a discount voucher for 20% off the cost of the exam.

This is a 🧵of links to those programs.

🧵[1/x]

@coursera @GoogleCloudTech

Google Cloud Digital Leader Training Professional Certificate lnkd.in/efnhSb57

🧵[2/x]
Jul 5 7 tweets 5 min read
💡 Methods for addressing overfitting.



🧵[1/x]

#MachineLearning #ML

1. Increase the number of training examples. I know, I know. Sometimes that's not possible.

🧵[2/x]

#MachineLearning #ML
Jun 25 11 tweets 4 min read
💡Seven ways to become a more effective founder

Credits to @GoogleStartups

#startups #founders

🧵[1/x]

🚨⚠️ People issues are the biggest risk to funded startups.

55% of startups fail because of people problems, according to a study by Harvard, Stanford, and University of Chicago researchers.

🧵[2/x]
Jun 24 15 tweets 7 min read
📣Data Engineering Projects for Beginners 2022

👇🧵[1/x]

#dataengineering #python #Docker #developers #aws #GoogleCloud #apacheairflow

Tracking your Uber Rides and Uber Eats expenses through a data engineering process

Technologies and skills:
Python, Docker, Apache Airflow, AWS Redshift, Power BI, data modelling, task scheduling, ETL and ELT processes, data warehousing, cloud

🧵[2/x]

github.com/Wittline/uber-…
Jun 3 9 tweets 8 min read
The best websites for learning data science! 😱

🧵⬇ [1/7]

#datascience

reddit.com/r/datascience
@Reddit

🧵⬇ [2/7]

#datascience
May 25 30 tweets 8 min read
Best-of Machine Learning projects with Python 👇

[1/x] 🧵

#Python #MachineLearning #DataScience

Machine Learning Frameworks
56 projects

github.com/ml-tooling/bes…
May 9 6 tweets 2 min read
☁ You will likely encounter pushback when moving to the cloud. Moving to something new may seem risky and unnecessary to the developers. This requires a cultural shift.

💎 Here are some tips on how to tackle this problem.

#cloud #googlecloud #azure #aws

1. Sync with cross-functional teams early and often. Train them so they understand the benefits of the cloud and are comfortable and knowledgeable using it.
May 7 11 tweets 12 min read
Without effective testing, there's no way to know if your database has been migrated correctly. There are many things you need to verify.

🧵1/10

#databases #databasemigration #dataengineering #sql #googlecloud

1. Was the database schema migrated correctly?
2. Has all the data been migrated?
3. How about user logins?
4. Can all of the users still connect, and can they access only the data they're permitted to access?

🧵2/10

#databases #databasemigration #dataengineering #sql
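
The first two checks above (schema and data) can be partly automated. Here is a minimal sketch, assuming both databases are reachable from Python; it uses in-memory SQLite purely for illustration, and the helper names (`schema_of`, `row_counts`, `verify_migration`, `make_db`) are hypothetical, not from the thread. Row counts are only a first signal that data arrived; logins and permissions (checks 3 and 4) are engine-specific and are not covered here.

```python
import sqlite3


def schema_of(conn):
    """Return {table: [(column_name, declared_type), ...]} for user tables."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    return {
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        t: [(col[1], col[2]) for col in conn.execute(f'PRAGMA table_info("{t}")')]
        for t in tables
    }


def row_counts(conn):
    """Return {table: number_of_rows} for every user table."""
    return {
        t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
        for t in schema_of(conn)
    }


def verify_migration(source, target):
    """Compare schema and row counts; return a list of human-readable problems."""
    problems = []
    src_schema, tgt_schema = schema_of(source), schema_of(target)
    for table, columns in src_schema.items():
        if table not in tgt_schema:
            problems.append(f"missing table: {table}")
        elif tgt_schema[table] != columns:
            problems.append(f"schema mismatch: {table}")
    src_counts, tgt_counts = row_counts(source), row_counts(target)
    for table, count in src_counts.items():
        if table in tgt_counts and tgt_counts[table] != count:
            problems.append(
                f"row count mismatch: {table} ({count} vs {tgt_counts[table]})"
            )
    return problems


def make_db(names):
    """Build a throwaway in-memory database with one illustrative table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [(n,) for n in names])
    return conn


source = make_db(["ana", "luis", "mei"])
good_target = make_db(["ana", "luis", "mei"])
bad_target = make_db(["ana", "luis"])  # one row was lost in the "migration"

print(verify_migration(source, good_target))
print(verify_migration(source, bad_target))
```

An empty list means the two checks passed. For a real migration you would likely go further than counts, for example by comparing per-table checksums or sampling rows.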
Mar 27 7 tweets 7 min read
Can you imagine serverless Spark + BigQuery together? 🤯

Forget about managing clusters and tuning infrastructure if your job is to focus on creating business value.

👇

🧵1/6

#googlecloud #bigquery #spark #dataengineering

Why Serverless Spark?

💡 Developers can focus on code and logic. They do not need to manage clusters or tune infrastructure. They submit #Spark jobs from their interface of choice, and processing is auto-scaled to match the needs of the job.

🧵2/6

#googlecloud #bigquery #gcp
Mar 18 7 tweets 5 min read
Bored employees are super disengaged, prone to conflict and suffer burnout at a higher rate.

Harvard Business Review found employees who are learning at work experience less anxiety and stress. They are also more ethical than bored workers on autopilot.

🧵1/7

#manager #career

All employees deserve to learn, whether they are bored, engaged, or somewhere in between. It’s your job as a manager to make that happen. Here are three ideas to create a learning strategy and become a manager whose people won’t leave:

🧵2/7

#manager #career
Mar 16 6 tweets 2 min read
Is Infrastructure as SQL (IaSQL) a thing? 🤔

🧵1

#dataengineering #sql #IaSQL #cloud

𝗛𝗼𝘄 𝗜𝗮𝗦𝗤𝗟 𝘄𝗼𝗿𝗸𝘀

IaSQL is an open-source SaaS that models cloud infrastructure as data by maintaining a two-way connection between an AWS account and a hosted PostgreSQL database. ☁

🧵2
Feb 26 9 tweets 10 min read
The evolution of data processing frameworks.

Knowing how these frameworks have evolved can help you understand the typical problems that arise, and how they're addressed.

As the Internet grew, Google invented new data processing methods.

🧵

#GCP #google @google @googlecloud

In 2002, Google created GFS, or the Google File System, to handle sharding and storing petabytes of data at scale.

GFS is a foundation for cloud storage, and also for what would become BigQuery managed storage.

🧵

#GCP #google @google @googlecloud