Tracking your Uber Rides and Uber Eats expenses through a data engineering process
Technologies and skills:
Python, Docker, Apache Airflow, AWS Redshift, Power BI, Data modelling, Task scheduling, ETL and ELT processes, Data warehousing, Cloud
Technologies and skills:
Python, Machine Learning, NLP, word2vec, Text Analysis, Sentiment Analysis, PCA, t-SNE, Word Embeddings, Text Preprocessing, Web scraping, Data Visualization
🚨⚠️ People issues are the biggest risk to funded startups.
55% of startups fail because of people problems, according to a study by Harvard, Stanford, and University of Chicago researchers.
🧵[2/x]
1. Minimize unnecessary micromanagement
While micromanaging can be helpful in certain situations, the most effective leaders aim to delegate work in order to scale both themselves and their businesses. Our data suggests that micromanaging can be a fatal flaw for CEOs.
☁ You will likely encounter pushback when moving to the cloud. Moving to something new may seem risky and unnecessary to the developers. This requires a cultural shift.
💎 Here are some tips on how to tackle this problem.
1. Sync with cross-functional teams early and often. Train them so they understand the benefits of the cloud and are comfortable and knowledgeable using it.
2. Help teams understand the benefits, the project's processes, and the desired goals and outcomes.
1. Was the database schema migrated correctly?
2. Has all the data been migrated?
3. How about user logins?
4. Can all of the users still connect, and can users only access the data they're permitted to access?
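The first two checks can be automated. Here is a minimal sketch, assuming both databases are reachable from Python; it uses in-memory SQLite as a stand-in for the real source and target, and the `users` table, its columns, and the helper names are hypothetical. Login and permission checks depend on the database engine and are not covered here.

```python
import sqlite3


def table_schema(conn, table):
    """Return (column name, declared type) pairs for a table."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


def row_count(conn, table):
    """Return the number of rows in a table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def compare_table(source, target, table):
    """Compare schema and row count of one table across two databases."""
    return {
        "schema_matches": table_schema(source, table) == table_schema(target, table),
        "row_count_matches": row_count(source, table) == row_count(target, table),
    }


def make_db(rows):
    """Build a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, login TEXT)")
    conn.executemany("INSERT INTO users (id, login) VALUES (?, ?)", rows)
    return conn


source = make_db([(1, "ana"), (2, "ben"), (3, "chen")])
target = make_db([(1, "ana"), (2, "ben")])  # one row left out on purpose

report = compare_table(source, target, "users")
print(report)  # {'schema_matches': True, 'row_count_matches': False}
```

Matching row counts do not prove the data is identical, so a real migration would likely add per-column checksums or sampled row comparisons on top of this.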
💡 Developers can focus on code and logic. They do not need to manage clusters or tune infrastructure. They submit #Spark jobs from their interface of choice, and processing is auto-scaled to match the needs of the job.
💡 Data engineering teams do not need to manage and monitor infrastructure for their end users. They are freed up to work on higher value #dataengineering functions.
💡 Pay only for the job duration, vs paying for infrastructure time.
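To make the billing difference concrete, here is a rough sketch of the arithmetic. Every number in it is hypothetical (the rates, the job durations, the 24-hour cluster); they are not real prices from any provider, so plug in your own figures.

```python
def per_job_cost_cents(job_minutes, cents_per_minute):
    """Cost when billed only for the minutes the jobs actually run."""
    return sum(job_minutes) * cents_per_minute


def always_on_cost_cents(hours_provisioned, cents_per_hour):
    """Cost when billed for every hour the cluster is up, busy or idle."""
    return hours_provisioned * cents_per_hour


# Hypothetical workload: three jobs in one day, 60 minutes of processing in total.
job_minutes = [12, 30, 18]

# Hypothetical rates, in integer cents to avoid floating-point rounding.
job_based = per_job_cost_cents(job_minutes, cents_per_minute=5)
cluster_based = always_on_cost_cents(hours_provisioned=24, cents_per_hour=200)

print(f"Billed per job duration: {job_based / 100:.2f}")
print(f"Billed per infrastructure hour: {cluster_based / 100:.2f}")
```

The gap shrinks as utilization rises: a cluster that is busy most of the day can cost about the same as, or less than, paying per job.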