Bryan Cafferky
  • Videos: 177
  • Views: 2,264,086
What to Learn for Your Career Path
The most common question I get asked is "What do I need to learn?" for a given career path like Data Engineer, Data Scientist, Data Analyst, or another role. The answer is simpler than you may think. Join me as I resolve this question and put you on the right path.
Support me on Patreon
www.patreon.com/bePatron?u=63260756
My Playlists
www.youtube.com/@BryanCafferky/playlists
Slides from this Video
github.com/bcafferky/shared/blob/master/WhatToLearn/WhatToLearn.pdf
Views: 376

Videos

About My Channel
344 views, 1 day ago
In the digital age, people are confused by the myriad of complex technologies. My channel is all about taking complex things, breaking them down, and making them simple to understand. Once you have that, the rest is easy. Let me tell you about what my channel is about and why you should subscribe. My Playlists www.youtube.com/@BryanCafferky/playlists Master Databricks & Apache Spark ruclips.net...
Quick Review of the Best App Dev Tools & Services
427 views, 14 days ago
Whether you want to showcase your data analytics skills or just create a cool app, you'll need an app development tool. There are a lot of tools out there to build and deploy web and mobile apps, but which are the best, i.e., easiest to use, maintain, and deploy? With a focus on data-centric apps, I'll review my favorites and discuss the pros and cons of the runners-up. Support me on Patreon www....
Master Databricks and Apache Spark Step by Step: Series Update - What's Changed?
1.2K views, 1 month ago
Since this series was uploaded, Databricks has added a lot of powerful new services and enhanced existing services. In this video, I will give a summary of what has changed since I uploaded the series. Spoiler Alert! The series is still valid and an ideal jump start on Databricks and Apache Spark. Watch the video to understand why.
Understanding Databricks & Apache Spark Performance Tuning: Lesson 02 - Spark Hardware
1.2K views, 1 month ago
Following up on Databricks Performance Tuning with the best place to start: allocating Spark clusters. If you don't allocate sufficient resources, nothing else will fix the problem. How many nodes? How large should the driver and workers be? Do you need GPUs or CPUs? Should you use Photon? These and many more questions will be covered in detail. Support me on Patreon www.patreon.com/bePatron?u=...
Understanding Databricks & Apache Spark Performance Tuning: Lesson 01 - Spark Architecture
2.8K views, 3 months ago
A popular interview question and a critical topic for all Databricks and Spark developers: how do you tune and optimize Spark queries? This video provides a conceptual understanding of where things can go wrong as a starting point to understanding performance tuning and optimization. Support me on Patreon www.patreon.com/bePatron?u=63260756 Slides github.com/bcafferky/shared/blob/master/Databri...
Master Dimensional Modeling Lesson 02 - The 4 Step Process
2.1K views, 3 months ago
Dimensional Modeling is the process of developing The Star Schema, a popular and effective way to organize your data to maximize business value. In this video, you will learn about the 4 steps in the Dimensional Modeling process. Support me on Patreon www.patreon.com/bePatron?u=63260756 Slides github.com/bcafferky/shared/blob/master/MasterDimensionalModeling/lesson_02/lesson02_DimModelingSteps0...
Master Dimensional Modeling Lesson 01 - Why Use a Dimensional Model?
4.6K views, 4 months ago
Dimensional Modeling is a popular and effective way to organize your data to maximize business value. In this video, you will learn what a Dimensional Model, aka a Star Schema, is and why you should use one to organize your data warehouse. Support me on Patreon www.patreon.com/bePatron?u=63260756 Slides github.com/bcafferky/shared/blob/master/MasterDimensionalModeling/lesson_01/DimModelingWhy_l...
Data Architecture vs. Data Engineering Deep Dive
3.3K views, 4 months ago
Are you an aspiring Data Architect? Join me as I explain what Data Architecture is and what Data Engineering is with in-depth explanations and examples. I draw on decades of experience as a Data Engineer and Data Architect to give you time tested advice and best practices. Support me on Patreon www.patreon.com/bePatron?u=63260756 Slides available here: github.com/bcafferky/shared/blob/master/Da...
Master Data Workload Automation: Introduction
1.3K views, 5 months ago
Automating your data workloads is essential in today's mission-critical, data-driven businesses, but what is the best way to do it? There are two basic choices: job schedulers and data orchestrators. I'll explain what they are, review some examples, and explain when to use each. Support me on Patreon www.patreon.com/bePatron?u=63260756 Slides available here: github.com/bcafferky/shared/blob/maste...
Streamlit for Dummies: Lesson 3 - Using Advanced Features
944 views, 5 months ago
This video builds on lesson 2 by using Streamlit's advanced features like state management, layout control, animations, and more. Using these features is crucial to building professional apps. Support me on Patreon www.patreon.com/bePatron?u=63260756 Code: github.com/bcafferky/shared/blob/master/Streamlit/lesson03/lesson03.zip My Video on Using Python Virtual Environments: ruclips.net/video/bjU...
Streamlit for Dummies: Lesson 2 - Writing Your First App
1.3K views, 6 months ago
Streamlit is a fun and easy way to create interactive web apps in Python. In this video I show you how to code a simple Streamlit app, i.e. a game. Support me on Patreon www.patreon.com/bePatron?u=63260756 Code: github.com/bcafferky/shared/blob/master/Streamlit/lesson02/lesson02_basic.py My Video on Using Python Virtual Environments: ruclips.net/video/bjUjNSotYgA/видео.html My Streamlit Introdu...
Python Streamlit for Dummies
6K views, 7 months ago
Streamlit is a fun and easy way to create interactive web apps in Python. Join me as I explain how you can get started using this powerful framework to have fun, build a data analytics online work portfolio, and add a valuable skill to your resume. Support me on Patreon www.patreon.com/bePatron?u=63260756 Slides: github.com/bcafferky/shared/blob/master/Streamlit/lesson01/StreamlitForDummies.pdf...
Python Virtual Environments & The Facts of Life
1.3K views, 8 months ago
If you develop Python programs, you need to use Virtual Environments! In this video, I'll explain what they are, how to use them, and The Facts of Life. Support me on Patreon www.patreon.com/bePatron?u=63260756 Video Slides github.com/bcafferky/shared/blob/master/PythonVirtualEnvironments/PythonVirtualEnvs.zip
How and When to Use Databricks Identity Column
2K views, 8 months ago
Databricks added support for Identity Columns similar to the same feature found in relational databases. How do you use it? Should you use it? How does it differ from Identity columns on relational databases? Before you use the Identity Column feature, you need to watch this video. Support me on Patreon www.patreon.com/bePatron?u=63260756 Video Slides github.com/bcafferky/shared/blob/master/Dat...
How to Create Databricks Workflows (new features explained)
11K views, 9 months ago
Introduction to my Online Guide to my YouTube Videos
709 views, 9 months ago
Should You Use Databricks Delta Live Tables?
5K views, 9 months ago
Scale Up Your Databricks Coding with Databricks AI Assistant
2.3K views, 10 months ago
Core Databricks: Understand the Hive Metastore
13K views, 10 months ago
Python Pro! Understand Variable Scopes
732 views, 10 months ago
Creating Decorators on Steroids: Adding Custom Parameters
553 views, 11 months ago
Python for Data Engineers: Using Function Decorators
1.8K views, 1 year ago
Advanced Python Programming: Using Functions as First Class Objects
2.3K views, 1 year ago
How to Build a Delta Live Table Pipeline in Python
14K views, 1 year ago
Why Databricks Delta Live Tables?
15K views, 1 year ago
Understanding Delta File Logs Part 3 - The Deep Dive
1.8K views, 1 year ago
Understanding Delta File Logs Part 2 - Demonstrating Transactions
3K views, 1 year ago
Understanding Delta File Logs - The Heart of the Delta Lake
7K views, 1 year ago
Understanding Delta Lake - The Heart of the Data Lakehouse
6K views, 1 year ago

Comments

  • @JoesMarineRush
    @JoesMarineRush 1 day ago

    Great video. Many thanks for sharing these thoughts. I like that you emphasized familiarity and mastery, and that you need to make a decision on where to dedicate your time to master.

  • @Noobsmove
    @Noobsmove 1 day ago

    Honestly, in this day and age, the biggest weak points I see in people starting in the field are not that they lack tools like programming languages or knowledge of platforms like Databricks. If they don't know them yet, they are mostly quick to learn. If there is an actual issue, it lies way deeper: a lack of understanding of core concepts like data normalization, dimension and fact tables, and measures in multidimensional data models, or being able to derive architecture and data requirements from talking to a customer. Those are difficult hurdles for beginners. I feel like people rush to learn tools before learning what to do with them. My analogy is someone who has mastered the tools of carpentry, like the saw and hammer, but still has no good idea how to use them to build a good chair ^^

  • @donatusajaezu
    @donatusajaezu 1 day ago

    From the bottom of my heart, I just want to say thank you so much for this. I work as a data engineer but started off as a web developer. I have never really known how to organise the work of a data engineer, and you just helped me with that. Now I know exactly what and where to focus on. Thanks once again.

    • @BryanCafferky
      @BryanCafferky 1 day ago

      So glad this video was helpful. Thanks for your comment.

  • @awadelrahman
    @awadelrahman 1 day ago

    Your content is very appreciated!! So practical and direct to the point! Any recommendation for Data Architect (or what they sometimes call Data cloud and analytics Architect), any suggestions for such career path in terms of the core knowledge etc? Thanks

    • @BryanCafferky
      @BryanCafferky 1 day ago

      Ideally, a Data Engineer would progress to being a Data Architect, but DAs need to think at a broader level and consider the implications of their design and architectural decisions. I have found not all data engineers make good architects b/c they are too in the weeds and can't see the big picture. Basic tech skills for the DA include everything of the DE plus broader knowledge, esp. in the orange and blue band that I put for the DE. See this video for more information: ruclips.net/video/cI2dYnM5Kzo/видео.html Thanks, Bryan

    • @awadelrahman
      @awadelrahman 1 day ago

      @@BryanCafferky yes, of course I have watched this vid! So nice! However I don’t know if coming from a data science background makes things a bit different. I just don’t know what makes a great architect!

  • @sibabalwesinyaniso4491
    @sibabalwesinyaniso4491 1 day ago

    Thanks

  • @demircanozdemir7467
    @demircanozdemir7467 1 day ago

    It's been 6 years since this video was made, and the interface has almost completely changed. Can you make a new demo please?

    • @BryanCafferky
      @BryanCafferky 1 day ago

      Actually, that's not correct. Purview is a replacement for Azure Data Catalog. However, I can see from this page that ADC has been retired: learn.microsoft.com/en-us/azure/data-catalog/overview That's the thing. MS promotes something as the best thing ever, then quietly drops it. 😞

  • @ammarahmed5981
    @ammarahmed5981 1 day ago

    Awesome series.

  • @stu8924
    @stu8924 3 days ago

    Such a great channel, thank you Bryan.

  • @aalmisry
    @aalmisry 4 days ago

    Thank you so much

  • @GhernieM
    @GhernieM 4 days ago

    Hey Bryan, do you plan to create something about Unity Catalog?

  • @GhernieM
    @GhernieM 4 days ago

    Thanks, you're really good at explaining these topics!

  • @HasanCatalgol
    @HasanCatalgol 4 days ago

    Underrated channel, really quality information.

  • @YiminWei-z6w
    @YiminWei-z6w 4 days ago

    great explanation. Thanks!

  • @ericaleverson9430
    @ericaleverson9430 5 days ago

    I made a mistake in an interview today and confused the star schema with the 3 Normal Forms. I also stated star schema was normalization when it was denormalized...oh well.

  • @alokhom
    @alokhom 6 days ago

    Your video has decluttered things for me a lot. Now I am going to set up HDFS and the Spark operator on my k8s cluster.

  • @potnuruavinash
    @potnuruavinash 7 days ago

    Your videos are descriptive but crisp too, and to the point. I have never seen any other tutor who explains big data concepts so well in a practical way. Too good you are. Love from India 💌 I wish I had found your channel in the early days of my career.

  • @matthewg9064
    @matthewg9064 7 days ago

    Love the content, always very clear

  • @ChrisUK70
    @ChrisUK70 7 days ago

    Hey Bryan, the SQL with group by negates the need for the distinct in the SELECT unless Spark SQL is different to ANSI SQL? Thanks for your series.
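
The point about GROUP BY making DISTINCT redundant can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Spark SQL, and the `sales` table and its columns are hypothetical. It only covers the case where every grouping column appears in the SELECT list, so each group already yields exactly one row.

```python
import sqlite3

# In-memory SQLite database standing in for Spark SQL.
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('East', 10), ('East', 20), ('West', 5), ('West', 5);
""")

# GROUP BY alone: one output row per distinct region.
grouped = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Adding DISTINCT on top of the same GROUP BY changes nothing here.
grouped_distinct = conn.execute(
    "SELECT DISTINCT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(grouped)
print(grouped_distinct)
```

If the SELECT list left out one of the grouping columns, two groups could produce identical output rows, and DISTINCT would no longer be redundant.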

  • @ChrisUK70
    @ChrisUK70 7 days ago

    Thanks Bryan. Sorry, another question: when a table is created, does it lock the file so it cannot be deleted from the file system?

    • @BryanCafferky
      @BryanCafferky 7 days ago

      In the case of this video topic, no, because you are only creating a schema definition on top of a file, i.e., schema on read. Mind you, the file system is Azure Data Lake Storage, which is like a drive, so it does not lock up. However, if you create a Delta table (not discussed here b/c it was very new and not in GA at the time of this video), that would create a new parquet file and related logs, and these should be locked until the process is complete. Make sense?
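
Schema on read can be illustrated with a minimal plain-Python sketch. This is not Spark or Databricks code; the file contents, the `schema` mapping, and the `read_with_schema` helper are all hypothetical. The idea it shows is that the "table" is only metadata, and types are applied when the file is read.

```python
import csv
import io

# Hypothetical raw file contents; in the scenario above this would be a file in the data lake.
raw_file = io.StringIO("id,name,price\n1,widget,9.99\n2,gadget,19.50\n")

# The "table" is only metadata: column names mapped to the types applied at read time.
schema = {"id": int, "name": str, "price": float}

def read_with_schema(file_obj, schema):
    """Cast each column while reading; the underlying file is never modified."""
    reader = csv.DictReader(file_obj)
    return [{col: cast(row[col]) for col, cast in schema.items()} for row in reader]

rows = read_with_schema(raw_file, schema)
print(rows)
```

Because the schema lives apart from the data, dropping the definition leaves the file untouched, and nothing about the definition prevents the file from being deleted.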

  • @AkashD.K-u8r
    @AkashD.K-u8r 7 days ago

    Does this series cover how to work with Java JARs in Databricks?

  • @faicalammisaid3705
    @faicalammisaid3705 7 days ago

    Amazing, high-level channel. I recommend it for any young learner. Keep going, Bryan!

  • @hassaanali7405
    @hassaanali7405 8 days ago

    And I'm here for alllllllllll of it. ❤

  • @bumpersmith
    @bumpersmith 8 days ago

    Your laid-back method and complete explanations are a very refreshing method of learning. Thank you looking forward to watching more of your videos.

  • @awadelrahman
    @awadelrahman 8 days ago

    I also love your way of presentation!! (Of course plus the wonderful content) and I don’t find 30 hours of prep to be too much given the high quality you deliver!

  • @ChrisUK70
    @ChrisUK70 8 days ago

    Is the context of the database and tables only for querying, i.e., there is no DML?

    • @BryanCafferky
      @BryanCafferky 8 days ago

      Initially, Spark was only able to query data. It was never intended to be a database, so Spark SQL was originally just a query (SELECT) language. However, Databricks added full DML to it, which required creating a storage format that supported Create, Read, Update, Delete (CRUD). To distinguish this from ordinary read operations, they called the new database-like functionality and its storage format, Delta tables, the Data Lakehouse b/c it is a data warehouse on a data lake. The Data Lakehouse has only been around for a few years and simulates a database in many ways, but it is implemented very differently b/c it uses parquet files and a snapshotting approach that groups together the parquet files that form the current table snapshot. Delta tables are more like Source Code Control in that each table version is a collection of files. See my playlist on this for a full explanation. ruclips.net/video/Muyq3qtHzzo/видео.html
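
The "each table version is a collection of files" idea can be sketched as a toy model in plain Python. This is not the Delta Lake API, and the file names and log layout are hypothetical. The real transaction log stores commits as JSON files with richer actions, but the replay principle of adding and removing data files per commit is the same.

```python
# Toy model of a transaction log: each commit is a list of actions
# that either add a data file to the table or remove one from it.
# File names and the log layout are hypothetical.
commits = [
    [{"add": "part-000.parquet"}, {"add": "part-001.parquet"}],     # version 0: initial load
    [{"remove": "part-001.parquet"}, {"add": "part-002.parquet"}],  # version 1: update rewrites a file
    [{"remove": "part-000.parquet"}],                               # version 2: delete
]

def snapshot(commits, version):
    """Replay commits 0..version and return the set of files that form that table version."""
    files = set()
    for actions in commits[: version + 1]:
        for action in actions:
            if "add" in action:
                files.add(action["add"])
            elif "remove" in action:
                files.discard(action["remove"])
    return files

for v in range(len(commits)):
    print(v, sorted(snapshot(commits, v)))
```

Replaying only the first commits reconstructs an older version, which is why reading an earlier table version works as long as the files that version references still exist.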

    • @ChrisUK70
      @ChrisUK70 7 days ago

      @@BryanCafferky Coming from a database background it is understanding how this joins together and the use cases for it, thanks for the explanation.

  • @ChrisUK70
    @ChrisUK70 8 days ago

    I so agree with nailing things down before moving on, but due to overzealous project managers and scrum masters this rarely happens in my experience, grrr! They push to move things along due to time and cost, but as you rightly pointed out, it costs so much more to have to change things later on. I could moan for hours on this subject!

  • @ChrisUK70
    @ChrisUK70 9 days ago

    Thanks Bryan, you teach the subject in an easy to understand manner.

  • @ChrisUK70
    @ChrisUK70 9 days ago

    Thanks Bryan for providing all of this for free. I am a seasoned dev, 28 years in the industry. I started out writing data feeds to Oracle data warehouses on Unix boxes with shell, SQL, and PL/SQL. Then after a few years I moved to Business Objects Data Services, an integration tool, and for the last 10 years I have been using Talend, another integration tool. Now, out of work next month and in my fifties, I realise I need to retrain in a new technology. Data Bricks and Apache Spark seem to be very popular, but for building data integrations, is it wise to learn Data Fabric with Data Factory, or is there some other tech used? I have done a small amount of pulling and pushing to cloud services using API calls within Talend, but I do not have strong cloud skills or knowledge. Azure seems like a logical choice.

    • @BryanCafferky
      @BryanCafferky 9 days ago

      Hi Chris, Not to be pedantic, but Databricks is all one word, as I just spelled it. Need to get the spelling right. It's a common mistake. Fabric is brand new and its future is uncertain despite the marketing hype. Generally, I see Fabric as a service for Power BI, so if you are a Power BI dev, it makes sense to learn Fabric. I don't see Fabric replacing Databricks or Snowflake, the 2 largest Big Data services. Azure is great, but AWS is still the largest public cloud. However, Databricks seems to be more popular on Azure. If you do ETL on Azure, ADF is good to learn but not always needed. Databricks workflows are powerful and can handle most ETL jobs. Thanks for your comment.

    • @ChrisUK70
      @ChrisUK70 8 days ago

      @@BryanCafferky Thanks for pointing that out. If I cannot get the spelling right, how can I get a job using it? 😀 It is difficult to know what technology to invest the time into. A lot of my career has been data migrations, not so much ETL into DWH, hence ETL/integration tools. There are so many offerings now. I will follow the Microsoft route and possibly aim to get some sort of certification. I have seen a lot of roles asking for Snowflake as well in the UK, where I am based. Thanks again for a great series of videos!

  • @brokejohnnylive1530
    @brokejohnnylive1530 10 days ago

    Dude you are on the money!! Agree all 100%.

  • @calvinkhor890
    @calvinkhor890 13 days ago

    Thanks Bryan, other than Flask I haven't tried these myself yet. Flet looks interesting too. I would have chosen "PyFlutter" as a name. Any thoughts on Gradio, which I think is similar to, say, Streamlit but lets you deploy to Hugging Face? There are so many different options these days and it's hard to choose; your video helps narrow down the options.

  • @Thegameplay2
    @Thegameplay2 13 days ago

    Really useful

  • @user-xd1hc4tg6s
    @user-xd1hc4tg6s 14 days ago

    Thank you for your video!

  • @ELDhouse
    @ELDhouse 16 days ago

    Bryan, thanks for another great video! Your series on python and sqlite helped me move my data career forward. I want to ask your opinion on Kotlin vs Flutter as a cross platform development language especially after Google established it as their preferred development language. Thanks!

    • @BryanCafferky
      @BryanCafferky 15 days ago

      Not sure about Google adopting Kotlin. Since FlutterFlow, they seem to be pushing Flutter, which is their language. See flutter.dev/events Kotlin is also Yet Another Language, so not loving that. Kotlin still has low adoption according to the TIOBE index www.tiobe.com/tiobe-index/ but so does Flutter. Did you watch the video? I discuss Python options which may be a good choice.

  • @yusuf07007
    @yusuf07007 16 days ago

    Ty guy. Your post cleared up some of my inquiries!

  • @mainakdey3893
    @mainakdey3893 17 days ago

    At last somebody is clearing up the confusion. Good job, Bryan!

  • @SaiKumar-ub6jo
    @SaiKumar-ub6jo 17 days ago

    Can you help with how we can create a drop-down for task parameters in a workflow?

    • @BryanCafferky
      @BryanCafferky 16 days ago

      You use widgets. Doc here learn.microsoft.com/en-us/azure/databricks/notebooks/widgets

  • @drisselfigha3547
    @drisselfigha3547 18 days ago

    I love your way of explaining, I watch each of your videos several times. This also allows me to improve my English.

  • @bhavindedhia3976
    @bhavindedhia3976 19 days ago

    How do I retrieve specific values from the Delta log? After reading the JSON, I am unable to grab the values.

  • @ProGamER__801
    @ProGamER__801 19 days ago

    Hi sir, are Python and MySQL together a good combo, and can they land you a job?

    • @BryanCafferky
      @BryanCafferky 19 days ago

      Yes. It's a great combo and popular so good skills to have.

  • @davidk7212
    @davidk7212 20 days ago

    Zank you sir for zis tutorial. It is most very velcome.

  • @sebajo6643
    @sebajo6643 22 days ago

    Good one

  • @sebajo6643
    @sebajo6643 23 days ago

    Great Lesson! Thank you Bryan!

  • @elinadiary9357
    @elinadiary9357 23 days ago

    Thank you sir

  • @_extremily
    @_extremily 23 days ago

    Thanks!!

  • @Darshakramani-kr2he
    @Darshakramani-kr2he 25 days ago

    Thanks for the video about the changes since the Databricks series, Bryan. It's a service to the community. I am very pleased; you are quite helpful to new people in the field, including me. Your explanation, as always, is to the point and covers people from all CS backgrounds. This is the only channel I subscribe to. I used to watch from incognito, but I had to come watch from my account, add to a playlist, and subscribe. My subscription does not validate anything, but I want to tell you we cherish your effort wholeheartedly.

    • @BryanCafferky
      @BryanCafferky 24 days ago

      Thank you so much! Glad it is helpful. Glad my videos are helpful. New people in the field and people crossing over from other related fields are very much in my thoughts when I do my videos.

  • @sarthakmane2977
    @sarthakmane2977 25 days ago

    5:54, better comedian than half the comedians in the world

  • @ayandapeter1681
    @ayandapeter1681 26 days ago

    Sir, I just want to say thank you so much, I've gone through many videos but was still confused, u made this crystal clear with all your conceptual approach.

    • @BryanCafferky
      @BryanCafferky 25 days ago

      Thank you for kind words. I'm so glad my videos are helping you. That's why I do them. I know this technology is not easy to learn so kudos to you for sticking with it.

  • @sebajo6643
    @sebajo6643 26 days ago

    Very useful video

  • @calvinkhor890
    @calvinkhor890 27 days ago

    Recently finished the course, thanks a lot Bryan. Out of curiosity, any thoughts on the recent acquisition of Tabular? Will people switch to Iceberg?

  • @Hamromerochannel
    @Hamromerochannel 27 days ago

    I tried to do Databricks Academy and I got lost. Thanks to this channel, I understand every nook and cranny. Thumbs up, Bryan!!

    • @BryanCafferky
      @BryanCafferky 25 days ago

      Thank you! Glad my videos are helping you.