AWS Certified Big Data Slides

Exam Overview

  • Course: AWS Certified Data Analytics Specialty (DAS-C01)

  • Objective: Prepare for the AWS Certified Data Analytics Specialty exam (DAS-C01), formerly the Big Data Specialty exam (BDS-C00).

  • Recommended Background:

    • Prior knowledge of AWS services (EC2, networking).

    • Familiarity with data and analytics concepts.

  • Duration: A long but interesting course; take your time working through it.

Instructors

  • Stephane Maarek

    • IT Consultant & AWS Big Data Architect.

    • Veteran instructor; 94% certification score.

    • Links: GitHub, LinkedIn, Medium, Twitter.

  • Frank Kane

    • Former Amazon Sr. Software Engineer and Manager.

    • Focus on Machine Learning and Big Data.

    • Owner of Sundog Education.

    • Links: LinkedIn, Twitter, Facebook.

Course Coverage

  • AWS Big Data Services:

    • Amazon Kinesis, AWS Lambda, AWS Glue, Amazon EMR, Amazon ML, SageMaker, etc.

    • Services categorized into:

      • Collection: Kinesis, AWS IoT, SQS.

      • Storage: S3, DynamoDB.

      • Processing: AWS Lambda, Glue, EMR.

      • Analysis: Athena, Redshift, QuickSight.

      • Security: AWS KMS, CloudHSM.

Case Study

  • Case Study Overview: cadabra.com

  • Requirements:

    • Order History App: Client app, server logs.

    • Product Recommendations: Server logs.

    • Transaction Rate Alarm: Server logs.

    • Near Real-Time Log Analysis: Amazon OpenSearch, Kinesis Data Firehose.

    • Data Warehousing & Visualization: (Managed Serverless).

AWS Data Collection Methods

  • Real-Time:

    • Tools: Kinesis Data Streams, SQS.

  • Near Real-Time:

    • Tools: Kinesis Data Firehose, DMS.

  • Batch Analysis:

    • Tools: Snowball, Data Pipeline.

Kinesis Data Streams

  • Architecture: Consists of shards, producers, consumers.

  • Data Ingestion Constraints:

    • Retention (1 to 365 days).

    • Record limits: up to 1 MB per record; writes of up to 1 MB/sec or 1,000 records/sec per shard.

  • Security:

    • Control access using IAM.

    • Encryption in flight (HTTPS endpoints) and at rest (KMS).
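
The per-shard limits above are what drive stream sizing. The following is a minimal sketch of that arithmetic, not an official AWS formula. The function name `shards_needed` is hypothetical, and it assumes a write limit of 1 MB/sec per shard alongside the 1,000 records/sec limit, with 1 MB taken as 1024 * 1024 bytes.

```python
import math

# Per-shard write limits for Kinesis Data Streams.
MAX_RECORDS_PER_SEC_PER_SHARD = 1_000
MAX_BYTES_PER_SEC_PER_SHARD = 1024 * 1024  # assumption: 1 MB/sec counted as 1 MiB
MAX_RECORD_BYTES = 1024 * 1024             # assumption: 1 MB/record counted as 1 MiB


def shards_needed(records_per_sec: int, avg_record_bytes: int) -> int:
    """Estimate the shard count needed to absorb a steady write load."""
    if avg_record_bytes > MAX_RECORD_BYTES:
        raise ValueError("record exceeds the 1 MB per-record limit")
    # Shards required by the record-count limit.
    by_count = math.ceil(records_per_sec / MAX_RECORDS_PER_SEC_PER_SHARD)
    # Shards required by the byte-throughput limit.
    by_bytes = math.ceil(
        records_per_sec * avg_record_bytes / MAX_BYTES_PER_SEC_PER_SHARD
    )
    # Whichever limit is hit first decides; a stream has at least one shard.
    return max(1, by_count, by_bytes)
```

For example, 3,000 small records per second is bound by the record-count limit, while 100 records per second of 50 KB each is bound by the byte limit. The sketch ignores read-side limits and traffic spikes, so a real estimate would likely add headroom.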

AWS Lambda

  • Functionality: Serverless computing and event-driven architecture.

  • Common Integrations:

    • Kinesis, S3, DynamoDB.

  • Cost Model: Pay for number of requests and duration of execution.

  • Supported Languages: Node.js, Python, Java, C#, Go, and others (e.g. Ruby, PowerShell).
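
The cost model above (requests plus execution duration) can be sketched as a small calculation. This is an illustrative estimate only: the function name `lambda_monthly_cost` is hypothetical, the default rates are assumed example prices rather than figures from this course, and the free tier is ignored. Check current AWS pricing for your region before relying on any number.

```python
def lambda_monthly_cost(
    requests: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_million_requests: float = 0.20,   # assumed example rate (USD)
    price_per_gb_second: float = 0.0000166667,  # assumed example rate (USD)
) -> float:
    """Estimate a monthly Lambda bill from request count and duration."""
    # Duration is billed in GB-seconds: seconds run times memory in GB.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = requests / 1_000_000 * price_per_million_requests
    duration_cost = gb_seconds * price_per_gb_second
    return request_cost + duration_cost
```

Under these assumed rates, one million invocations at 1 second each with 1,024 MB of memory come to roughly 0.20 USD for requests plus about 16.67 USD for duration, which shows that duration and memory size usually dominate the bill.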

AWS Glue

  • Serverless ETL Service: Effective for data cleaning and transformation.

  • Crawler Feature: Automatic schema discovery from data sources.

  • Key Transformations:

    • Machine learning transformations (FindMatches).

    • Supports a variety of data formats.

Amazon Redshift

  • Service Type: Fully-managed, petabyte-scale data warehouse.

  • Performance: Claimed to be up to 10x faster than other data warehouses, via massively parallel processing (MPP) and columnar storage.

  • Key Features:

    • Scaling, backups, and concurrency management.

    • Storage options: RA3 nodes offering independent scaling of compute and storage.
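
To make the MPP and columnar storage ideas above more concrete, here is a toy illustration in plain Python. It is not how Redshift is implemented. The helper names (`to_columns`, `parallel_sum`) and the sample `orders` data are hypothetical. It only shows why an aggregate over one column can skip the other columns, and how partial results from several slices are combined.

```python
from typing import Dict, List


def to_columns(rows: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def parallel_sum(values: list, slices: int) -> float:
    """Sum a column MPP-style: each slice sums its share, then the partials are combined."""
    # Each "slice" takes every slices-th value; here they run sequentially.
    partials = [sum(values[i::slices]) for i in range(slices)]
    return sum(partials)


# Hypothetical sample data, stored row by row.
orders = [
    {"order_id": 1, "customer": "alice", "amount": 20.0},
    {"order_id": 2, "customer": "bob", "amount": 35.5},
    {"order_id": 3, "customer": "alice", "amount": 4.5},
]

columns = to_columns(orders)
# Only the "amount" column is read; order_id and customer are never touched.
total = parallel_sum(columns["amount"], slices=2)
```

In a real warehouse the slices would run on separate nodes and the column data would be compressed on disk; the sketch keeps only the shape of the idea.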

Amazon QuickSight

  • Functionality: Fast, serverless business analytics service.

  • Common Uses: Dashboards, ad-hoc analysis, and visualizations.

  • Security: Multi-factor authentication, IAM policies, row-level security.

Exam Preparation Tips

  • Timing: 65 questions in 170 minutes (~2.6 minutes per question).

  • Practice: Take practice exams, familiarize yourself with the AWS whitepapers, and use the AWS training resources.

  • Day of Exam:

    • Bring two forms of ID; no notes or electronic devices allowed.

    • Arrive early and prepared to reduce stress.

Additional Resources

  • Review AWS Big Data White Paper and specific service documentation for exam preparation.

  • Join AWS forums and communities for tips and advice from others who have taken the exam.