PySpark
Master PySpark in 18 days with structured lessons, hands-on tasks, and an end-to-end project covering essential concepts and ML model training.
python, boilerplate, data-science, big-data, hadoop, etl, reference, scikit-learn, data-engineering, cheatsheet, spark, sql, spark-sql
Overview
This is a structured 18-day learning roadmap for PySpark and big data processing. It covers DataFrames, SQL, joins, performance tuning, and machine learning through hands-on coding tasks. The core stack is PySpark, Python 3, and Java, and the material is aimed at learners who prefer practical exercises on real datasets.
Features
structured-learning-path, hands-on-tasks, end-to-end-project, code-examples, datasets, daily-exercises
Feature Flags
teamsOrgs, analytics, jobsQueue, maps, formsValidation, documentation, tutorials, codeExamples, projectBased
Recommended Use Cases
learning-pyspark, big-data-processing, machine-learning-with-spark, data-engineering, etl-development
Frontend
None
Backend
pyspark, python
Auth Providers
None
Deployment Targets
local, hadoop
Payment Providers
None
Quick Facts
Stack
Language
python
Database
spark
Data Layer
Databases
spark
UI Stack
Developer Experience
Docker
No
Tests
No
Quickstart
Yes
env.example
No
Pricing
Classification
free
Selected
✓
Notes
Open-source educational resource