BigData is a data processing project for handling large-scale datasets with PySpark, Apache Iceberg, and snapshot management. The repository contains ETL pipelines, data lake optimizations, and snapshot-based versioning for scalable, reliable big data analytics.

siam29/BigData

8 Commits

BigData
