HIVE_Big_Data

MovieLens Genre Analytics Using Hive on Hadoop

Objective

Analyze the MovieLens dataset with Apache Hive on Hadoop, running in the Cloudera QuickStart VM, to identify the most popular genres. The results are meant as input for streaming platforms such as Netflix or Prime.

Tech Stack

Apache Hive
Hadoop HDFS
Cloudera QuickStart VM
Linux Shell
Excel / Matplotlib for visualization

Key Features

Genre-wise popularity using Hive explode and split
Data stored and queried in HDFS
Business insights for recommendation engines
Visualization charts
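
To make the split-and-explode step concrete, here is a minimal Python sketch of the same logic, useful for sanity-checking Hive output on a small sample. It assumes the standard MovieLens movies.csv layout (movieId, title, genres, with genres separated by `|`) and counts movies per genre; the project's own queries may define popularity differently, for example by ratings. The function name `genre_counts` and the inline sample are illustrative only.

```python
import csv
import io
from collections import Counter


def genre_counts(rows):
    """Count movies per genre from (movieId, title, genres) rows."""
    counts = Counter()
    for _movie_id, _title, genres in rows:
        # genres.split("|") plays the role of Hive's split(genres, '\\|');
        # counting each piece separately plays the role of explode + GROUP BY.
        counts.update(genres.split("|"))
    return counts


# Tiny inline sample in the movies.csv layout (illustrative, not project results).
SAMPLE = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,Jumanji (1995),Adventure|Children|Fantasy
3,Grumpier Old Men (1995),Comedy|Romance
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row
counts = genre_counts(reader)

# Print genres from most to least frequent.
for genre, n in counts.most_common():
    print(genre, n)
```

Using `csv.reader` rather than a plain `split(",")` matters on the full dataset, because some titles contain commas inside quotes.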

Project Structure

dataset/: Input CSV files
Hive_Queries/: All Hive scripts
visualisations/: Charts generated from the query output, plus captured Hive CLI output as proof of the runs

Report

See Movie analytics.pptx for the full write-up.

How to Run

Set up the Cloudera QuickStart VM
Load movies.csv into HDFS
Create an external Hive table over the HDFS data
Run the queries from Hive_Queries/
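
The load, create-table, and query steps above can be sketched as follows. This small Python helper only builds the shell commands and HiveQL text; it does not run them. The HDFS directory, table name, column names, and the choice of OpenCSVSerde are assumptions for illustration, not necessarily what the scripts in this repository use.

```python
# Hypothetical names; adjust them to your environment.
HDFS_DIR = "/user/cloudera/movielens/movies"
TABLE = "movies"

# External table over the CSV directory. OpenCSVSerde handles quoted titles
# that contain commas; it exposes every column as STRING.
CREATE_TABLE_HQL = """CREATE EXTERNAL TABLE IF NOT EXISTS {table} (
  movieId STRING,
  title   STRING,
  genres  STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '{hdfs_dir}'
TBLPROPERTIES ('skip.header.line.count'='1');"""

# Raw string so the backslashes reach Hive unchanged: '\\|' is a literal pipe.
GENRE_COUNT_HQL = r"""SELECT genre, COUNT(*) AS movie_count
FROM {table}
LATERAL VIEW explode(split(genres, '\\|')) g AS genre
GROUP BY genre
ORDER BY movie_count DESC;"""


def hdfs_load_commands(local_csv, hdfs_dir=HDFS_DIR):
    """Shell commands that copy the CSV into its own HDFS directory."""
    return [
        f"hdfs dfs -mkdir -p {hdfs_dir}",
        f"hdfs dfs -put {local_csv} {hdfs_dir}/",
    ]


def create_table_hql(table=TABLE, hdfs_dir=HDFS_DIR):
    """HiveQL for the external table."""
    return CREATE_TABLE_HQL.format(table=table, hdfs_dir=hdfs_dir)


def genre_count_hql(table=TABLE):
    """HiveQL that counts movies per genre."""
    return GENRE_COUNT_HQL.format(table=table)


if __name__ == "__main__":
    for command in hdfs_load_commands("dataset/movies.csv"):
        print(command)
    print(create_table_hql())
    print(genre_count_hql())
```

The printed HiveQL can be saved to a file and run with `hive -f <file>`. Because the table is external, dropping it in Hive leaves the CSV in HDFS untouched.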
