MovieLens_data_analysis_using_pig_pyspark

This project uses the MovieLens 20M dataset (ml-20m). MovieLens, a movie recommendation service, provides this dataset, which describes 5-star rating and free-text tagging activity. It contains 20,000,263 ratings and 465,564 tag applications across 27,278 movies. The data was produced by 138,493 users between January 9, 1995, and March 31, 2015, and the dataset was generated on October 17, 2016. Users were selected at random for inclusion, and every selected user had rated at least 20 movies. No demographic information is included: each user is represented only by an id. The genres are Action, Adventure, Animation, Children's, Comedy, Crime, Documentary, Drama, Fantasy, Film-Noir, Horror, Musical, Mystery, Romance, Sci-Fi, Thriller, War, Western, and (no genres listed).

Analysing the MovieLens dataset lets users find the most popular films by number of users and reviews, and gain insight into films from their ratings. Users can also look up the average rating of films by factors such as genre, occupation, and age group, and get to know films better by seeing the most popular genre for each age group. Filmmakers, in turn, may be interested in yearly trends in film production, which can help them make critical decisions.
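
As a rough illustration of two of these analyses (the most-rated film and the average rating per genre), here is a minimal sketch in plain Python rather than Pig or PySpark, so it runs without a cluster. It is not the project's actual code. It assumes the ml-20m column layout (`movieId,title,genres` with pipe-separated genres, and `userId,movieId,rating,timestamp`), and the sample rows and the helper names `load_genres` and `summarize` are made up for the example.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up sample in the assumed ml-20m column layout (not real data).
MOVIES_CSV = """movieId,title,genres
1,Movie A (1995),Adventure|Comedy
2,Movie B (1995),Comedy
3,Movie C (1996),Drama
"""

RATINGS_CSV = """userId,movieId,rating,timestamp
1,1,4.0,0
2,1,5.0,0
1,2,3.0,0
3,3,2.0,0
"""


def load_genres(movies_file):
    """Map each movieId to its list of genres (pipe-separated in the file)."""
    return {
        row["movieId"]: row["genres"].split("|")
        for row in csv.DictReader(movies_file)
    }


def summarize(ratings_file, genres_by_movie):
    """Return (number of ratings per movie, average rating per genre)."""
    counts = defaultdict(int)
    genre_sum = defaultdict(float)
    genre_n = defaultdict(int)
    for row in csv.DictReader(ratings_file):
        rating = float(row["rating"])
        counts[row["movieId"]] += 1
        # A movie with several genres contributes its rating to each of them.
        for genre in genres_by_movie.get(row["movieId"], []):
            genre_sum[genre] += rating
            genre_n[genre] += 1
    averages = {g: genre_sum[g] / genre_n[g] for g in genre_sum}
    return dict(counts), averages


genres_by_movie = load_genres(io.StringIO(MOVIES_CSV))
rating_counts, avg_by_genre = summarize(io.StringIO(RATINGS_CSV), genres_by_movie)

# "Most popular" here means the movie with the most ratings.
most_rated = max(rating_counts, key=rating_counts.get)
print(most_rated, avg_by_genre)
```

To try this on real data, replace the `io.StringIO` objects with opened `movies.csv` and `ratings.csv` files; at the scale of 20 million ratings, a distributed grouped aggregation in Pig or PySpark is the more practical choice.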

To that end, we have conducted a number of studies on the MovieLens dataset in order to provide consumers and experts with accurate movie information.