Welcome to the School Text Analysis project! This project aims to analyze text from different schools using word cloud visualization. In this README, you'll find information on how to apply the word cloud analysis to a school's text and how to contribute to the project.
The purpose of this project is to perform text analysis on various schools' content and visualize the most frequent keywords using word clouds. By following the steps outlined in the provided notebook, you can extract insights from text data and generate informative visualizations.
- Choose a School: Start by selecting a school's webpage or any relevant textual content that you'd like to analyze.
- Data Collection: In the notebook provided, you'll find code that fetches and processes the text data from a given URL.
- Data Transformation: The text data is parsed and processed to remove unnecessary elements, ensuring that only relevant text is used for analysis.
- Keyword Extraction: The notebook uses the `nlp_rake` library to extract keywords from the text data.
- Visualizations: The extracted keywords are visualized using both bar plots and word clouds for a comprehensive view of the most frequent terms.
- Save the Results: After running the notebook, you can save the generated word cloud image in the "images" folder of this repository.
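
The steps above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the names `TextExtractor`, `html_to_text`, `fetch_html`, `extract_keywords`, and `save_word_cloud` are hypothetical, and the `Rake` parameter values are assumed defaults that you will likely want to tune. Third-party imports are kept inside the functions that need them, so the HTML-cleaning part runs with the standard library alone.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Data transformation: reduce an HTML document to plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_html(url):
    """Data collection: download the page (requires `requests`)."""
    import requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def extract_keywords(text, max_words=2, min_freq=3, min_chars=5):
    """Keyword extraction with nlp_rake; parameter values are assumptions."""
    import nlp_rake

    extractor = nlp_rake.Rake(
        max_words=max_words, min_freq=min_freq, min_chars=min_chars
    )
    return extractor.apply(text)  # list of (keyword, score) pairs


def save_word_cloud(keywords, path):
    """Render (keyword, score) pairs as a word cloud image (requires `wordcloud`)."""
    from wordcloud import WordCloud

    cloud = WordCloud(background_color="white", width=800, height=480)
    cloud.generate_from_frequencies(dict(keywords))
    cloud.to_file(path)
```

A typical run would chain the functions: pass the URL of the chosen school to `fetch_html`, feed the result through `html_to_text` and `extract_keywords`, then call `save_word_cloud` with a path inside the "images" folder. Note that `generate_from_frequencies` raises an error when the keyword list is empty, so short pages may need a lower `min_freq`.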
We welcome contributions from the community to enhance this project and explore text data from different schools. Here's how you can contribute:
- Run the Notebook: Clone this repository and run the provided notebook on a school of your choice.
- Save Word Cloud Images: After generating the word cloud visualization in the notebook, save the resulting image in the "images" folder.
- Update README: Add the name of the school you analyzed and the corresponding image in the "Contributions" section of this README.
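
To keep file names consistent across contributions, a small helper like the one below may be useful. It is only a sketch: the slug-based naming scheme, the `.png` extension, the entry format, and the function names `word_cloud_path` and `contribution_entry` are assumptions and not conventions this project prescribes.

```python
import re
from pathlib import Path


def word_cloud_path(school_name, images_dir="images"):
    """Build a file path such as images/<slug>.png from a school name.

    The slug keeps ASCII letters and digits only; every other run of
    characters is collapsed into a single hyphen.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", school_name.lower()).strip("-")
    return Path(images_dir) / f"{slug}.png"


def contribution_entry(school_name, images_dir="images"):
    """Format a hypothetical markdown line for the "Contributions" section."""
    path = word_cloud_path(school_name, images_dir).as_posix()
    return f"- {school_name}: ![{school_name} word cloud]({path})"
```

For example, `word_cloud_path("Harvard Business School")` yields `images/harvard-business-school.png`, which can be used as the output file name for the word cloud image.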
To provide an even more convenient way to perform text analysis and visualize word clouds, we've included a Streamlit app named `app.py`.
- Install Dependencies: Make sure the required dependencies are installed. `html.parser` is part of Python's standard library, so only the third-party packages need to be installed:

```bash
pip install streamlit requests nlp_rake wordcloud matplotlib
```

- Run the App: To start the app, execute the following command:

```bash
streamlit run app.py
```
Thank you to the contributors who have added their analyses to this project:
Feel free to add your school's analysis by following the steps mentioned above! Choose a school from the following list:
This project is licensed under the MIT License.
Happy text analysis and word cloud visualization!