The data set should contain at least 3,000 text documents. Perform an unsupervised analysis in RStudio using the following methods:
o Sentiment Analysis (Lexicon)
o Latent Semantic Analysis (LSA)
o Topic Models
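As a rough orientation, the three methods above can be sketched in R as follows. This is a minimal illustration, not a required approach: the four toy documents, the tiny positive/negative word lists, and the choice of k = 2 are all made-up assumptions, and a real analysis would use the full corpus, a published lexicon, and proper preprocessing. The topic-model step assumes the `tm` and `topicmodels` packages are installed and is skipped otherwise.

```r
# Toy corpus standing in for the real 3,000+ documents (hypothetical data).
docs <- c(
  d1 = "great product fast delivery great price",
  d2 = "terrible service slow delivery bad support",
  d3 = "good price good product happy customer",
  d4 = "bad product terrible quality slow refund"
)

# Tokenise: lower-case, then split on anything that is not a letter.
tokens <- lapply(tolower(docs), function(x) {
  w <- strsplit(x, "[^a-z]+")[[1]]
  w[nzchar(w)]
})

# 1. Lexicon sentiment: positive word count minus negative word count.
#    These word lists are a tiny made-up lexicon, not a published one.
positive <- c("great", "good", "happy", "fast")
negative <- c("terrible", "bad", "slow")
sentiment <- vapply(tokens, function(w) {
  sum(w %in% positive) - sum(w %in% negative)
}, numeric(1))

# 2. LSA: build a term-document count matrix, then take a truncated SVD.
vocab <- sort(unique(unlist(tokens)))
tdm <- vapply(tokens, function(w) {
  as.numeric(table(factor(w, levels = vocab)))
}, numeric(length(vocab)))
rownames(tdm) <- vocab

k <- 2                                        # number of latent dimensions (assumed)
s <- svd(tdm)
doc_coords <- s$v[, 1:k] %*% diag(s$d[1:k])   # documents in the k-dim latent space
rownames(doc_coords) <- names(docs)

# 3. Topic model (LDA), run only if 'tm' and 'topicmodels' are available.
if (requireNamespace("tm", quietly = TRUE) &&
    requireNamespace("topicmodels", quietly = TRUE)) {
  dtm <- tm::DocumentTermMatrix(tm::VCorpus(tm::VectorSource(docs)))
  fit <- topicmodels::LDA(dtm, k = 2, control = list(seed = 1))
  print(topicmodels::terms(fit, 3))           # top 3 terms per topic
}
```

Here `sentiment` holds one score per document, `doc_coords` gives each document's position in the reduced LSA space, and the LDA fit lists the most probable terms per topic. With only four short documents the topics are not meaningful; the point is the shape of the workflow.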
The deliverable will include 4 components:
1. A written report that summarizes your key findings in a clear, concise manner to a business
o Describe how you chose your sample (Random? Convenience? Criteria-based?).
o Describe your text variable.
o Describe relevant non-text variables and include descriptive statistics and exploratory
D. Analysis Results
o Present your analysis findings in-depth.
o Summarize why you chose your analysis method, the analysis itself (including any
modeling decisions made) and the results.
o Present internal or external validation measures and accompanying plots, as necessary.
E. Discussion & Conclusion
o Discuss the high-level findings of your analysis in words.
o Conclude your report by drawing real-world connections to how your analysis can be used, with a focus on business implications.
- The written report should guide the content of your presentation, but the presentation should focus on the high-level information and insights.
3. An R Script file (.R) with all (annotated/commented!) code used to conduct your analysis.
4. An RData file (.RData) containing all objects, functions, values and data used.
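One possible way to produce the .RData file at the end of the script is sketched below. The object names and the file name `analysis.RData` are placeholders chosen for illustration, and the file is written to a temporary directory here only so the example is safe to run anywhere.

```r
# Hypothetical objects standing in for the real analysis results.
reviews <- data.frame(id = 1:3, text = c("a", "b", "c"))
score_text <- function(x) nchar(x)

# Save every object in the current environment to a single .RData file.
rdata_path <- file.path(tempdir(), "analysis.RData")
save(list = ls(), file = rdata_path)

# Check: load the file into a fresh environment and list what came back.
restored <- new.env()
load(rdata_path, envir = restored)
ls(restored)
```

When run at the top level of a script, `save.image(file = rdata_path)` is a shorthand that saves the whole global workspace in the same way. Loading the file into a fresh environment is a quick check that data, functions, and values were all captured.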