Differential Language Analysis ToolKit
DLATK is an end-to-end human text analysis package, specifically suited for social media and social scientific applications. It is written in Python 3 and developed by the World Well-Being Project at the University of Pennsylvania.
It contains:
feature extraction
part-of-speech tagging
correlation
prediction and classification
mediation
dimensionality reduction and clustering
wordcloud visualization

DLATK can utilize:
Mallet for creating LDA topics
Stanford Parser
CMU's TweetNLP
pandas DataFrame output

Installation
DLATK is available via conda, pip, or GitHub.
conda install -c wwbp dlatk
pip install dlatk
python setup.py install

Dependencies
mysqlclient
NumPy
scikit-learn
SciPy
statsmodels

See the full installation instructions for recommended and optional dependencies.
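
To confirm that the required dependencies are present before running DLATK, a quick check along the following lines may help. This is a minimal sketch that uses only the Python standard library and does not call DLATK itself. The import names in the mapping (for example, `MySQLdb` for mysqlclient and `sklearn` for scikit-learn) are not given in this README and are filled in here as assumptions based on how those packages are normally imported; `missing_dependencies` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Package name as listed under Dependencies -> module name used at import time.
# The module names are assumptions; this README lists only the package names.
REQUIRED = {
    "mysqlclient": "MySQLdb",
    "NumPy": "numpy",
    "scikit-learn": "sklearn",
    "SciPy": "scipy",
    "statsmodels": "statsmodels",
}


def missing_dependencies(required=REQUIRED):
    """Return the package names whose import module cannot be located."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing required dependencies:", ", ".join(missing))
    else:
        print("All required dependencies were found.")
```

The check only looks up whether each module can be located; it does not import the packages or verify their versions.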

Documentation
The documentation for the latest release is at dlatk.wwbp.org.

License
Licensed under the GNU General Public License v3 (GPLv3).

Background
Developed by the World Well-Being Project based out of the University of Pennsylvania.