Text and data mining (TDM) is an umbrella term for a broad array of methods, tools, and approaches to scholarship that apply computational methods to large bodies of (often unstructured) text.
TDM approaches are increasingly popular in an era of large-scale digitization efforts. The rapid growth of digital publishing platforms and social media has created massive corpora of textual data that are often easily accessible to anyone with internet access.
Researchers in diverse fields use TDM to gain insights that span long periods of time or vast collections, insights that would be nearly impossible to reach through traditional reading and examination alone. Quantifying textual elements and developing analytical tools to count and visualize terms of interest have opened up large new bodies of scholarship in fields such as the digital humanities.
This guide is intended to help you get started by identifying some basic tools and methods for approaching a research question using TDM.
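
As a small illustration of the counting approach described above, the following Python sketch tallies a few terms of interest across documents from different years. It is a minimal example under stated assumptions, not a recommended workflow: the three-document corpus, the term list, and the helper names `tokenize` and `count_terms` are all invented here for illustration, and a real project would typically involve far more text and more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: (year, text) pairs invented for illustration only.
corpus = [
    (1900, "The railway brought coal and steel to the city."),
    (1950, "The highway replaced the railway; steel remained."),
    (2000, "The internet connected the city; the railway faded."),
]

# Hypothetical terms of interest for this example.
terms_of_interest = {"railway", "steel", "internet"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_terms(docs, terms):
    """Return a dict mapping each year to a Counter of the terms of interest."""
    counts = {}
    for year, text in docs:
        tokens = tokenize(text)
        # Keep only the tokens we care about and add them to that year's tally.
        counts.setdefault(year, Counter()).update(t for t in tokens if t in terms)
    return counts


counts_by_year = count_terms(corpus, terms_of_interest)

# Print the per-year counts in chronological order.
for year in sorted(counts_by_year):
    print(year, dict(counts_by_year[year]))
```

The per-year counts produced this way could then be passed to a plotting library to visualize how often each term appears over time.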