Final Thoughts

Here is a final chance to express your ideas about text-mining and/or digital history tools. Feel free to make any overall comments about the tools we’ve looked at so far, or to request tools that you think you could use but that don’t yet exist.

Some possible questions to address, but don’t feel limited by these:

  • What should be the priorities for improving the tools we’ve looked at?
  • What kinds of technology tools would enhance your work (no matter how similar to or different from those we looked at)?
  • To what extent have the tools provoked you to think about researching and communicating history in new ways?


The Software Environment for the Advancement of Scholarly Research (SEASR), funded by the Andrew W. Mellon Foundation, provides a research and development environment capable of powering leading-edge digital humanities initiatives.

Although not dedicated exclusively to text-mining, the project produces several tools related to it. We would like your feedback about the SEASR project to better understand how digital humanities projects can be more useful and become more engaged with communities of practitioners. Some questions to think about:

Did you get a good sense of what this project is about and what it’s doing? Did it provide insights into the potential of text-mining? What do you think of the tools and demos on the site? Do you have any interest in the broad frameworks that SEASR is developing? How could this project or website be more broadly useful or generate more interest?

Voyeur is a collaborative project by Stéfan Sinclair & Geoffrey Rockwell to think through some foundations of contemporary text analysis, including issues related to the electronic texts used, the tools and methodologies available, and the various forms that the expression of results from text analysis can take.

Your task is to learn a bit more about Voyeur and play around with it. How could you use this? How could the tool/instructions be improved?

Visualizing the Origin of Species

This week’s resource up for discussion: The Preservation of Favoured Traces

One of many visualization projects of Ben Fry, this page shows you how the Origin of Species changed over time, allowing for both broad and detailed views of the text. Though perhaps not suitable for rigorous textual analysis, could this be of real value for getting a sense of a large corpus? Is it just eye candy?

Also, you might check out some other (not always historical) examples of visualization at These are examples of using Processing, an open-source programming language co-created by Fry that is often used to visualize data. Did you get any ideas about how you might represent some of your own ideas and concepts?

Many Eyes

This week’s resource up for discussion is a tool that transforms plain data into eye-catching visuals.

On the website, users upload a standardized set of data (such as an Excel spreadsheet) and can then inspect it from a variety of perspectives. You can browse the gallery of visuals created by users, but the real fun is uploading your own data (it takes about 15 minutes to go through the process) and seeing it in new ways.

Let us know what you did during your visit to Many Eyes. Is their website easy to understand and use? What did you think of existing visualizations? Did you upload your own data? Did you get any new ideas? What is lacking from this kind of visualization tool?

Shaping the West

This week’s discussion focuses on visualizing change over time, as illustrated with railroad company board membership. Though you cannot plug in your own data here as with Wordle, you can adjust what the tool shows and how it shows it.

The questions are much the same as last week: can you imagine using a visualization tool like this for your own sources/data? Could this be useful for research? Or just as a way of presenting data? What kind of features would make such a tool more generally useful?


This week’s discussion focuses on visualizing texts. To begin, visit Wordle and create your own visualization of plain text (you can paste it onto their site) or a webpage (you can paste in a URL).

Although Wordle itself is probably not a suitable research tool, can you imagine using visualization tools like this for your own sources? What kind of features would make such a tool more useful?

Time Magazine, 1923-2006

Mark Davies of Brigham Young University has taken the 100 million words published in Time Magazine from 1923 to 2006 and created a site that allows any researcher to explore trends over time (so to speak). Although oriented toward those interested in linguistic evolution, virtually any topic (such as the rise and fall of coverage of “race relations,” shown here) can be examined. Results of searches can be numerical or charted (as shown above). In addition, users of the site can click through to see specific Time Magazine articles that relate to any search result.

If you work in twentieth-century history, is this a resource that you can imagine using? If you do not work in the twentieth century, can you imagine taking advantage of such a tool based on your own source material? If so, how? What would make such a tool more attractive? (You may wish to look at other digital research collections Davies has created, such as the Corpus del Español or the Corpus do Português.)

Please feel free to comment on the design of this site as well, and on any particular feature (or missing feature) that caught your eye.

Enhancing Historical Research With Text-Mining and Analysis Tools