Final Thoughts

Here is a final chance to express your ideas about text-mining and/or digital history tools. Feel free to make any overall comments about the tools we’ve looked at so far, or to make requests for tools that you think you could use but don’t yet exist.

Some possible questions to address, but don’t feel limited by these:

  • What should be the priorities for improving tools we’ve looked at?
  • What kind of technology tools would enhance your work (no matter how similar or different from those we looked at)?
  • To what extent have the tools provoked you to think about researching and communicating history in new ways?

8 Responses to “Final Thoughts”

  1. wilssearch Says:

    What should be the priorities for improving tools we’ve looked at?

    As a lot of academics are not necessarily geeks, explanations and instructions should use plain language not technospeak.

    Add voice to the video presentations, if not already there, and more in-depth tutorials.

    What kind of technology tools would enhance your work (no matter how similar or different from those we looked at)?

    I have been under the impression that what I need for my research and writing is a database, but I am not sure if a data mining application might not work for the comparisons I want to make. But some of them will make more work just because the material would need to be scanned before using the data mining software. If using several books that are not on-line, this would turn into a really major project.

    To what extent have the tools provoked you to think about researching and communicating history in new ways?

    We are progressing into a more complex digital age and historians need to progress along with the rest of the world. If we do not, the younger generations will ignore the whole subject. So it is extremely important for us to keep up with the times – digitally speaking.

    I found some of the projects we looked at as having nothing to do with the history that I am looking at, but they would be great for literary criticism and other fields as well as history.

    Using open source software is a great time and money savings prospect and should be encouraged at all times. For those of us who need to make some profit from our research, this is a great idea.

    It seems to me that these are more research assistant type programs rather than a way to communicate with others, but perhaps posting the end result of the software’s run on a web site or a blog might work. But I think doing so should wait until after publication. I have seen several books that numbered 700 to 900 pages with a CD in the back of the book. Perhaps that might be an option. The CD would contain the completed project with the text of the book as well.

  2. kkennin Says:

    I agree with the previous post that the tools need to be designed in such a way that they are accessible to non-specialists. As historians, many of us do not have the computing and technical skills to follow the language used by many of these sites.

    I would also echo the comment above that we have to keep up with the technology as best we can in order to continue to make history relevant and exciting for students.

    I use database software in my research, and I am always looking for new tools to aid my analysis. Many of the tools we have looked at would be great for making my research results more visually stimulating, and I can imagine using them in presentations to demonstrate patterns and analyze data.

  3. tim vermande Says:

    I like the way that these tools allow for visual communication. Most of my students are visual learners. I have been struggling to find ways to move away from the “lecture format” of writing a few terms on the board, as well as finding ways to involve students.

    That leads to: I agree with the others, these tools need to be designed for non-specialists. The computer is a tool for me. If a tool doesn’t work well, or if I can’t figure it out, I’m going to use another one, even if it’s not quite right (yesterday I pried something with a screwdriver, too).

    Further: students today often feel the same way. I have students who are great at the programs they use professionally: Maya, Photoshop, Flash, and so on. I want them to demonstrate their abilities and develop competence with their field’s tools in a variety of settings. It’s good for them to learn auxiliary programs, but they are not “geeks” either, and they aren’t going to use a program with arcane instructions or procedures.

  4. Andrew H. Lee Says:

    I heartily concur with the previous posts that straightforward and lay language be used in instructions, but I would also add the availability of printed guides. I find that too much is assumed by online instructions and what is intuitive for the creators/manual writers is not intuitive to users. I used to read the manuals that came with programs and I found it useful for several important reasons.

    One, I learned the vocabulary of the tool, its often idiosyncratic and obtuse ways of referring to how to do something, so I can then search the online help. This is the equivalent of learning another language, and, sorry Mac devotees, it is not natural.

    Two, by reading the printed manual I became aware (however vaguely) of additional possibilities that I might not need immediately, but that existed in the software and that I frequently made use of down the road.

    Finally, I became aware of what it did not do and could discourage casual users from trying to make one piece of software fulfill all their computing needs. I have archival colleagues who have used WordPerfect as a substitute for what they should have used: a relational database. All the work that went into making a WordPerfect file is lost time that could have been better spent learning a database program, because now that WordPerfect file is dead and the data has to be migrated into a database with all the resulting errors…

    GIS would be very useful for my work, some sort of GIS like we saw at Stanford’s Spatial History Project. So would text mining software that could handle multiple languages and word variants, maybe even scripts.

    But overall this was a fun exercise. I was impressed by Stanford’s Spatial History Project, amused by Wordle, and disappointed by IBM… plus ça change, plus c’est la même chose…

  5. tdmackie Says:

    I appreciate being asked to participate. The various new technologies opened my eyes, but I am still not sure of their uses. I feel as if I am the one who still has a young horse and a good wagon and am unsure if I need an auto. I used the Time Magazine application and went back to use the spatial program called “Shaping the West”. Most of these are very good for interpreting data for geography students, but I am not sure if they help in research.

    I work in a Lincoln museum, and a text mining project that I could use is one that traces times when Lincoln is called the emancipator or the martyr in newspapers, memorials and public speeches. Again, it is most useful to illustrate points and only of mild use to original research.

    In fact, of all those I could use, the text mining illustrated in the Time Magazine project would be most useful to my field.

  6. Cathy Hajo Says:

    I was a skeptic coming in about the value of visualization tools. I am comfortable creating and analyzing databases in order to drive my research, and prefer the tangible nature of the results I get with a database (x number of people in a given situation said x) rather than pretty word clouds where the most common answer is bigger. I think that in general those kinds of displays of data, where they aren’t linked to hard facts, don’t have much appeal to me. They feel more subjective, less factual than more boring charts and lists.

    That said, I did think that some of these tools were neat; I just didn’t find that any of them provided me with the kinds of quality results that I could see working into an article or paper. Some were extremely frustrating to use, as noted above; some didn’t seem to work so well, but if any one of them had given the glimmer of the ability to produce high quality results, I think that I would have stuck with them and tried to master them. Many Eyes was the one that came closest to doing something that I wanted with my data, but it didn’t work.

    Text mining is something that I tend to use more as a historian (by using it as a stronger text search) than like a literary scholar, who I think most of these tools are designed for. They are good for locating materials, but I don’t think that I would use them, unless they were far more customizable, to try to quickly analyze large chunks of text to automatically pull out meaning. I still think that reading the texts and thinking about them does a better job of that. Maybe it can help to focus reading on the most relevant sources. I’m not sure. Most of these just produced fairly pretty looking or complicated displays that I couldn’t really figure out how to use.

    The things that I am looking for are ways to take structured data, that I might gather about events, organizations, people, etc. and create interactive maps, or visualizations of the links between people.

    One of the things that I find the most useful about digital history is that it allows us to ask questions and get answers far more quickly than we could before. This means that we can try any number of different ways of looking at the data that we have gathered, or run any number of text searches or different database reports just to see what comes up. Before having digital tools to help us, you had to be pretty sure that you would come up with an interesting result before you would commit to poring over primary sources or creating concordances. So I think that the continued creation of these tools is really important, and think that they need to be explained, with some real-world examples, so that we can have a better sense whether the tool can do what we want.

    This is sounding more negative than I think I actually feel. It was interesting and sometimes fun to look at these tools, many of which I had heard of, but none of which I had ever tried, and while I didn’t feel that I could use any of them right now in my work, just the exposure to them is a good thing, and something that I will share with students and colleagues. Thanks!

  7. allison Says:

    I am in complete agreement with my fellow commentators that many of these tools, if they are going to be taken up by large numbers of humanities researchers, need to use clear non-technical language to both justify their usefulness and explain their functioning. The reality is that, even though I tend to be a techno enthusiast, committing to Zotero for example early on, I need to be convinced of a tool’s stability and enthusiastic about its usefulness before I will take it on – before I will spend the time to learn how to use it and input data.

    I’m assuming that the Spatial History project has been up and running longer and thus the projects are further along, but it is instructive to compare it to SEASR – not only are its claims presented in very different language (modest vs. revolutionary), but the Stanford project provides a range of real research projects one or more of which a humanities researcher might easily recognize as related to their own. Out of all of these tools, it is the one that excited me the most – I instantly wanted to pass it on to colleagues and friends.

    The one thing SEASR really has going for it, in my view, is that it can be used in a Zotero environment – Zotero’s recognition of the browser as today’s key research portal was what convinced me to adopt it. Likewise, new tools need not necessarily replace existing ones (although Zotero did), but should enable me to enhance and extend what I use already. Compatibility, in other words, is crucial.

    Evaluating these tools has made me think of doing research in different ways – it makes certain projects seem more possible in a lifetime. It has made me wish other corpuses already existed, or that I myself was more adept at manipulating corpuses – in fact it has reiterated to me the importance of standardizing my own research techniques (entering data into Zotero, for example) with a view to using one of these tools in the future or another which will surely come along.

    In terms of immediate change, however, the tools are more likely to affect how I communicate results and how I teach – Wordle, probably the simplest of the tools, will likely get the most play in the near future, perhaps as an assignment for a 2nd year Modern art survey I teach. And I’m considering putting more time into Many Eyes in order to come up with my own graphics for presentations (although the comments of other panelists about it not working make me nervous about committing the time). Perhaps surprisingly, art historians are not used to thinking of visualizing things beyond the art works and images that already serve as our primary sources, but we have a captive, visually astute audience, who might more immediately grasp a point (say about the professional networks of artists at a given historical moment) with a graphic. In any case, what strikes me as most important after looking at these tools is the need for ongoing and increased collaboration – between those with the linguistic and technical know-how and those with broad experience of their own fields, who can articulate the various needs of different kinds of researchers.

  8. Jeff Tenuth Says:

    After pondering this exercise for some time now (perhaps even too long) I must confess to a certain ambivalence regarding the examples we saw. While some were innovative, others were rather standard, given today’s supposed computer capabilities. In some cases, these tools were useful for demographic and even geographical studies. But I found none of them useful in the deeper, core historical questions. As mentioned before, none of these examples helped me to understand the how and why of history past the obvious demographic answers that could be found using other methodologies.

    Having said that, demographic studies in themselves can be very useful in answering some of history’s questions. For example, demographic studies could help me understand how peoples migrated from the Black Sea region in prehistoric times and how that is reflected in extant language groupings today. But computer-based methods of data analysis or data mining cannot at this time tell me why such migrations took place. They can only tell me how. But “how” is some of the battle, so maybe some progress is better than no progress. Looking at all seven tools provided, I found the Wordle, Many Eyes, and HyperPo examples not very useful. The Time Magazine, Shaping the West, and SEASR examples had potential, but I would like to see them more fully developed past the “demographic” stage. The “visualizing the Origin of Species” example was far too complex and too far reaching. It was essentially useless.

    The other concern I have is that the creation of data mining methodologies may prove to be an end in itself. This is the great danger of grant-driven experimentation: that the acquisition of the grant, or rather the funding, becomes the end, rather than just another tool. If that becomes the case, then there will be little actual progress in the future. And I do believe that these data mining tools can be made to provide more than just raw data clumping and superficial analysis.

    Finally, a comment about the nature of history and its relationship to computer-based analysis. History is an objective analysis of the past, based on sources that are at least partly verifiable. Computers should be able to enhance this. At the same time, all written historical analysis is at least partly subjective because it is composed by us and we are subject to the biases of our own times. There are many histories and many ways to study history. But crossing all of history and all the ways to study history is knowing the ideas and motivations of past people, cultures and nations. Computers and data mining methods cannot yet help us understand these ideas and motivations; they cannot tell us the why of history. And until they do, they are but tools, relegated to amassing and analyzing large amounts of data as a demographer would. And if they do go beyond that capability, then we will need to ask other questions, perhaps more important, about our future rather than our past.

Enhancing Historical Research With Text-Mining and Analysis Tools