SEASR

The Software Environment for the Advancement of Scholarly Research (SEASR), funded by the Andrew W. Mellon Foundation, provides a research and development environment capable of powering leading-edge digital humanities initiatives.

Although not dedicated exclusively to text-mining, the project produces several tools related to it. We would like your feedback about the SEASR project to better understand how digital humanities projects can be more useful and become more engaged with communities of practitioners. Some questions to think about:

Did you get a good sense of what this project is about and what it’s doing? Did it provide insights into the potential of text-mining? What do you think of the tools and demos on the site? Do you have any interest in the broad frameworks that SEASR is developing? How could this project or website be more broadly useful or generate more interest?

11 Responses to “SEASR”

  1. Thomas Mackie Says:

    I attempted to read through this feature page. I feel very dumb as much was beyond me. It felt as if I was testing for a foreign language test without enough study. The entire page appeared to be built for those with a very strong background in technology and information storage. I cannot see many humanities scholars following this. Perhaps I am slow, but the entire SEASR project appears to be a flexible indexing system to find selected words in a vast number of documents.

  2. Jan Kunnas Says:

    A good starting point could be the: “movie that highlights some of the projects and groups using the SEASR technology. Check out http://repository.seasr.org/Movies/SEASR-Nov-2009.m4v for more details.”

  3. tim vermande Says:

    While agreeing that much of this is about technology rather than what it can do for me, there are some good points if you can wade through them. At the end, though, it still seems to be a new form of a “question looking for an answer.”
    To me, the developers are vague about what it can do–is it an indexing system for my material, or can it share with others? The presentations make a lot of noise about how materials are currently stored in a variety of incompatible formats, but how is this not just one more such format?
    I could do with less jargon and more about how to use it. I am suspicious of the flow charts, perhaps that’s just me, but I thought the humanities are about matters that resist measure-and-manage control.
    I didn’t find anything about access. Will it be restricted to schools that pay some fee? The world of research doesn’t end with big universities. Can an advanced student in a high school, or an independent scholar use this?

  4. Jeff Tenuth Says:

    As with other data mining tools I’ve seen so far, this too seems to have potential. Unfortunately, I don’t see its use beyond demographic analysis within the historical realm. Whereas it may help determine frequency or clustering, it doesn’t tell me how or why. As I have indicated with other mining tools, this kind of tool can only take me so far, then I must consult other sources and methods to know how and why something happened in the past. If I were a linguistic analyst or even a social scientist of some kind, I may find it more useful. But as an historian, this represents a visualization of data that is an early step in understanding, but it has limited use beyond that. Still, I suggest that this or any of the tools we have seen can be useful and development should continue. At some point in the future (hopefully not too soon or before we are ready), this kind of tool can be an element in incipient artificial intelligence.

  5. wilssearch Says:

    I had trouble following the demonstration but found the PowerPoint more understandable. However, I was unable to see any mention of open source, so could not tell if it was free to use or not. Both Firefox and Zotero are open source, so that means no fees.

    When I start really doing research for publication, I think something like this might be helpful although I am not sure how yet.

  6. kkennin Says:

    This looks like a potentially interesting tool, but I must admit much of the explanation was too technical for me. The FAQs section was a little better at explaining in simple terms, but they need to set up the website so it can be used by people who can’t follow all of the technical language. I had trouble getting the demos to work, too, but the text mining tools seem potentially useful for analyzing documents and determining patterns of word usage. I also thought the timeline tool could potentially be useful for historians.

  7. allison Says:

    Like the other commentators, I found the SEASR website daunting. Moreover, it seemed to be at once trying to empower me, the humanities scholar, at the same time as it disempowered me, not only through its overly technical language, but through the promise of helping me to “uncover hidden information and connections.” Perhaps as a scholar of the nineteenth century I have a tendency to be suspicious of such “discovery narratives,” of the notion of “hidden information,” both of which recall a kind of art history in which finding the “true meaning” by parsing “disguised symbolism” is central.

    That off my chest, I am always enthusiastic about something that might save me time and help me work more effectively with collaborators, so I did try to figure it out in a rudimentary way (with the help of my computational linguist partner). I can imagine using the summary functions through Zotero for citations for which I don’t have abstracts – especially if it works across different language groups (I didn’t see much information about how SEASR functions with different languages). I can also envision using it for more complex research questions but in order to really dive in I would have to be convinced that the time it would take to enter the relevant corpuses would be worthwhile – I would have to be convinced that it worked well across different languages and that it reliably approximated the human reader. My research involves the cultural use of terms like experiment, experimental, etc., especially in France, but also in England, Germany and the Scandinavian countries. One particular challenge of this research is that in French the word “expérience” can be translated as experience or experiment – it is only the context which tells you – and not always precisely – whether the term is being used in a scientific manner. I have been working to trace when “experimental art” in the nineteenth century actually had something to do with experimental method. So, I could see working with a linguist to develop a number of KWIC criteria to apply to a corpus and then running a SEASR analysis…but it strikes me that developing the criteria, rather than pumping out the analysis might actually be the most useful part of the project.

    In other words, I’m really on the fence here – I support projects like this as pure computational linguistic NLP research, but I feel I can only glimpse the usefulness in applying this to my own work. I am skeptical that it would reveal “hidden information” – but if it convinced me that I could save time – and that it was reliable and worked across different languages – then I’d be all ears.

  8. Carrie Tallichet Says:

    My initial impression of this project is that it makes a good pitch. SEASR enables collaboration, facilitates access, and analyzes a variety of digital materials, but I’m not certain how it does all of this. I also found the website difficult to navigate, and many of my questions were left unanswered. The presentation did help clarify the capabilities of SEASR and made its application to humanities research more apparent. It seems the immediate value of SEASR is in its ability to quickly visualize the information already in our collections. I find this particularly useful in the initial stages of research by helping establish connections and relationships between sources.

  9. Shawn Barron Says:

    Well, as I wrote for our class about the future of technology creating artificial intelligence, it seems like SEASR beat me to it. I think that their project, whose goal seems to be to combine as many data miners as possible to create a super data miner, is pretty cool. My critique of the site is that to use it you seem to need to possess a good deal of programming knowledge. This probably exists because data miners are still in what could best be called a hobbyist phase. The programs used by the site were developed by historians to satisfy their own curiosity and were shared in the hopes that they might help others. As a result the wording and interface of the programming is not designed to appeal to a mass consumer base (which is probably what I lean closer to). However this process of sharing programs will probably lead to improvements in usability over time.

  10. Ashley Anttila Says:

    This tool was not meant for amateurs. I could not find a tutorial on how to get started and it seemed the only information was provided as text. Although there were images, it would have been nice to see a video tutorial of how the site works. I feel like I don’t know what the software actually looks like. Despite this confusion, I did read through some of the text and this tool appears to be really flexible. You can do a lot with your dataset, like Google mapping and creating a timeline. As mentioned in our readings, these visualizations are carried out through text mining. What I did not get a clear sense of is where the data that is being mined comes from. Is it uploaded by the historian? Is it made from Google search parameters? Perhaps I just missed that information but I think, based on our blog discussions, where we’re getting the data from matters. It affects the quality of our research.

    However, what I find most useful about these options is the time it saves for the historian. It shortens the length of time between collection of data and visualization of it. I think that’s hugely beneficial. You’re more likely to try out a dataset in unconventional layouts if you can input the information into a program that does it for you.

  11. Nabeel Siddiqui Says:

    Unfortunately, I had similar experiences with SEASR as others. I found the website a little confusing and the goals of the project to be often counterproductive. I understand that SEASR is trying to create a place where scholars can share research and scholarship despite proprietary formats. Yet, it does not really get to the heart of the problem. The problem is not that there is no way for proprietary formats to be shared. Instead, it is the fact that proprietary formats exist in the first place. SEASR does not address this issue. In fact, it seems to exacerbate the issue. By allowing people to collaborate in these formats, SEASR essentially allows an individual to find a reason to stick to these formats. Also, I have found that other more powerful tools have now been created to allow for collaboration.

Enhancing Historical Research With Text-Mining and Analysis Tools