
thinking about ‘sentiment analysis’

I just got off the phone this morning with a researcher who is interested in running sentiment analysis on a corpus of fiction, specifically by having some native speakers of Japanese (I think) tag adjectives as positive or negative and then looking at the overall shape of the corpus with those tags in mind.

A while back, I wrote a paper about geoparsing and sentiment analysis for a class, describing a project I worked on. Talking to this researcher made me think back to this project – which I’m actually currently trying to rewrite in Python and then make work on some Japanese, rather than Victorian English, texts – and my own definition of sentiment analysis for humanistic inquiry.*

How is my definition of sentiment analysis different? How about I start with the methodology? What I did was look for salient adjectives, which I found by identifying the most “salient” nouns (not necessarily the most frequent, but I need to refine my heuristics) and then the adjectives that appeared next to them. I also used WordNet to look for words related to these adjectives and nouns, expanding my search beyond those specific words to ones with similar meaning that I might have missed. In particular, I looked at hypernyms (broader terms) and synonyms of nouns, and synonyms of adjectives.
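To make the adjacency step concrete, here is a minimal Python sketch of the idea rather than the code from the original project. It assumes the text has already been tokenized and part-of-speech tagged, and it uses a small hand-written dictionary (`RELATED`, a hypothetical stand-in) where the real project would query WordNet for synonyms and hypernyms. The function names, tag labels, and sample sentence are all invented for illustration.

```python
from collections import Counter

# Hypothetical stand-in for WordNet lookups: each noun maps to a few
# hand-picked related terms (synonyms or broader terms).
RELATED = {
    "train": {"railway", "carriage"},
    "landscape": {"scenery", "country"},
}


def expand_terms(nouns, related=RELATED):
    """Return the seed nouns plus any related terms listed for them."""
    expanded = set(nouns)
    for noun in nouns:
        expanded |= related.get(noun, set())
    return expanded


def adjacent_adjectives(tagged_tokens, target_nouns):
    """Count (adjective, noun) pairs where the adjective sits directly
    before or after one of the target nouns."""
    counts = Counter()
    for i, (word, tag) in enumerate(tagged_tokens):
        if tag != "NOUN" or word.lower() not in target_nouns:
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(tagged_tokens) and tagged_tokens[j][1] == "ADJ":
                counts[(tagged_tokens[j][0].lower(), word.lower())] += 1
    return counts


# Invented, pre-tagged sample sentence (tags are assumed, not computed here).
tagged = [
    ("the", "DET"), ("slow", "ADJ"), ("train", "NOUN"), ("crossed", "VERB"),
    ("a", "DET"), ("flat", "ADJ"), ("country", "NOUN"), ("green", "ADJ"),
    ("and", "CONJ"), ("wide", "ADJ"),
]

targets = expand_terms({"train", "landscape"})
pairs = adjacent_adjectives(tagged, targets)
for (adjective, noun), n in pairs.most_common():
    print(adjective, noun, n)
```

Note that "country" is only picked up because the expansion step added it as a term related to "landscape"; without expansion, the adjectives around it would be missed. How the seed nouns are judged "salient" in the first place is left out here, since those heuristics are still being refined.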

My method of sentiment analysis ends up looking more like automatic summarization than the positive-negative sentiment analysis we more frequently encounter, even in humanistic work such as Matt Jockers’s recent research. I argue, of course, that my method is somewhat more meaningful. I consider all adjectives to be sentiment words, because they carry subjective judgment (even something that’s kind of green might be described by someone else as also kind of blue). And I’m more interested in the character of subjective judgment than in whether it can be considered ‘objectively’ positive or negative (something I don’t think is really possible in humanistic inquiry, or even in business applications). In other words, if we have to pick out the most representative feelings of people about what they’re experiencing, what are they feeling about that experience?

After all, can you really say that weather is good or bad, that there being a lot of farm fields is good or bad? I looked at 19th-century British women’s travel narratives of “exotic” places, and I found that their sentiment was often just observations about trains and the landscape and the people. They didn’t talk about whether they were feeling positively or negatively about those things; rather, they gave us their subjective judgment of what those things were like.

My take on sentiment analysis, then, is that we need to introduce human judgment at the end of the process, perhaps gathering these representative phrases and adjectives (I lean toward phrases or even whole sentences) and then deciding what we can about them. I don’t even think a human interlocutor could put down a verdict of positive or negative on these observations and judgments – sentiments – that the women had about their experiences and environments. If not even a human could do it, and humans write and train the algorithms, how can the computer do it?

Is there even a point? Does it matter if it’s possible or not? We should be looking for something else entirely.

(I really need to get cracking on this project. Stay tuned for the revised methodology and heuristics, because I hope to write more and share code here as I go along.)

* I’m also trying to write a more extensive and revised paper on this, meant for the new incarnation of LLC.

academic death squad

Are you interested in joining a supportive academic community online? A place to share ideas, brainstorming, motivation and inspiration, and if you’re comfortable, your drafts and freewriting and blogging for critique? If so, Academic Death Squad may be for you.

This is a Google group that I believe can be accessed publicly, although you appear to have to be logged in to Google to view the group’s page (and I’ve had some issues with signing up with non-Gmail addresses). Just put in a request to join and I’ll approve you. Or, if that doesn’t work, email me at mdesjardin (at) gmail.com.

Link: [Academic Death Squad]

I’m trying to get as many disciplines and geographic/chronological areas involved as possible, so all are welcome. And I especially would love to have diversity in careers, mixing in tenure-track faculty, adjuncts, grad students, staff broadly interpreted, librarians, museum curators, and independent scholars – and any other career path you can think of. Many of us not in grad student or faculty land have very little institutional support for academic research, so let’s support each other virtually.

In fact, one member has already posted a publication-ready article draft for last-minute comments, so we even have a little activity already!

Best regards and best wishes for this group. Please email me or comment on this post if you have questions, concerns, or suggestions.

よろしくお願いいたします!

*footnote: The name originally came from a group I ran called “Creative Death Squad,” but the real origin is an amazing t-shirt I used to own in Pittsburgh that read “412 Vegan Death Squad” and had a picture of a skull with a carrot driven through it. I hope the name connotes badass-ness, serious commitment to our research, and some casual levity. Take it as you will.

New issue of D-Lib magazine

D-Lib magazine has just published its most recent issue, available at http://www.dlib.org

This looks to be a great issue, with a number of fascinating articles on dissertations and theses in institutional repositories, using Wikipedia to increase awareness of digital collections, MOOCs, and automatic ordering of items based on reading lists.

Please check it out! All articles are available in full-text on the site.

creativity, goals, and the dissertation

I’ve been consulting some books on art-making lately, books you could broadly say are about that nebulous idea of “creativity” itself. (Art and Fear is the best known of them and I can’t recommend it enough. It’s the best tiny book you’ll ever own.) As I’ve read more, I have realized that they apply not only to my artistic life – my life outside of the “work” of research and writing – but also to my current writing project. In other words, writing a dissertation, essentially a non-fiction book, is a creative undertaking of great magnitude and can be approached with the same principles in mind as a painting or a composition or a mathematical theory. (Fill in your creative path here.)

This was a revelation for me, despite the fact that I engage in drawing, painting, and creative writing as a part of my life: why would non-fiction writing for my “real job” not be creative work as well, and best approached with the same attitudes? Why not?

So one thing that comes out of this is the issue of the goal. Art and Fear talks about this one and I’d honestly never considered it before. The goal often sounds like this: have a solo show, or get a piece in MoMA, or get a book published, or whatever. The problem is that when the artist succeeds and meets that goal, art-making can often cease completely, forever, because the goal has been met and there is no direction anymore, and nothing to aim for.

This book in particular recommends that goals should be more along the lines of “find a group of like-minded artists and share work with them.” Things that won’t be attained in a single moment, but that continue for the rest of your life.

It made me realize that yes, as a scholar, I have an end goal right now, and that is finishing my dissertation. After that, it’s a few articles, a monograph. But then what? And I don’t have a good answer for that. Thus, I am at high risk for becoming the same as the writer who quits after her first bestselling novel, adrift without an ongoing goal.

I wonder how scholars deal with this (I may just go and ask a few of them), but I think for myself, I’ve found a seed of it in a digital humanities project I’m dreaming up but haven’t had time to start implementing yet. It’s one that is less about content and more about opening up possibilities for exploring questions in ways that didn’t exist before, and to experiment with new methodologies that wouldn’t have traditionally come from my discipline. Sure, it’s building a database. But then it’s what to do with that database that’s the real project.

At the same time, I think a huge issue both in the arts and the academic humanities is that of solitude. I am not saying anything new here. Right now, a colleague and I are planning on co-authoring an article and attempting to get it published (please cross your fingers for us). I think it may be in my best interests, more than anything else, to keep in close touch with this person who works on things that are similar to my own work, and to keep picking up those business cards I like to collect from people I meet at conferences who are interested in my research for some reason, and routinely emailing them. My database project is something I want to leave open source and twist others’ arms to take part in. So I’m thinking now, as I’m nearing the end of my PhD course, where to start with the idea of forming a like-minded group to continue to share and collaborate with. To keep the end goal always moving and yet always fulfilled, because it is within myself and other people, and not just about me and something outside of me.

presentation accepted: MCAA

A quick tidbit.

I’ve gotten a paper proposal accepted for the Midwest Conference on Asian Affairs in early October, in Columbus, OH. I’m excited about this conference in particular because of its focus on media and communication throughout history, and thinking hard about how we approach our various fields through this lens (or vice versa).

My own topic is something I will elaborate on later, but for now, let me tell you it’s about the impossibility of separating physicality from social network from archive from publication in the context of a certain book in the late 1800s. To be less vague, I’m going to talk about one man’s “rediscovery” (via many allusions by a fiction author he liked) of Ihara Saikaku (then mostly forgotten, now Mr. Edo-Period Canonical Author) in the 1880s. Those who got excited about reading Saikaku talk quite a bit about buying, handling, and borrowing/lending old copies of Saikaku’s work, and in the anthology they published, they go so far as to credit each work with whose archive/collection it came from. The sense of physical ownership – and being able to touch the thing itself – is overwhelming compared to everything else I’ve looked at from this period. It’s fascinating and exciting and I’m looking forward to sharing this finding as well as getting feedback on my methodological approach and conclusions. (Surely weak at best, given that this is news to me and I haven’t had a lot of time to develop my thinking over the past year, buried in a mountain of magazines in the library basement.)

By the way, this probably can’t fit into the paper, but the social ripples of Saikaku popularity vibrate constantly through the Meiji literature and general literary discourse that I read throughout my research. Saikaku love versus hate, going so far as to adopt a pseudonym that translates to “I love Saikaku” while attempting to imitate his style in one’s own writing, republishing his works in random magazines, the changing ideas about whether or not his works qualify as modern works of fiction (小説, now translated as “novel” but then quite contested), and reactions to him – they not only feed into and inform and make clear literary cliques and their interactions, but also literary trends and experimentation in an era where nearly anything goes.

A forgotten author as a window into an historical moment: nothing could make me happier about choosing the path that I have.