Cautiously Reading Data

I recently re-read “How to Read a Literary Visualisation: Network Effects in the Lake School of Romantic Poetry” by Laura Mandell et al. in tandem with Matthew L. Jockers’ Macroanalysis: Digital Methods and Literary History, which I had not read before. My interest in both stemmed from my growing intention to map the Iconclass notations surrounding specific symbols into a network diagram. Jockers offered two helpful tidbits on that front. First, he points out Gephi, a program I may be able to use (163). Second, and more importantly, he clearly labels what the nodes and edges of his network diagram represent (163), a practice I certainly wish to follow. While neither Mandell et al. nor Jockers helped me understand how to make my own network diagrams, both provided some insight into a responsibly cautious approach to reading visualisations.

Jockers shows his caution in a few different places, but in general it comes down to ‘not making the data say more than it actually says.’ For example, Jockers’ data shows that, within his corpus of 3,346 books, Jane Austen’s Sense and Sensibility is the most similar text to her Pride and Prejudice. He makes it quite clear that there could be another text outside his corpus that is ‘more similar’, and that the data may change as works are added to the corpus (161). My emblem data comes from just a subset of the emblems digitised on Emblematica Online, because only some have been tagged with Iconclass notations. I will therefore have to be somewhat cautious in my interpretations, recognising that what this set of data tells me could be proven false as more emblems are tagged. In a separate context, discussing word clusters and what they could indicate about nationality, Jockers writes, “The temptation to ‘read’ even more into these patterns is great”, and concludes the paragraph with a sobering statement regarding the limits of what the data indicates (115). I believe the caution here is not to confuse one’s interpretation of the data with what the data actually contains. Jockers is not against interpretation: he grants that “there will always be a movement from facts to the interpretation of facts” (30). He simply seems to advocate for a clear delineation between the two. I think it is responsible to state what the emblem data shows, and what I interpret that to mean, while not conflating the two or jumping too far from what the data can support.

To that end, Mandell et al. warn against mistakes in reading visualisations, explaining that “a major principle of Information Visualisation is that the first thing we will see … is ‘errors’ in our data” (section 1). ‘Errors’ here means “information about how the data is structured” (section 1). Mandell et al. explain that the data in a visualisation can show things that are not directly related to the research question: the sudden increase of the word “presumption” in an ngram visualisation does not actually reflect the word’s usage, because the visualisation does not account for the shift from long- to short-s in typography (section 1). Later, Mandell et al. mention how program bugs miscategorised data, and how a publishing house fire (mid-19th century) removed a swath of data before the researchers even had the chance to visualise it (section 1). For my project, this means I have to grant that my visualisations will likely contain “‘artifacts in the data’, and information ‘about the way it is collected’” (section 1) along with the information I wish to study. I think it would be irresponsible to dismiss the possibility that the emblem books tagged may weight things towards certain authors (Andrea Alciato, the ‘father’ of emblem books, being the prime example) and regions, simply because they are the ones tagged so far. It seems a little caution, both while reading the visualisations and while relating the data and my interpretations to others, is justified. Tempering my enthusiasm with some care will likely help ground my arguments, hopefully making them stronger.

 

Works Cited

Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. Urbana, IL: U of Illinois P, 2013. Print.

Mandell, Laura, et al. “How to Read a Literary Visualisation: Network Effects in the Lake School of Romantic Poetry.” Digital Studies / Le Champ Numérique 3.2 (2012): n. pag. Web. 11 Feb. 2015.
