Singularity

Willard McCarty on Humanist pointed me to a quite silly article in the Economist entitled “March of the Machines”. It can almost be called a genre piece. The author strongly downplays the possible negative effects of artificial intelligence and then argues that society should find an ‘intelligent response’ to AI, as opposed, I assume, to uninformed dystopian stories.

But I do hope the intelligent response society will seek to AI will be less intellectually lazy than the author of said contribution. To be honest, I think that someone needed to crank out a 1000-word piece quickly, and reverted to sad stopgap rhetoric.

In this type of article there’s invariably a variation on this sentence: “Each time, in fact, technology ultimately created more jobs than it destroyed”. As if—not denying here any of a job’s power to be meaningful and fulfilling for many people—a job is the single quality of existence.

Worse is that such multi-purpose filler arguments ignore unintended side effects of technological development. Mass production was brought on by mechanisation. We know that it also brought mass destruction. It is always sensible to consider both the possible dystopian and utopian scenarios. No matter what Andrew Ng (quoted in the article) obviously should say as an AI researcher, it is actually very sensible to consider overpopulation of Mars before you colonise it. Until conditions for human life there are improved, at whatever expense, even a few persons will effectively constitute such an overpopulation. Ng’s argument is a non sequitur anyway. If the premise of the article is correct, we are not decades away from ubiquitous application of AI. Quite the opposite: the conditions on Earth for AI have been very favourable for more than a decade already. We can hardly wait to try out all our new toys.

No doubt AI will bring some good, and no doubt it will also bring a lot of awful bad. This is not inherent in the technology, but in the people who wield it. Thus it is useful to keep critically examining all applications of all technologies while we develop them, instead of downplaying their unintended side effects without evidence.

If we do not, we may create our own foolish utopian illusions. For instance when we start using arguments such as “AI may itself help, by personalising computer-based learning and by identifying workers’ skills gaps and opportunities for retraining.” Which effectively means asking the machines what the machines think the non-machines should do. Well, if you ask a machine, chances are you’ll get a machinery answer and eventually a machinery society. Which might be fine for all I know, but I’d like that to be a very well informed choice.

I am not a believer in The Singularity. Claims that machines and AI will aggressively push out humankind are in all likelihood gross exaggerations. But a realistic possibility is the covert permeation of human society by AI. We change society by our use of technology and the technology changes us too. This has been and will always be the case, and it is far from some moral or ethical wrong. But we should be conscious of and informed about these changes, so that we hold the choice and not the machine. If a dialogue between man and (semi-)intelligent machine were to be started as naively as the author of the Economist piece suggests, then humankind might indeed be very naively set to become machine-like.

Machines and AI are, certainly until now, extensions and models of human behaviour. They are models and simulations of such behaviour; they are never humans. This can improve human existence manifold. But having the heater on is something quite different from asking a model of yourself: “What gives my life meaning? How should I come to a fulfilling existence?” Asking that of a machine, even a very intelligent one, is still asking a machine what it is to be human. It is not at all excluded that a machine will one day find a reasonable or valuable answer to that. But I would certainly wait beyond the first few iterations of this technology before possibly buying into any of the answers we might get.

It is deceptively easy to be unaware of such influences. In 1995 most people found cell phones marginally useful and far too expensive. A mere 20 years later almost no one wants to part with his or her smartphone. This has changed how we communicate, when we communicate, how we live, who we are. AI will have similar consequences. Those might be good, those might be bad. They should not, however, be covert.

Thus I am not saying at all that a machine should never enter a dialogue with humans on human existence. But when we enter that dialogue we considerably change the character of the interaction we have had with technologies for as long as we can remember. Humans have always defined technology, and our use of it has in part defined us. By changing technology we change ourselves. This plays out on the individual level (I am a different person now due to using programming languages than I was when I did not) and on the scale of society, where we are part of socio-technical ecosystems comprising technologies, communities, and individuals.

But these interactions have always been a monologue on the intellectual level. As soon as this becomes a dialogue because the technology literally can now speak to us, we need to be aware that it is not a human speaking to us, but a model of a human.

I for one would be excited to learn what that means, what riches it may bring. But I would always enter such a conversation well aware that I am talking not to another, but to a machine, and I would weigh that fact into the value and evaluation of the conversation. To assume that AI will answer questions on what course of action would lead me to improving my skills and my being may be buying too heavily into the abilities of AI models to understand human life.

Sure, AI can help. Even more so if we are aware of the fact that its helpful qualities are by definition limited to the realm of what the machine can understand.

 

Methodological safety pin

There is a trope in digital humanities related articles that I find particularly awkward. Just now I stumbled across another example, and maybe it is a good thing to muse about it a short bit. Where the example comes from is not important, I think, as I am interested in the trope in general and not in this particular instance per se. Besides, I like the authors and have nothing against their research, but before you know it flames are flying everywhere. So in the interest of all I file this one for posterity anonymized.

This is the quote in question: “The first step towards the development of an open-source mining technology that can be used by historians without specific computer skills is to obtain a hands-on experience with research groups that use currently available open-source mining tools.”

Readers of digital humanities related essays, articles, reviews etc. will have found ample variations on this theme in the literature. From where I am sitting such statements rig up a dangerous strawman or facade. There are a number of hidden (and often not so hidden at all) assumptions that are glossed over with such statements.

First of all there is the assumption that it is obvious that as a scholar without specific computer skills you still should be able to use computer technology. This is a nice democratic principle I guess, but is it a wise one too?

Second, there’s the suggestion that all computer technology is homogeneous. There is no need to differentiate between levels and types of interfaces and technologies. It can all sweepingly be nicely represented as this amorphous mass of “open-source mining technology”. I know it is not entirely fair to pin this on the authors of such statements. Indeed the authors may be very well aware that they are generalizing a bit in service of the less experienced reader. However, the scholarly equivalent would be to say that the first step for a computer scientist that wants to understand history is to get a hands-on experience with historians. Even if that might be in general true, from scholarly arguing I expect more precision. You do not ‘understand history’. One understands tiny, very specific parts of it, maybe, when approached with very specific very narrowly formulated research questions, and meticulous methodology. I do not understand why the wide brush is suddenly allowed if the methodology turns digital.

Third, and this is the assumption that I find most problematic: there is the assumption (or rather axiom, maybe) that there shall be a middle man, a bridge builder, a guide, a mediator, or go-between who shall translate the expertise from the computer skilled persons involved towards the scholar. You hardly ever read it the other way round, by the way: it is never the computer scientist in need of some scholarly wisdom. This in particular is a reflex and a trope I do not understand. When you need expertise you talk to the expert, and you try to acquire the expertise. But when it comes to computational expertise we (scholars) are suddenly in need of a mediator. Someone who goes in between and translates between expertises. In much literature (which in itself is part of this process of expertise exchange) this is now a sine qua non that does not get questioned at all: of course you do not talk to the experts directly, and of course you do not engage with the technology directly. When your car stalls, you don’t dive into the engine compartment with your scholarly hands, do you?!

Maybe not, though I at least try to determine, even with my limited knowledge of car engines, what might be the trouble. But I sure as hell talk to the expert directly. The mechanic is going to fix my car; I want to know what the trouble is and what he is going to do. Yes well, the scholar retorts, but quite frankly I do not talk so much about the car engine trouble to my mechanic at all! Fair enough, might not be your cup of tea. But the methodology of your research should be. Suppose you are diagnosed with cancer, do you want to talk only to the secretary of your doctor?

Besides, it is about the skills. A standard technique to disguise logical fallacies in reasoning is to substitute object phrases. I play this little game with these tropes too: “The first step towards the development of a hand grenade that can be used by historians without specific combat skills is to obtain a hands-on experience with soldiers that use currently available hand grenades.”

This doesn’t invalidate the general truthiness of the logic, but it does serve to lay bare its methodological fallacy: if you want to use that technology, better acquire some basic skills from the experts if you want to rely safely on the outcome of its use.

Intellectual Glue and Computational Narrative

There exist several recurring debates in the digital humanities. Or rather maybe we should position these debates as between digital humanities and humanities proper. One that is particularly thorny is the “Do you need to know how to code?” debate. In my experience it is also frequently aliased as the “Should all humanists become programmers?” debate. One memorable event in the debate was Stephen Ramsay’s (2011) remark “Do you have to know how to code? I’m a tenured professor of Digital Humanities and I say ‘yes.’” A sure-fire starter. Ramsay used the metaphor of building to describe coding work done in DH. Picking up on this, Andrew Prescott (2012) argued that in most humanities software building DH researchers seemed to be uncomfortably in the backseat. Most non digital humanities PIs seem to regard developing software as a support act without intrinsic scientific merit; Prescott used the word ‘donkeywork’ to express what, in his experience, humanities researchers generally thought of software development. Prescott reasoned that as long as digital humanities researchers were not in the driver’s seat, DH would remain a field lacking an intellectual agenda of its own.

I agree: in a service or support role neither DH nor coding will ever develop its full intellectual potential for the humanities. As long as it is donkeywork it will be a mere re-expression and remediation of what went before. The problem there is that the donkey has to cast his or her epistemic phenomenology towards the concepts and relations of the phenomenology of the humanities PI. In such casting there will be mismatches and unrealized possibilities for modeling the domain, the problem, data, and the relations between them. It is most literally like a translation, but a warped and skewed one. Like what would result if the PI were to request a German translation of his English text but required it to be written according to English syntax, ignoring the partial incommensurability of semantic items like ‘Dasein’ and ‘being’. Or compare it to commissioning a painting from Van Gogh but requiring it be done in the style of Rembrandt. The result would be interesting no doubt, but not something that would satisfy either the patron or the artist. The benefactor would get a not quite proper Rembrandt. And, more essential for the argument here, the artist under these restrictions would not be able to develop his own language of forms and style. He would be severely hampered in his expression and interpretation.

This discrepancy between the contexts of interpretation through code and through humanistic inquiry is reflected, I think, in the way DH-ers tend to talk about their analytical methods as two separate realms. The best known of these metaphors is that of the contrast between ‘close’ and ‘distant’ reading, initiated by the works of Franco Moretti (2013). Ramsay (2011b) and Kirschenbaum (2008) also clearly differentiate between two levels or modes of analysis. One is a micro perspective, the other operates within a macro-level scope. Kirschenbaum described the switching from computational analysis of bit level data to putting up a meaningful perspective on the hermeneutic level of a synthesis narrative as “shuttling” back and forth between micro and macro modes of analysis. Martin Mueller (2012) in turn wished for an approach of “scalable reading” that would be able to make this switching between ‘close’ and ‘distant’ forms of analysis less hard, the shuttling more seamless.

We have microscopes and telescopes, what we lack is a tele-zoom lens. A way of seamlessly connecting the close with the distant. Without it these modes of analysis will stay well apart because the ‘scientistic’ view of computer analysis as objective forsakes the rich tradition of humanistic inquiry, as Hayles remarks (2012). Distant reading as analytic coding does gear towards an intellectual deployment of code (Ramsay 2011b). But the analytic reach of quantitative approaches is still quite unimpressive. I say this while stressing that this is not the same as ‘inadequate’, I dare bet there is beef in our methods (Scheinfeldt 2010). But although we count words in ever more subtle statistical ways to for instance analyze style, the reductive nature of these methods seems to kill the noise that is often relevant to much scholarly research (McGann 2015). For now it remains striking that the results of these approaches are confirmation oriented more than resulting in new questions or hypotheses; mostly they seem to reiterate well known hypotheses. Nevertheless, current examples of computational analyses could very well be the baby steps on a road towards a data driven approach to humanities research.

Thus if there is intellectual merit in a non-service role of code, then why do the realms of coding and humanistic inquiry stay so well apart as they seem to do? Let’s for a moment pass by the easy arguments that are all too often just there to serve the agenda of some sub-cultures in the humanities community. It is not a lack of transferable skills. I can teach 10-year-old girls HTML in 30 minutes; everyone can learn to code. It is not an inherently conservative and technology-averse nature of the humanities (Courant 2006). Like any community the humanities has its conservative pockets and its idealist innovators. No, somehow the problem lies with computation and coding itself. Apparently we have not yet found the principles and form of computing that allow it to treat the complex nature of noisy humanities data and the even more complex nature of humanities’ abductive reasoning. That is, reasoning based more on what is plausible than on what is provable or solvable as an equation. Humanities are about problematizing what we see, feel, and experience; about creating various and diverse perspectives so that the one interpretation can be compared to the other, enriching us with various informed views. Such various but differing views and interpretations are a type of knowledge too, albeit a different kind of knowledge than that which results from quantification (Ramsay 2011b:7). These views acquire a scholarly or scientific status once they are rigorously tried, debated, and peer reviewed.

One of the aspects that sets humanities arguments apart from other types of scientific reasoning and analysis is its strong relation to and reliance on narrative. Narrative is the glue of humanities’ abductive logic. But code has narratological aspects too. As Donald Knuth has argued there is a literacy of code (Knuth 1984). Most humanities scholars are most literally illiterate in this realm. Yet many of the illiterate demand the intellectual primacy over code reliant research in the humanities. But to create an adequate intellectual narrative you need to be well versed in the language you’re using, you must be literate. I am not a tenured professor of digital humanities, but just the same I dare posit that you can not wield code as an intellectual tool if you are not literate in it.

Does this mean that the realms of humanities oriented computation and of humanistic abductive inquiry must stay apart? No, it means that non code literate humanists should grant those literate in code and humanities the time and space to develop the intellectual agenda of code in the humanities. But at the same time those literate in code should reflect on their mimicry of a ‘scientistic’ empiricism. The intellectual agenda of humanities is not to plow aimlessly through ever more data. Number crunching is a mere base prerequisite even within its own narrow understanding of scientific style. Only when we get into making sense of these numbers, of applying interpretation to them, do we unleash the full power of the humanistic tradition. And making sense is all about building meaningful perspectives through the creation of narratives. The computationally literate in the humanities need to figure out the intellectual agenda of digital humanities, and they need to develop their own style of scientific and intellectual narrative that connects it to the mainstream intellectual effort of the humanities.

With all this in mind it is encouraging to learn that the Jupyter Notebook Project acquired substantial funding for further development (Perez 2015). We do not have that dreamed of tele-zoom, that scalable mode of reading. But Jupyter Notebooks may well be an ingredient of the glue needed to link the intellectual effort of humanities coding to mainstream humanities discourse. These Notebooks started out as a tool for interactive teaching of Python coding. The iPython Notebooks developed into computer language agnostic Jupyter Notebooks that allow the mixing of computer and human language narrative. In Jupyter Notebooks text and code integrate to clarify and support each other. The performative aspects of code and text are bundled to express the intellectual merit of both. Fernando Perez and Brian Granger (2015) developed their funding proposal strongly around the concept of computational narrative: “Computers are good at consuming, producing and processing data. Humans, on the other hand, process the world through narratives. Thus, in order for data, and the computations that process and visualize that data, to be useful for humans, they must be embedded into a narrative—a computational narrative—that tells a story for a particular audience and context.”

Hopefully the Jupyter Notebooks will be part of a leveling of the playing field for both narratively inclined and computationally oriented humanities scholars. Hopefully they will become a true middle-ground for computational and humanistic narrative to meet, mix, and grow from a methodological pidgin into a mature new semiotic system for humanistic intellectual inquiry.

—JZ_20151002_2318

Bibliography

Courant, P.N. et al., 2006. Our Cultural Commonwealth: The report of the American Council of Learned Societies’ Commission on Cyberinfrastructure for Humanities and Social Sciences. University of Southern California.

Hayles, K.N., 2012. How We Think: Digital Media and Contemporary Technogenesis, Chicago (US): University of Chicago Press.

Kirschenbaum, M., 2008. Mechanisms: New Media and the Forensic Imagination, MIT.

Knuth, D.E., 1984. Literate Programming. The Computer Journal, 27(1), pp.97–111.

McGann, J., 2015. Truth and Method: Humanities Scholarship as a Science of Exceptions. Interdisciplinary Science Reviews, 40(2), pp.204–218.

Moretti, F., 2013. Distant Reading, London: Verso.

Mueller, M., 2012. Scalable Reading. Scalable Reading—dedicated to DATA: digitally assisted text analysis. Available at: https://scalablereading.northwestern.edu/scalable-reading/ [Accessed September 22, 2015].

Perez, F., 2015. New funding for Jupyter. Project Jupyter: Interactive Computing. Available at: http://blog.jupyter.org/2015/07/07/project-jupyter-computational-narratives-as-the-engine-of-collaborative-data-science/ [Accessed October 1, 2015].

Perez, F. & Granger, B.E., 2015. Project Jupyter: Computational Narratives as the Engine of Collaborative Data Science. Project Jupyter: Interactive Computing. Available at: http://blog.jupyter.org/2015/07/07/jupyter-funding-2015/ [Accessed October 1, 2015].

Prescott, A., 2012. To Code or Not to Code? Digital Riffs: extemporisations, excursions, and explorations in the digital humanities. Available at: http://digitalriffs.blogspot.nl/2012/04/to-code-or-not-to-code.html [Accessed October 1, 2015].

Ramsay, S., 2011a. On Building. Stephen Ramsay — Blog. Available at: http://stephenramsay.us/text/2011/01/11/on-building/.

Ramsay, S., 2011b. Reading Machines: Toward an Algorithmic Criticism (Topics in the Digital Humanities), Chicago (US): University of Illinois Press.

Scheinfeldt, T., 2010. Where’s the Beef? Does Digital Humanities Have to Answer Questions? Found History. Available at: http://foundhistory.org/2010/05/wheres-the-beef-does-digital-humanities-have-to-answer-questions/ [Accessed October 1, 2015].

This Graph is my Graph, this Graph is your Graph

There is no better way to acknowledge that you have arrived in academia or digital humanities than finding yourself on the receiving end of a class act hatchet job on your work. This year at DH2014 Stefan Jänicke, Annette Geßner, Marco Büchler, and Gerik Scheuermann of Leipzig University’s Image and Signal processing group and Göttingen’s DH Center presented a paper firmly criticizing the graph layout that CollateX and StemmaWeb apply. Tara Andrews (who is responsible for the programming that makes the backend kick) and I presented some work on graph interactivity for StemmaWeb at DH2013. We have been working on the graph representation and interactivity because it is fun and pretty cool, a primary but often overlooked motivator of much DH work, as Melissa Terras pointed out at her DHBenelux keynote. The academically more serious part is in the interaction between scholars and text. That is for me in any case; for Andrews it is the stemmatological component. I want to know whether or not tools and interfaces support scholar-text interaction. Hint: GUIs do more bad than good. Mostly I want to know how they, and the code that makes them tick, affect scholarly interpretation. Not many may know this, but I am the one who programmed the graph visualization and interaction in StemmaWeb. So I guess I am entitled to say one or two things on the work of Jänicke et al.

Let me first of all point out that I think their work is a very welcome and timely addition to the thinking and practice of graph interactivity. Not much work has been done on how graphs can be read as an aggregative representation of witness texts. Yet that work is essential in that it pertains to how scholars perceive their material. So the more, the merrier, which in general is my attitude anyway, I believe. Above and beyond that, it is great to learn that someone is working on an actual JavaScript library for this type of scholarly layout. I think Ronald Dekker, who works on CollateX, is already compiling the needed information to actually allow TRAViz to interoperate with CollateX.

So far the good. Now for the bad and the puzzling. Jänicke et al. argue for a number of design rules. Some of these make sense to me, like “Bundle major edges!”. In fact in StemmaWeb we wanted to have weighted edges, we… just didn’t get around to it the first time round. Apparently this feature, which is now present (the edge width is a function of the number of witnesses), wasn’t there yet when Jänicke et al. checked. The thing is, all the work was done in our copious free time, so weighted edges took a little longer. That takes me to the first puzzle. It took two computer scientists and two digital humanists to pull off the creation of a set of rules? The StemmaWeb graph visualization and interactivity components came around a little more economically, and with less ceremony in any case. The good part is that apparently there is much programming capacity around. It cannot be stressed enough how valuable it is to have some computing effort going into creating and maintaining a reusable code library specific to this type of visualization. We built the graph interactivity on top of the standard GraphViz layout engine, which is based on a number of default graph design principles. Abstracting that graph interaction and decoupling it as a separate library from the specifics of StemmaWeb logic has been a long-standing wish for us, but given our primary research concerns we never got around to that. This summer we are trying again. Unfortunately I have a suspicion that again other academic and not so academic stuff will get in the way. And oh, did I point out we really seriously value collaboration?
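For readers who want to see what "edge width as a function of the number of witnesses" can look like, here is a minimal sketch in Python. It is not the StemmaWeb code: the function names, the width bounds, and the three sample witnesses are all made up for illustration, and using the token itself as the node id is a simplification that would merge repeated words. The output is plain GraphViz DOT, with the witness sigla kept as edge labels.

```python
from collections import defaultdict


def edge_widths(witness_paths, min_width=1.0, max_width=6.0):
    """Map each edge (a, b) to (sorted witness sigla, pen width).

    The width grows linearly with the share of witnesses that follow
    the edge. min_width and max_width are arbitrary illustrative bounds.
    """
    edges = defaultdict(set)
    for siglum, tokens in witness_paths.items():
        # Each pair of consecutive tokens in a witness is one edge.
        for a, b in zip(tokens, tokens[1:]):
            edges[(a, b)].add(siglum)

    total = len(witness_paths)
    result = {}
    for edge, sigla in edges.items():
        share = len(sigla) / total
        width = min_width + (max_width - min_width) * share
        result[edge] = (sorted(sigla), width)
    return result


def to_dot(widths):
    """Render the weighted edges as a left-to-right GraphViz digraph."""
    lines = ["digraph variants {", "  rankdir=LR;"]
    for (a, b), (sigla, width) in sorted(widths.items()):
        label = ", ".join(sigla)
        lines.append(
            f'  "{a}" -> "{b}" [label="{label}", penwidth={width:.1f}];'
        )
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical witnesses, each given as a list of tokens.
    witnesses = {
        "A": ["the", "quick", "fox"],
        "B": ["the", "quick", "fox"],
        "C": ["the", "brown", "fox"],
    }
    print(to_dot(edge_widths(witnesses)))
```

Piping the printed text through the `dot` command would draw the reading shared by A and B with a thicker line than the one carried only by C, while the labels still tell you which witnesses coincide.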

Other rules really do not make sense to me. Not labeling edges? Why not? It is pretty essential, to scholars mostly, to know which witnesses coincide. Color coding that? We tried, it doesn’t work beyond seven witnesses, that’s when you run out of the colors of the rainbow, and humans are very bad at distinguishing shades of green. Thus color coding is limited, and on top of that it hides actual information. Abolish backward edges because of cognitive load? Reading James Joyce is a cognitive load, not following a backward edge. But going along for the sake of argument: how do Jänicke et al. treat transpositions in that case? “Rule 5: insert line breaks!” Again that exclamation mark. “Why shouldn’t we adopt the behavior of a text flowing in a book (with line breaks) for Text Variant Graphs?” Well, because it is not a book, and I am interested in new ways of reading. That is my take, but there’s no reason why we could not differ in opinion by good argument.

That takes me to the core of what I find unhelpful about this set of rules. They are hammered out in the style of god-given dogmas. Even if this rule set has empirical user studies and design principles underpinning it, the dogmatism is still unpalatable. Especially the “line breaks” rule seems to suggest that the print paradigm was heavily overrepresented in the user survey. The point is that if you ask users from the print paradigm what they want and what they like, chances are you will end up with a digital mimesis of the print paradigm. I don’t mind if you want to get what you got by doing what you did, but raising that to an absolute rule is not very explorative, to say the least. What Jänicke et al. fail to appreciate is the digital humanities and human computer interaction potential here for experimenting with design choices and learning how they affect our reading, use, and interpretation of variant text representation. Instead they boilerplate a number of rules mostly based on print paradigm assumptions. And again: why does this need to be in this forbidding shouting of dogmatic rules? Will I be shot the next time I violate them for the sake of experimentation? I like the remarks they make on an iterative approach. We have been reading the Agile Manifesto since 2001, and we fully embraced evolutionary development. But this goes for rules too: they are there to be iteratively questioned and purposely broken for the sake of progressing our knowledge.

For years there has been a quote under my email signature, which kind of says it all: “Jack Sparrow: I thought you were supposed to keep to the code. Mr. Gibbs: We figured they were more actual guidelines.”