A Data-Filled Journey Through Relationships in Hamlet

Or, how English profs can protect their jobs from the robots

Jack Ward

Try clicking around (if you're on mobile) or simply hovering over wedges of the circle (if you're not).

In the graph above, each colored wedge represents a character in Hamlet. The size of each wedge is proportional to the number of lines that character delivers. The paths between the wedges, or chords, represent lines said in the presence of another character. The width of each chord represents the number of lines between the two characters, and its width at each end shows who does more of the talking.

Syntactically, Hamlet is simple and regular.

Here are the passes the computer goes through when pulling this data out of Hamlet:

  1. Online Version

    We start with MIT's online edition of the play, all nice and formatted. You can read it for yourself here, and I've copied a part of it below so we can see how our little program will work through it.

    ACT I

    SCENE I. Elsinore. A platform before the castle.

    FRANCISCO at his post. Enter to him BERNARDO

    BERNARDO

    Who's there?

    FRANCISCO

    Nay, answer me: stand, and unfold yourself.

  2. Flatten and Tag by Formatting

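    Bold, italic, and header formatting each carry meaning, so we lowercase the text and tag each line by its formatting: h for a header, i for an italicized stage direction, and b for a bold speaker name. Spoken lines stay untagged.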

    h act 1

    h scene i. elsinore. a platform before the castle.

    i francisco at his post. enter to him bernardo

    b bernardo

    who’s there?

    b francisco

    nay, answer me: stand, and unfold yourself.

  3. Find Entrances and Exits

    Check only italicized lines, and use keywords: “enter,” “exit,” “exeunt,” “dies.” The beginning of a scene (a header) means everyone exits.

    h act 1

    h scene i. elsinore. a platform before the castle.

    i francisco at his post. enter to him bernardo

    b bernardo

    who’s there?

    b francisco

    nay, answer me: stand, and unfold yourself.

  4. Find character names in exits and entrances

    Some characters go by one name when they are prompted to speak and another in stage directions. Additionally, characters like First Sailor are given entrances with “Enter Sailors,” not “Enter First Sailor.”

    h act 1

    h scene i. elsinore. a platform before the castle.

    i francisco at his post. enter to him bernardo

    b bernardo

    who’s there?

    b francisco

    nay, answer me: stand, and unfold yourself.

  5. Track Presence

    Update the list of those in the room at each change, and accumulate the lines each speaker says to every other character present.

    h act 1

    h scene i. elsinore. a platform before the castle.

    i francisco at his post. enter to him bernardo

    b bernardo

    who’s there? [bernardo to francisco (1), bernardo (1)]

    b francisco

    nay, answer me: stand, and unfold yourself. [francisco to francisco (1), bernardo (1)]
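
Passes 3 through 5 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the code behind the graph: the (tag, text) input shape, the STAGE_NAMES lookup, and the names names_in and track are all made up for illustration, and the handling of a bare “exit” or “exeunt” is a guess at one reasonable rule.

```python
import re
from collections import Counter, defaultdict

# Hypothetical input: (tag, text) pairs shaped like the flattened sample above.
# "h" = header, "i" = italic stage direction, "b" = bold speaker name, "" = spoken line.
TAGGED = [
    ("h", "act 1"),
    ("h", "scene i. elsinore. a platform before the castle."),
    ("i", "francisco at his post. enter to him bernardo"),
    ("b", "bernardo"),
    ("", "who's there?"),
    ("b", "francisco"),
    ("", "nay, answer me: stand, and unfold yourself."),
]

# Hypothetical lookup from a name as written in stage directions to cast names
# (pass 4), e.g. "Enter Sailors" brings in First Sailor.
STAGE_NAMES = {
    "bernardo": ["bernardo"],
    "francisco": ["francisco"],
    "sailors": ["first sailor"],
}
EXIT_WORDS = {"exit", "exeunt", "dies"}


def names_in(text, stage_names):
    """Return the cast members named (as whole words) in a stage direction."""
    found = set()
    for alias, cast in stage_names.items():
        if re.search(r"\b" + re.escape(alias) + r"\b", text):
            found.update(cast)
    return found


def track(tagged, stage_names):
    """Walk the tagged lines, tracking who is present and who speaks to whom."""
    present = set()
    spoken = Counter()            # lines delivered per character
    heard = defaultdict(Counter)  # heard[speaker][listener] = lines
    speaker = None
    for tag, text in tagged:
        if tag == "h":
            present.clear()       # a new act or scene empties the stage
            speaker = None
        elif tag == "i":
            words = set(re.findall(r"[a-z]+", text))
            named = names_in(text, stage_names)
            if "enter" in words:
                present |= named
            elif words & EXIT_WORDS:
                # Assumption: a direction naming nobody applies to everyone
                # ("exeunt") or to the current speaker ("exit", "dies").
                if named:
                    present -= named
                elif "exeunt" in words:
                    present.clear()
                else:
                    present.discard(speaker)
        elif tag == "b":
            speaker = text
            present.add(speaker)  # assumption: whoever speaks is on stage
        elif speaker is not None:
            spoken[speaker] += 1
            for other in present - {speaker}:
                heard[speaker][other] += 1
    return spoken, heard


spoken, heard = track(TAGGED, STAGE_NAMES)
print(spoken)
print(dict(heard))
```

On the two-line sample, each guard ends up credited with one line, said in the presence of the other. A direction that mixes entrances and exits, or something like “exeunt all but Hamlet,” would need more care than this sketch gives it.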

Simply rinse and repeat for all 5000 lines of the play, and we'll have this big jumble of data:

Counter({'hamlet': 1495, 'king claudius': 546, 'lord polonius': 355, 'horatio': 289, 'laertes': 206, 'ophelia': 173, 'queen gertrude': 157, 'ghost': 95, 'first clown': 94, 'rosencrantz': 93, 'marcellus': 62, 'guildenstern': 53, 'first player': 52, 'osric': 48, 'player king': 44, 'bernardo': 38, 'player queen': 30, 'prince fortinbras': 27, 'gentleman': 24, 'voltimand': 22, 'second clown': 18, 'reynaldo': 15, 'first priest': 13, 'captain': 12, 'francisco': 10, 'a lord': 7, 'lucianus': 6, 'first ambassador': 6, 'messenger': 5, 'first sailor': 5, 'prologue': 3, 'danes': 3, 'servant': 1})
dict_keys(['prologue', 'first player', 'voltimand', 'lucianus', 'ophelia', 'prince fortinbras', 'player king', 'rosencrantz', 'messenger', 'lord polonius', 'laertes', 'first clown', 'gentleman', 'reynaldo', 'captain', 'player queen', 'first ambassador', 'queen gertrude', 'danes', 'first sailor', 'bernardo', 'servant', 'second clown', 'osric', 'francisco', 'guildenstern', 'marcellus', 'a lord', 'king claudius', 'first priest', 'ghost', 'hamlet', 'horatio'])
prologue Counter({'lord polonius': 3, 'ophelia': 3, 'hamlet': 3, 'horatio': 3, 'rosencrantz': 3, 'king claudius': 3, 'guildenstern': 3, 'queen gertrude': 3})
first player Counter({'hamlet': 50, 'lord polonius': 49, 'rosencrantz': 47, 'guildenstern': 47})
voltimand Counter({'queen gertrude': 22, 'lord polonius': 22, 'king claudius': 22, 'hamlet': 1, 'laertes': 1})
lucianus Counter({'player king': 6, 'lord polonius': 6, 'ophelia': 6, 'hamlet': 6, 'horatio': 6, 'rosencrantz': 6, 'king claudius': 6, 'guildenstern': 6, 'queen gertrude': 6})
ophelia Counter({'gentleman': 107, 'queen gertrude': 93, 'king claudius': 80, 'horatio': 61, 'lord polonius': 56, 'laertes': 42, 'hamlet': 36, 'danes': 31, 'rosencrantz': 16, 'guildenstern': 16, 'lucianus': 4, 'player king': 4, 'prologue': 2})
prince fortinbras Counter({'osric': 19, 'horatio': 19, 'first ambassador': 19, 'captain': 8})
player king Counter({'lord polonius': 44, 'ophelia': 44, 'hamlet': 44, 'player queen': 44, 'horatio': 44, 'rosencrantz': 44, 'king claudius': 44, 'guildenstern': 44, 'queen gertrude': 44})
rosencrantz Counter({'guildenstern': 88, 'hamlet': 59, 'king claudius': 34, 'lord polonius': 17, 'queen gertrude': 17, 'ophelia': 13, 'horatio': 11})
lord polonius Counter({'king claudius': 135, 'ophelia': 119, 'queen gertrude': 112, 'hamlet': 68, 'reynaldo': 66, 'rosencrantz': 40, 'guildenstern': 40, 'laertes': 32, 'horatio': 11, 'first player': 8, 'lucianus': 1, 'player king': 1})
messenger Counter({'laertes': 5, 'king claudius': 5})
second clown Counter({'first clown': 18})
gentleman Counter({'queen gertrude': 24, 'horatio': 13, 'king claudius': 11})
reynaldo Counter({'lord polonius': 15})
francisco Counter({'bernardo': 10, 'horatio': 3, 'marcellus': 3})
laertes Counter({'king claudius': 149, 'queen gertrude': 103, 'gentleman': 94, 'ophelia': 86, 'hamlet': 60, 'horatio': 53, 'danes': 47, 'osric': 35, 'first clown': 18, 'first priest': 18, 'lord polonius': 13})
player queen Counter({'player king': 30, 'lord polonius': 30, 'ophelia': 30, 'hamlet': 30, 'horatio': 30, 'rosencrantz': 30, 'king claudius': 30, 'guildenstern': 30, 'queen gertrude': 30})
first ambassador Counter({'osric': 6, 'horatio': 6, 'prince fortinbras': 6})
queen gertrude Counter({'king claudius': 100, 'hamlet': 77, 'laertes': 52, 'lord polonius': 36, 'horatio': 29, 'ophelia': 28, 'rosencrantz': 20, 'gentleman': 20, 'guildenstern': 20, 'ghost': 13, 'first clown': 12, 'first priest': 12, 'osric': 7, 'player king': 2, 'danes': 2, 'lucianus': 1})
danes Counter({'gentleman': 6, 'laertes': 3, 'queen gertrude': 3, 'king claudius': 3})
first sailor Counter({'horatio': 5})
bernardo Counter({'horatio': 30, 'marcellus': 30, 'francisco': 8, 'ghost': 5, 'hamlet': 4})
servant Counter({'horatio': 1})
first clown Counter({'horatio': 57, 'hamlet': 57, 'second clown': 43})
osric Counter({'hamlet': 48, 'horatio': 48, 'queen gertrude': 5, 'laertes': 5, 'king claudius': 5})
king claudius Counter({'laertes': 301, 'queen gertrude': 295, 'lord polonius': 158, 'hamlet': 135, 'gentleman': 106, 'rosencrantz': 74, 'guildenstern': 71, 'ophelia': 59, 'voltimand': 48, 'horatio': 43, 'danes': 37, 'osric': 28, 'first clown': 9, 'first priest': 9, 'player king': 3, 'messenger': 3, 'lucianus': 1})
guildenstern Counter({'rosencrantz': 52, 'hamlet': 37, 'horatio': 21, 'king claudius': 16, 'queen gertrude': 11, 'ophelia': 5, 'lord polonius': 5})
marcellus Counter({'horatio': 61, 'bernardo': 48, 'hamlet': 13, 'ghost': 10, 'francisco': 3})
a lord Counter({'hamlet': 7, 'horatio': 7})
captain Counter({'rosencrantz': 11, 'guildenstern': 11, 'hamlet': 11, 'prince fortinbras': 1})
first priest Counter({'first clown': 13, 'queen gertrude': 13, 'ophelia': 13, 'hamlet': 13, 'horatio': 13, 'laertes': 13, 'king claudius': 13})
ghost Counter({'hamlet': 95, 'queen gertrude': 6, 'horatio': 4, 'marcellus': 4})
hamlet Counter({'horatio': 703, 'rosencrantz': 350, 'guildenstern': 350, 'queen gertrude': 327, 'lord polonius': 271, 'king claudius': 207, 'marcellus': 184, 'ophelia': 175, 'first clown': 132, 'osric': 121, 'laertes': 102, 'ghost': 62, 'first player': 61, 'bernardo': 52, 'first priest': 39, 'player king': 25, 'lucianus': 12, 'captain': 11, 'a lord': 5, 'prologue': 4, 'player queen': 2})
horatio Counter({'marcellus': 187, 'bernardo': 149, 'hamlet': 133, 'ghost': 36, 'osric': 34, 'first sailor': 23, 'first ambassador': 22, 'prince fortinbras': 22, 'first clown': 10, 'queen gertrude': 4, 'servant': 2, 'gentleman': 2, 'laertes': 2, 'king claudius': 2, 'ophelia': 1, 'francisco': 1, 'first priest': 1})
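
To get from those counters to the chords in the graph, one plausible step is to arrange them into a square matrix, where the entry in row i and column j is the number of lines character i says in front of character j. Comparing an entry with its mirror image then gives the two ends of a chord. The sketch below uses a four-character excerpt of the counters above; the name chord_matrix and the idea that the graph is fed a matrix like this are assumptions, not something the original code is known to do.

```python
from collections import Counter

# Excerpt of the per-speaker counters above (other entries omitted for brevity).
heard = {
    "lord polonius": Counter({"reynaldo": 66}),
    "reynaldo": Counter({"lord polonius": 15}),
    "first clown": Counter({"second clown": 43}),
    "second clown": Counter({"first clown": 18}),
}


def chord_matrix(heard, names):
    """Build matrix[i][j] = lines names[i] says in the presence of names[j]."""
    # A Counter returns 0 for missing keys, so absent pairs become 0.
    return [[heard.get(src, Counter())[dst] for dst in names] for src in names]


names = sorted(heard)
matrix = chord_matrix(heard, names)

for name, row in zip(names, matrix):
    print(f"{name:>14}", row)
```

In this excerpt the chord between Polonius and Reynaldo would be 66 wide at Polonius's end and 15 wide at Reynaldo's, which matches the idea that the wider end belongs to whoever does more of the talking.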

Note: Stanford (pdf) and Nature (pdf) suggest all academic code be publicized for reproducibility's sake, so here it is. Sorry, it's ugly. Read at your own risk.