Computing with Diagrammatic Content: Lessons Learned from Archimedes

The Diagram Markup Language (DML) and its compiler provide an open standard for encoding and interacting with diagrammatic content. Marking up a diagram, an ill-defined problem only months ago, now takes about three hours on average. One result is beautiful diagrams; more significantly, these tools provide a formal framework for referencing diagrammatic content, computationally finding patterns therein, and quantitatively expressing those patterns. The first half of this report details some observations made while encoding the diagrams in Floating Bodies, Book I. These observations motivate new computational techniques for systematically exploring the information contained within a diagram.



In Floating Bodies, propositions 1.6 and 1.7, Archimedes primarily reasons with logical primitives that have no intuitive geometric representation. He introduces concepts like weight and volume, which, unlike a Euclidean circle or even a fluid surface, cannot simply be presented as a geometric circle or curve. In Heath's edition, weight and volume are presented as rectangles. The heavier a solid, the larger its 'weight rectangle' becomes, and the size of the corresponding solid increases as well. Propositions 1.6 and 1.7 demonstrate that Archimedes used geometric representations of non-geometric concepts in the reasoning process. What conventions did Archimedes follow for presenting primitives lacking an intuitive geometric form? The conventions observed in the Archimedes Palimpsest emphasize the logical properties required by the proof.


In proposition 1.5, Archimedes explicitly sets the construction up as in proposition 1.3, with slight changes in logic and presentation. The solid 'efgh', which has the same density as the fluid in 1.3, is lighter than the fluid in 1.5. Presentationally, little changes from 1.3 to 1.5 besides the shape and location of 'efgh'. This is because proposition 1.3 displays 'efgh' projecting above the fluid surface as if it were lighter per unit volume than the fluid. A logical impossibility in 1.3, the similar presentation is logically consistent in proposition 1.5. Propositions 1.3 and 1.5 reveal that Archimedes actively reuses previous constructions, changing them slightly in logic and presentation. Does Archimedes refer to some constructions more than others? Does he ever refer to a subset of steps within a construction, or does he always use constructions in their entirety? Perhaps the logical dependencies determined by these patterns of reference affected the order in which Archimedes presented propositions.

In propositions 1.1 and 1.3, Archimedes uses proof by contradiction to demonstrate his claim. In the first proposition, Archimedes' logic requires that, given a fluid surface with center 'o' and two points 'a' and 'b' on that surface, we construct lines 'oa' and 'ob' so that they are not equal in length. This logical impossibility is used to derive a contradiction. However, this logic is not reflected in the palimpsest's diagram, which presents the logically correct conclusion, where 'oa' and 'ob' are equal in length. In contrast, the diagram for proposition 1.3 displays the logical impossibility assumed in Archimedes' proof by contradiction: solid 'efgh' projects above the fluid surface even though it has the same density as the fluid. Propositions 1.1 and 1.3 show that Archimedes uses diagrams in two different ways in proof by contradiction. Sometimes he presents the logically impossible; other times he presents the logically correct diagram. Is there a pattern to when Archimedes uses one presentation over another? How often is a logical impossibility represented? Perhaps this depends upon the nature of Archimedes' logical assertions within the proof.

Computational Techniques in DML

Each of the observations illustrates how Archimedes interacted with diagrams. Each of these types of interaction defines a requirement for any system trying to model diagrams and their usage in Archimedes. The first observation was that Archimedes used geometric representations of non-geometric concepts in his reasoning. Therefore, DML assumes that logical primitives may not be geometric in nature, keying each logical concept to its geometric representation via a label, the same technology used by Archimedes to associate text and diagram. Questions such as whether Archimedes had conventions for presenting such primitives can be answered by extracting the presentational types for logical types such as 'tlg0552.tlg008:weight' and 'tlg0552.tlg008:volume' and building a simple histogram whose bins are presentational types. The results of such an analysis would inform diagram production, the process of serializing the logic of a construction into a presentational format. The Diagram Markup Language Compiler (dmlc) is an application built on DML to model the production of diagrams.
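The histogram described above might be sketched as follows. This is a minimal illustration and not part of DML or dmlc: it assumes the markup has already been parsed into (logical type, presentational type) pairs, and both the function name and the sample rows are hypothetical. Only the "rectangle" rows echo the presentation reported for Heath's edition; the "line" row is an invented placeholder.

```python
from collections import Counter

# Hypothetical rows extracted from DML markup: (logical type, presentational type).
# The "line" row is an invented placeholder, not an observation from any edition.
pairs = [
    ("tlg0552.tlg008:weight", "rectangle"),
    ("tlg0552.tlg008:volume", "rectangle"),
    ("tlg0552.tlg008:weight", "line"),
]


def presentation_histogram(pairs, logical_type):
    """Count how often each presentational type is used for one logical type."""
    return Counter(pres for logical, pres in pairs if logical == logical_type)


weight_bins = presentation_histogram(pairs, "tlg0552.tlg008:weight")
for presentational_type, count in sorted(weight_bins.items()):
    print(presentational_type, count)
```

A strongly skewed histogram for a logical type would suggest a presentational convention; a flat one would suggest none.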

We observed that Archimedes actively uses previous constructions, changing them slightly in logic and presentation. Therefore, DML defines a machine-actionable format for references to previous constructions and diagram templates. These DML-URNs, similar to CTS-URNs, provide a mechanism for referring to previous diagrammatic content from the granularity of an entire construction or diagram instance to a single logical or presentational primitive. Whether Archimedes refers to some constructions more than others and the granularity at which reference occurs can be quantitatively explored by extracting a reference graph from the DML and looking at the number of incoming edges for each reference node. Proximity of reference could be explored by looking at the distance between the references with respect to the citation scheme of the text. Looking at different measures of this notion of 'proximity of reference' could provide insight into the order in which Archimedes presented propositions and how Archimedes expected one to logically navigate his work. Currently, the DML Navigator is a simple application leveraging the markup to help with diagram navigation. Users can explore the step-by-step logical structure of a construction and watch the diagram appear before their eyes. More elaborate diagram navigation applications would incorporate text.
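As a rough sketch of the reference-graph analysis, the following code counts incoming edges per construction and measures proximity of reference against a citation order. It makes several assumptions: references have already been resolved to plain labels (actual DML-URN syntax is not shown here), the function names are hypothetical, and only the edge from 1.5 to 1.3 comes from the observations above. The labels "P" and "Q" are placeholders, not real propositions.

```python
from collections import Counter

# Each edge is (citing construction, cited construction).
# Only ("1.5", "1.3") is taken from the text; "P" and "Q" are placeholders.
edges = [
    ("1.5", "1.3"),
    ("P", "1.3"),
    ("Q", "P"),
]

# Assumed order of constructions under the text's citation scheme.
order = ["1.3", "1.4", "1.5", "P", "Q"]


def incoming_edge_counts(edges):
    """Number of references pointing at each construction."""
    return Counter(cited for _citing, cited in edges)


def reference_distances(edges, order):
    """Distance, in positions of the citation order, from citing to cited."""
    position = {label: i for i, label in enumerate(order)}
    return {(citing, cited): position[citing] - position[cited]
            for citing, cited in edges}


print(incoming_edge_counts(edges).most_common())
print(reference_distances(edges, order))
```

Position in a flat list is only one possible measure of proximity; a measure based on the hierarchical citation scheme could be substituted without changing the rest of the sketch.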

We also observed that Archimedes uses diagrams in two different ways in a proof by contradiction. Sometimes he presents the logically impossible; other times, the logically correct diagram. DML makes it possible to represent this by defining one language for expressing the logic of a diagram and a separate language for expressing its presentation. In DML, it is possible to render a logically inconsistent diagram for proposition 1.3 and then reuse the same presentational markup to generate a logically consistent diagram in proposition 1.5. One approach to answering the question of when Archimedes used one presentation over another would be to write a contradiction-finder. This application would walk the parse tree for the logical markup (much as dmlc does now) and record all logical constraints on the logical primitives in a table. The presentational markup would then be walked and a table of presentational primitives generated. For each constraint in the constraint table, the contradiction-finder would look up the corresponding primitives in the presentational symbol table and check that they satisfy the geometric interpretation of the constraint. In this manner, logically inconsistent editions of diagrams could be automatically discovered. After unintentional inconsistencies in the markup are resolved, the remaining results would give the exact location of each logical inconsistency as a DML-URN.
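The table lookup at the heart of such a contradiction-finder might look like the sketch below. It is illustrative only: the constraint names, the table layouts, and the coordinates are assumptions rather than DML vocabulary or measurements from the palimpsest, and the parse-tree walks that would populate the tables are omitted. The example mirrors proposition 1.1, where the logic requires 'oa' and 'ob' to be unequal while the diagram draws them equal.

```python
import math

# Hypothetical constraint table built from the logical markup:
# (constraint name, first label, second label).
constraints = [("unequal_length", "oa", "ob")]

# Hypothetical presentational symbol table: label -> segment endpoints.
# Coordinates are invented so that both segments have length 5.
presentation = {
    "oa": ((0.0, 0.0), (3.0, 4.0)),
    "ob": ((0.0, 0.0), (-3.0, 4.0)),
}


def length(segment):
    """Euclidean length of a segment given as two (x, y) endpoints."""
    (x1, y1), (x2, y2) = segment
    return math.hypot(x2 - x1, y2 - y1)


# Geometric interpretation of each (assumed) constraint name.
CHECKS = {
    "equal_length": lambda a, b: math.isclose(length(a), length(b)),
    "unequal_length": lambda a, b: not math.isclose(length(a), length(b)),
}


def find_contradictions(constraints, presentation):
    """Return the constraints that the drawn primitives fail to satisfy."""
    violations = []
    for name, first, second in constraints:
        if not CHECKS[name](presentation[first], presentation[second]):
            violations.append((name, first, second))
    return violations


print(find_contradictions(constraints, presentation))
```

Here the single constraint is reported as violated, because the drawn segments are equal in length. In a fuller version, each violation would carry the location of the offending primitives so that it could be reported as a DML-URN.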


The observations made while encoding the diagrams in Floating Bodies, Book I provided requirements for a computational framework enabling the quantitative study of diagrams and their use in Archimedes. With a representation of diagrams in hand, it now becomes possible to model a series of traditional operations for diagrams, namely, navigation, production, and logical assertion, as computations.

Further, while sketching computational solutions to these Classical questions, the new operation of querying diagrams has emerged. Parsing DML into a tabular format makes discovering diagrams by logical type, selecting nodes to include in a graph of logical dependencies, and locating logical contradictions feasible operations, enabling scholars to pursue entirely new questions about the use of diagrams in the Archimedes Palimpsest. Developing a larger corpus of encoded diagrams will provide more insight into how diagrams were used in Ancient Science and Mathematics, will help preserve and increase access to documents that paved the way for modern science, and will motivate additional computational techniques for quantitatively understanding how these materials were used to reason in antiquity and how they might help us to reason today.