The “big data” era presents novel challenges for accelerating scientific progress and enabling new modes of discovery (Honavar 2014; Singh & Reddy 2014). We present work on “Dr Inventor” (O’Donoghue et al. 2014; O’Donoghue et al. 2015), a creativity support tool that aims to uncover novel analogy-based comparisons (Gentner 1983; Fauconnier & Turner 1998) between academic publications.
Dr Inventor does not work directly on the publications, but instead generates Research Object Skeleton (ROS) graphs. Generation of a ROS graph starts when a PDF document enters the Text Mining Framework (Ronzano & Saggion 2015), which addresses problems arising from the layout, text flow, images, equations etc. A parser generates the dependency tree for each sentence and, following Agarwal et al. (2015), we apply a set of rules to the dependency trees, generating connected triples of nouns and verbs that form the ROS graph. Crucially, multiple mentions of the same concept are uniquely represented within each ROS, using the co-reference resolution output from the dependency parser.
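As an illustration of the final step above, the following minimal sketch merges co-referent mentions and assembles noun-verb-noun triples into a single graph. It is not Dr Inventor's code: the function name `build_ros`, the triple format, the co-reference chain format and the example data are all assumptions made for this sketch, and the triples are taken as given rather than extracted from real dependency trees.

```python
from collections import defaultdict


def build_ros(triples, coref_chains):
    """Hypothetical helper: merge co-referent mentions, then build a graph.

    triples      -- iterable of (subject, verb, object) strings (assumed format)
    coref_chains -- list of lists; each inner list holds mentions of one concept
    """
    # Map every mention to the first mention of its chain (its representative).
    canonical = {}
    for chain in coref_chains:
        for mention in chain:
            canonical[mention] = chain[0]

    # Rewrite each triple so that a concept is always named by its representative.
    edges = set()
    for subj, verb, obj in triples:
        edges.add((canonical.get(subj, subj), verb, canonical.get(obj, obj)))

    # Index outgoing (verb, object) pairs by subject node.
    adjacency = defaultdict(list)
    for subj, verb, obj in sorted(edges):
        adjacency[subj].append((verb, obj))
    return edges, dict(adjacency)


# Invented example data: "it" refers back to "parser".
triples = [
    ("parser", "generate", "tree"),
    ("it", "feed", "rules"),
    ("rules", "produce", "triples"),
]
coref_chains = [["parser", "it"]]
edges, adjacency = build_ros(triples, coref_chains)
```

In this toy input the pronoun "it" collapses into the "parser" node, so both of its relations hang off one node instead of two disconnected ones.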
ROS graphs enable the application of Gentner's (1983) structure mapping theory to finding and evaluating analogies between ROSs, and thus between publications. This uses a combination of computational power and topologically driven analogical retrieval. Semantic web annotations (Ruiz-Iniesta & Corcho 2014) of sentences allow Dr Inventor to explore analogies between the "background" of one paper and the "approach" of another. Dr Inventor is being evaluated by experts in computer graphics for its ability to discover novel and useful analogies, inspiring its users and igniting their creativity.
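The idea of mapping one triple graph onto another can be sketched with a deliberately simplified, brute-force search. This is not Dr Inventor's mapping procedure, which is not specified here, and it is far cruder than a full structure-mapping engine. The assumptions are that relations match only when their verb labels are identical, that the source graph has no more nodes than the target, and that the graphs are small enough to enumerate every one-to-one node assignment. The function name `best_mapping` and the example triples are hypothetical.

```python
from itertools import permutations


def best_mapping(source, target):
    """Hypothetical toy mapper: find the one-to-one node mapping from source
    to target that preserves the largest number of (subject, verb, object)
    triples. Exhaustive search, so only suitable for very small graphs."""
    src_nodes = sorted({n for s, _, o in source for n in (s, o)})
    tgt_nodes = sorted({n for s, _, o in target for n in (s, o)})
    target_set = set(target)

    best, best_score = {}, -1
    # Try every injective assignment of source nodes to target nodes.
    for perm in permutations(tgt_nodes, len(src_nodes)):
        mapping = dict(zip(src_nodes, perm))
        # Count source triples whose mapped image exists in the target graph.
        score = sum(
            (mapping[s], rel, mapping[o]) in target_set for s, rel, o in source
        )
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score


# Invented example: the classic solar system / atom analogy.
source = [("sun", "attract", "planet"), ("planet", "orbit", "sun")]
target = [("nucleus", "attract", "electron"), ("electron", "orbit", "nucleus")]
mapping, score = best_mapping(source, target)
# mapping == {"planet": "electron", "sun": "nucleus"}, score == 2
```

The score depends only on graph structure and relation labels, not on the node names themselves, which is the sense in which a comparison of this kind is driven by topology.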