The DocEng program will showcase research topics related to Document Engineering, including document workflows and security, user applications, multimedia and mobile documents, document abstraction and summarisation, document classification and similarity measures, and visual document analysis, among others.
The symposium will feature three tutorials, all held on 4 September:
Historical Document Processing
Speaker: Basilis G. Gatos
Time: 09:30 - 13:00
Historical manuscript collections are an important source of original information that provides access to historical data and helps develop cultural documentation over the years. This tutorial focuses on recent advances and ongoing developments in historical handwritten document processing. It covers the main challenges involved, the different tasks that have to be implemented, and the practices and technologies that currently exist in the literature. The main tasks in the historical document image recognition pipeline include preprocessing for image enhancement and binarisation; segmentation for the detection of main page elements, text lines and words; and, finally, recognition. In cases where optical recognition is expected to give poor results, keyword spotting has been proposed as a substitute for full text recognition. The focus is on the most promising techniques and related projects, as well as on existing datasets and competitions that can prove useful to historical handwritten document processing research.
Document Engineering Issues in Malware Analysis
Speaker: Charles Nicholas
Time: 09:30 - 13:00
The focus of the tutorial will be an overview of the field of malware analysis with emphasis on issues related to scalability. We introduce the field with a discussion of the types of malware, including executable binaries, malicious PDFs, and exploit kits. Some of the popular tools used for analyzing malicious binaries will be presented, including IDA, Binary Ninja, and x64dbg. Concepts and tools from static and binary analysis will be discussed. Some collections of malware specimens are available to researchers, and these will be used as examples as appropriate. We will discuss cluster analysis, malware attribution, and the problems caused by polymorphic malware. We will conclude with our view of important research questions in the field.
Understanding the User: User Studies and User Evaluation for Document Engineering
Speakers: Kim Marriott, Steven Simske and Margaret Sturgill
Time: 14:30 - 17:30
Document engineering is all about building systems and tools that allow people to work with documents and document collections. A key aspect is the usefulness and usability of these tools. In this tutorial we will look at the many different kinds of user studies and user evaluations that can be used to inform the design and improve the utility and usability of document engineering applications. The tutorial will be based on actual studies and will also give participants a chance to explore how they might use these techniques in their research or system development. In the first part of the tutorial we will look at controlled experiments, questionnaires, in-depth interviews, focus groups and field studies, participative design, and user data collection and analysis. In the second part we will look at data analytics and how it can be applied to user evaluation in the document engineering field. The relationship runs in two directions: data science to understand how users evaluate document sets, and data science to understand how to evaluate users based on their interaction with the document set (user analytics), including time to task completion, robustness to frustration, and ability to complete the task. The goal of the 'data analytics' portion of the tutorial is to introduce the audience to classification and evaluation approaches and, from this understanding, to help identify research challenges and experiments to be performed by the document engineering research community.
Registrations can be made through regonline.com/doceng2017, with early bird registration ending on 17 July. Registration fees are $35 for a single tutorial and $50 for two tutorials. Students may register at a student rate of $30 for a single tutorial and $35 for two tutorials.