Search engine for graphical user interfaces

Master Thesis

“Where do I find the double underline button in Word?” Users of graphical user interfaces (GUIs) often want to search for functionality in software programs. Today, people either search the World Wide Web or try to find the feature via the given hierarchical menus or manuals. These menus are organized by a software developer, who may have a totally different view of how the features should be grouped. This thesis proposes a solution to this problem: a search engine for graphical user interfaces. The approach is divided into two tasks.
The first task is to extract models of software programs live while users interact with them. Such a model records which elements appear when a button or other UI element is clicked or invoked. This makes it possible to determine which path of interactions has to be executed to make a given GUI element visible. The main sources of this information are user interface (UI) accessibility frameworks such as “Microsoft Active Accessibility” (MSAA) and its successor, “Microsoft UI Automation” (UIA). These frameworks were created so that people with disabilities can interact with such interfaces. They help to find and select graphical control elements, and assistive technology (AT) such as screen readers makes heavy use of them. Extracting models with UI accessibility frameworks can only happen live while the user interacts with a program, because elements are only accessible while they exist on the screen. Therefore, a tool called ClickMonkey was additionally developed, which explores the software by clicking on all UI elements.
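To make the idea of such a model concrete, the following is a minimal sketch in Python, not the data structure used in the thesis. It assumes the model can be treated as a directed graph in which an edge from one element to another means "invoking the first made the second appear", and it uses a breadth-first search to recover a shortest path of interactions. The class name `GuiModel`, its methods, and the element names are hypothetical and chosen only for illustration.

```python
from collections import deque


class GuiModel:
    """Hypothetical GUI model: a directed graph of 'invoking A reveals B' edges."""

    def __init__(self, roots):
        # Elements that are visible without any interaction (assumption).
        self.roots = list(roots)
        # Maps an invoked element to the elements that appeared afterwards.
        self.reveals = {}

    def record(self, invoked, appeared):
        """Store one observation: invoking `invoked` made `appeared` visible."""
        known = self.reveals.setdefault(invoked, [])
        for element in appeared:
            if element not in known:
                known.append(element)

    def path_to(self, target):
        """Return the shortest list of elements to invoke so that `target`
        becomes visible, or None if the model contains no such path."""
        queue = deque((root, []) for root in self.roots)
        seen = set(self.roots)
        while queue:
            element, clicks = queue.popleft()
            if element == target:
                return clicks
            for nxt in self.reveals.get(element, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, clicks + [element]))
        return None


# Observations as they might be recorded while a user (or an automated
# explorer) clicks through a program. The element names are made up.
model = GuiModel(roots=["Home tab"])
model.record("Home tab", ["Font dialog launcher", "Bold"])
model.record("Font dialog launcher", ["Underline style"])

print(model.path_to("Underline style"))
```

In a real system the observations would presumably come from an accessibility framework such as UIA rather than from hand-written calls to `record`, and elements would need stable identifiers instead of plain label strings.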
The second task is to build a search engine on top of such a model that indexes all text displayed on screen. When users cannot find the right widget, they can bring up a search window on top of the application via a keyboard shortcut or a mouse click. They can then type in a search term, and a ranked list of user interface elements is presented. Selecting one of them triggers the execution of interactions (such as mouse clicks) that bring the corresponding program into the right state. The interactions are inferred from the model, because every interaction observed from a user can be simulated, as in GUI testing frameworks.
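The indexing and ranking step can be illustrated with a small inverted index. This is only a sketch under simplifying assumptions: the abstract does not state which ranking function the thesis uses, so the example simply ranks elements by the number of query terms that occur in their on-screen text. The names `GuiSearchIndex`, `tokenize`, and the element identifiers are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class GuiSearchIndex:
    """Hypothetical inverted index over the text of GUI elements."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of elements containing it
        self.labels = {}                  # element id -> displayed text

    def add(self, element_id, label):
        """Index the text displayed for one GUI element."""
        self.labels[element_id] = label
        for term in tokenize(label):
            self.postings[term].add(element_id)

    def search(self, query):
        """Return element ids ranked by how many query terms they match.
        Ties are broken alphabetically by label (an arbitrary choice)."""
        scores = defaultdict(int)
        for term in set(tokenize(query)):
            for element_id in self.postings.get(term, ()):
                scores[element_id] += 1
        return sorted(scores, key=lambda e: (-scores[e], self.labels[e]))


index = GuiSearchIndex()
index.add("btn_double_underline", "Double underline")
index.add("btn_underline", "Underline")
index.add("btn_bold", "Bold")

print(index.search("double underline"))
```

Once the user selects a result, the recorded interaction path for that element would be replayed to bring the program into the right state. That replay step depends on the accessibility framework and is not shown here.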
The evaluation shows that the current implementation can help people with difficult tasks in at least 60 % of the cases. For easy tasks that can be completed in fewer than 5 clicks, the additional time needed to search is too long. This shows that the approach can help novice users as well as expert users who know and work with the software.
The proposed approach aims to help users who cannot find a piece of functionality although it exists. It guides them to the correct place so that they can continue to work.

End: 26.01.2015


  • Benedikt Schmidt

Research areas: Knowledge Mining and Assessment