About the Speaker

David Ferrucci

Dr. David Ferrucci is an IBM Fellow and the Principal Investigator (PI) for the Watson/Jeopardy! project. He has been at IBM’s T.J. Watson Research Center since 1995, where he heads the Semantic Analysis and Integration department. Dr. Ferrucci focuses on technologies for automatically discovering valuable knowledge in natural language content and using it to enable better decision making.

As part of his research he led the team that developed UIMA. UIMA is a software framework and open standard widely used by industry and academia for collaboratively integrating, deploying and scaling advanced text and multi-modal (e.g., speech, video) analytics. As chief software architect for UIMA, Dr. Ferrucci led its design and chaired the UIMA standards committee at OASIS. The UIMA software framework is deployed in IBM products and has been contributed to Apache open-source to facilitate broader adoption and development.

In 2007, Dr. Ferrucci took on the Jeopardy! Challenge: to create a computer system that could rival human champions at the game of Jeopardy!. As the PI for the exploratory research project dubbed DeepQA, he focused on advancing automatic, open-domain question answering using massively parallel, evidence-based hypothesis generation and evaluation. By building on UIMA and key university collaborations, and by taking bold research, engineering and management steps, he led his team to integrate and advance many search, NLP and semantic technologies, delivering results that outperformed all expectations and demonstrated world-class performance at a task previously thought beyond the state of the art. Watson, the computer system built by Ferrucci and his team, beat the highest-ranked Jeopardy! champions of all time on national television on February 14, 2011. He is now leading his team to demonstrate how DeepQA can make dramatic advances in intelligent decision support in areas including medicine and health care.

Dr. Ferrucci has been the Principal Investigator (PI) on several government-funded research programs on automatic question answering, intelligent systems and scalable text analytics. His team at IBM consists of 32 researchers and software engineers specializing in the areas of Natural Language Processing (NLP), Software Architecture, Information Retrieval, Machine Learning and Knowledge Representation and Reasoning (KR&R).

Dr. Ferrucci graduated from Manhattan College with a BS in Biology and from Rensselaer Polytechnic Institute in 1994 with a PhD in Computer Science specializing in knowledge representation and reasoning. He is published in the areas of AI, KR&R, NLP and automatic question-answering.

Beyond Jeopardy! The Future of Watson

Computer systems that directly and accurately understand and answer people’s questions over a broad domain of human knowledge have been envisioned by scientists and writers since the advent of computers themselves. Toy solutions are easy to create when the knowledge is narrowly bounded and the queries anticipated by the programmers. The real goal for Artificial Intelligence is for the machine to digest language as fluently and freely as humans, eliminating the need to manually and explicitly formalize the knowledge expressly for the machine. Being able to leverage knowledge as it is prolifically and naturally captured and communicated by humans would facilitate a new era in informed decision making, giving users efficient, context-aware and precise access to the enormous wealth of knowledge humans naturally create and enrich every day. Applications in business intelligence, healthcare, customer support, social computing, science and government could all benefit from computer systems capable of deeper language understanding. The DeepQA project at IBM is aimed at exploring how advancing and integrating Natural Language Processing (NLP), Information Retrieval (IR), Machine Learning (ML), Knowledge Representation and Reasoning (KR&R) and massively parallel computation can advance the science and application of automatic Question Answering and more general natural language understanding. An exciting proof-point in this challenge was developing a computer system that could successfully compete against top human players at the Jeopardy! quiz show.

Attaining champion-level performance at Jeopardy! requires a computer to rapidly and accurately answer rich open-domain questions, and to predict its own performance on any given question. The system must deliver high degrees of precision and confidence over a very broad range of knowledge and natural language content within a 3-second response time. To do this, the DeepQA team advanced a broad array of NLP techniques to find, generate, gather evidence for and analyze many competing hypotheses over large volumes of natural language content to build Watson (www.ibmwatson.com). An important contributor to Watson’s success is its ability to automatically learn and combine accurate confidences across a wide array of algorithms and over different dimensions of evidence. Watson produced accurate confidences to know when to “buzz in” against its competitors and how much to bet. High precision and accurate confidence computations are critical for real business settings, where helping users focus on the right content sooner and with greater confidence can make all the difference. The need for speed and high precision demands a massively parallel computing platform capable of generating, evaluating and combining thousands of hypotheses and their associated evidence.
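To make the confidence-combination idea concrete, here is a minimal sketch, not IBM’s actual code: it assumes hypothetical evidence dimensions and weights (as if learned from training questions), combines each candidate answer’s evidence scores into a calibrated probability with a simple logistic model, and “buzzes in” only when the top confidence clears a threshold.

```python
import math

# Hypothetical evidence dimensions and weights (illustrative only,
# standing in for parameters a real system would learn from data).
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5, "popularity": 0.5}
BIAS = -2.5

def confidence(evidence_scores):
    """Combine per-dimension evidence scores (each in 0..1) into a
    probability via a logistic (sigmoid) model."""
    z = BIAS + sum(WEIGHTS[d] * s for d, s in evidence_scores.items())
    return 1.0 / (1.0 + math.exp(-z))

def decide(candidates, buzz_threshold=0.7):
    """Rank candidate answers by confidence and decide whether the
    best one is strong enough to buzz in on."""
    ranked = sorted(
        ((confidence(ev), ans) for ans, ev in candidates.items()),
        reverse=True,
    )
    best_conf, best_ans = ranked[0]
    return best_ans, best_conf, best_conf >= buzz_threshold

# Two competing hypotheses for one question, with made-up evidence scores.
candidates = {
    "Toronto": {"passage_support": 0.2, "type_match": 0.3, "popularity": 0.9},
    "Chicago": {"passage_support": 0.9, "type_match": 0.9, "popularity": 0.8},
}
answer, conf, buzz = decide(candidates)
```

In this toy example the well-supported candidate wins and its confidence exceeds the threshold, so the system would buzz; the real system scored thousands of hypotheses across hundreds of evidence dimensions in parallel, with weights learned by machine learning rather than set by hand.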

In this talk, I will introduce the audience to the Jeopardy! Challenge and explain how Watson was built to ultimately defeat the two most celebrated human champions of all time. I will discuss how Watson will advance beyond Jeopardy! to solve real problems in healthcare through natural language dialog, ultimately taking another step towards Turing’s vision.