Saturday, May 17, 2008

Theoretical Computer Science (TCS)

Theoretical Computer Science (TCS), broadly speaking, is concerned with understanding the very nature of computation:
What problems can be solved by computers? And how efficiently can such problems be solved? Can "hard" problems be used to our advantage in any way? TCS encompasses research in such diverse areas as complexity theory, algorithms, cryptography, distributed computing, machine learning, and more; the common thread is a focus on precise models and rigorous mathematical analysis of particular problems within those models.
We have a strong group of faculty actively working in this area. We also have a large number of students who are encouraged to get involved early by attending weekly TCS reading groups run by the group. See http://www.cs.umd.edu/areas/Theory/ for further details. Brief descriptions of faculty research interests follow.
Bill Gasarch is interested in complexity theory and TCS-at-large; his most recent research has been in the area of communication complexity and applications of combinatorics to problems in this area. To get a flavor for this type of research, consider the following problem: Alice holds as input an n-bit string A, while Bob holds as input an n-bit string B. Alice and Bob want to determine whether or not their strings are equal. Clearly, there is the trivial solution in which Alice sends all of A to Bob; is there a better protocol that communicates fewer bits? What is the minimum number of bits they need to exchange in order to solve the problem? Besides their inherent beauty, problems of this type have ties to other areas of TCS (e.g., proving lower bounds on computational efficiency of certain problems) and possible practical applications as well (e.g., minimizing resource-intensive communication in sensor networks).
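For a concrete (and classical) answer to the equality question, randomness helps enormously: instead of sending all of A, Alice can send a short "fingerprint" of it. The sketch below is the standard textbook randomized fingerprinting protocol, shown only as an illustration; the particular string lengths and the choice of a random prime up to n^2 are illustrative assumptions.
```python
import random

def is_prime(m):
    """Trial-division primality test (fine for the small demo moduli used here)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def random_prime(limit):
    """Pick a uniformly random prime in [2, limit]."""
    while True:
        p = random.randrange(2, limit + 1)
        if is_prime(p):
            return p

def equality_protocol(A, B):
    """Alice holds bit string A, Bob holds bit string B (both of length n).
    Alice sends a random prime p <= n^2 and (A mod p): O(log n) bits total.
    Equal strings are always accepted; unequal strings are wrongly accepted
    only if p happens to divide A - B, which has at most n prime factors."""
    n = len(A)
    a, b = int(A, 2), int(B, 2)
    p = random_prime(max(n * n, 16))     # Alice's random prime
    fingerprint = a % p                  # Alice -> Bob: p and (a mod p)
    return fingerprint == b % p          # Bob's decision

random.seed(1)
A = "1011" * 64                          # n = 256 bits
B = "1011" * 63 + "1010"                 # differs from A in the last bit
print(equality_protocol(A, A))           # True  (always correct)
print(equality_protocol(A, B))           # False (with high probability)
```
With this protocol Alice communicates only O(log n) bits, and the error probability vanishes as n grows; deterministic protocols, by contrast, provably require about n bits for equality.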
Jonathan Katz focuses primarily on cryptography and, more generally, those aspects of computer security that can be precisely modeled and formally reasoned about. His research has spanned a wide range of topics, from work on theoretical foundations of cryptography (e.g., protocols for secure distributed computation, in which several computers jointly compute something from their private information without revealing that information to one another), to the design of efficient cryptosystems with novel properties (e.g., "anonymous signatures" and identity-based encryption), to computational number theory and secure algorithms for peer-to-peer and wireless sensor networks. Amazingly, and sometimes counter-intuitively, it is possible to construct cryptographic solutions that can be proven secure against arbitrary (yet efficient) attackers as long as certain problems, such as factoring, are as "hard" as is commonly believed.
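As a small taste of secure distributed computation, the sketch below shows one of the simplest possible examples: several parties learning the sum of their private inputs via additive secret sharing. This is only an illustration of the general idea, not one of the protocols from this research; the modulus, the sample inputs, and the function names are made up, and the sketch assumes all parties follow the rules.
```python
import random

MODULUS = 2**61 - 1   # a fixed public modulus; all arithmetic is done mod this value

def share(secret, num_parties):
    """Split `secret` into additive shares: all but one share are uniformly
    random, so any strict subset of the shares reveals nothing about the secret."""
    shares = [random.randrange(MODULUS) for _ in range(num_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_sum(inputs):
    """Each party splits its input into shares, hands one share to every other
    party, and publishes only the sum of the shares it received. The published
    values reveal the total, but no individual input."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]                 # all_shares[i][j]: party i's share destined for party j
    published = [sum(all_shares[i][j] for i in range(n)) % MODULUS
                 for j in range(n)]                            # what party j announces
    return sum(published) % MODULUS

salaries = [95000, 102000, 87000]    # hypothetical private inputs
print(secure_sum(salaries))          # 284000, computed without any party revealing its own input
```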
Samir Khuller's research centers around the design of efficient graph algorithms and approximation algorithms. He is particularly interested in problems for which it is widely believed (for mathematical reasons) that no fast algorithm can find optimal solutions (namely, NP-hard problems). Recently, one specific research project of Samir's has dealt with algorithms for the storage and movement of data; this work has applications to large multimedia data storage systems. Some of the principal challenges that arise in this context are: (a) deciding how many copies of each data item to store, (b) determining the exact layout of data on a set of servers, and (c) dealing with changing workloads and dynamic data access patterns, all while seeking to maximize access to the data. More recently, Samir has also been working on algorithms for finding the cheapest paths that incorporate actual costs (think: gas prices) and not just distances.
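Approximation algorithms sidestep NP-hardness by settling for answers that are provably close to optimal. As an illustration of the flavor of such guarantees (a textbook example, not one of Samir's results), the sketch below computes a vertex cover at most twice the minimum size by repeatedly taking both endpoints of an uncovered edge.
```python
def vertex_cover_2approx(edges):
    """Maximal-matching heuristic for minimum vertex cover (an NP-hard problem):
    repeatedly pick any uncovered edge and add BOTH endpoints to the cover.
    The chosen edges form a matching, and any cover must contain at least one
    endpoint of each matched edge, so the result is at most twice optimal."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# A small example graph, given as an edge list.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
print(vertex_cover_2approx(edges))   # a cover of size at most 2x the optimum (here the optimum is 2: {a, c})
```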
Clyde Kruskal recently solved a problem in geometric combinatorics related to "colorings" of the plane (a coloring of the plane is simply an assignment of a color to each point in the plane). Consider the statement: in any 2-coloring of the plane there exist two points of the same color exactly 1 inch apart. This is true (and easy to prove). It is also true for 3-colorings, but false for 7-colorings. The cases of 4, 5, and 6 colors are open, though Erdős conjectured that 4 is the maximum value for which the statement is true. Clyde has looked at the problem in restricted regions of the plane, showing, for example, that there is a 2-coloring of a (small) square in which no two points of the same color are exactly 1 inch apart.
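The 3-coloring case can even be checked mechanically on a finite configuration. The sketch below uses the well-known Moser spindle, seven points in the plane with eleven pairs exactly one unit apart (this particular configuration is an illustrative choice here, not necessarily one used in Clyde's work), and verifies by brute force that no 3-coloring avoids a monochromatic unit-distance pair.
```python
from itertools import product

# The Moser spindle: 7 points in the plane, with an edge between every pair of
# points exactly 1 unit apart (11 such pairs). It is a standard witness that
# 3 colors cannot avoid a monochromatic unit distance.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),     # first rhombus (two unit triangles), tip 3
         (0, 4), (0, 5), (4, 5), (4, 6), (5, 6),     # second rhombus sharing vertex 0, tip 6
         (3, 6)]                                     # the two tips are also 1 unit apart

def has_proper_3_coloring(num_vertices, edges):
    """Brute force over all 3-colorings; return True if some coloring gives
    every unit-distance pair (edge) two different colors."""
    for coloring in product(range(3), repeat=num_vertices):
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False

print(has_proper_3_coloring(7, edges))   # False: every 3-coloring has two same-colored points 1 unit apart
```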
Dave Mount's interests are in computational geometry, that is, the study of algorithms and data structures for geometric problems. He is particularly interested in problems that have applications in areas such as computer graphics, computer vision, and manufacturing. He is currently working on three projects. The first involves the computational complexity of approximate range searching. The problem is to store a large multi-dimensional point set in a data structure so that the number of points lying within some given query region (a sphere or cube, for example) can be computed very efficiently. In the approximate setting the range shape is treated as a fuzzy object, and points that are sufficiently close to the boundary may either be counted or not.
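To make the approximate-range-searching semantics concrete, here is a minimal sketch (not Dave's data structure, just an illustration of the problem statement under invented parameters): boxes of points that lie entirely inside the expanded query ball, or entirely outside the original one, are resolved in bulk, and only boxes straddling the fuzzy boundary are examined more closely.
```python
import random

def approx_range_count(points, center, r, eps):
    """Approximate spherical range counting: points within distance r of
    `center` are always counted, points farther than (1+eps)*r never are,
    and points in the fuzzy shell in between may go either way."""
    d = len(center)

    def dist(p, q):
        return sum((p[i] - q[i]) ** 2 for i in range(d)) ** 0.5

    def recurse(pts, lo, hi):
        if not pts:
            return 0
        near = sum(max(lo[i] - center[i], center[i] - hi[i], 0.0) ** 2 for i in range(d)) ** 0.5
        far = sum(max(abs(center[i] - lo[i]), abs(center[i] - hi[i])) ** 2 for i in range(d)) ** 0.5
        if far <= (1 + eps) * r:
            return len(pts)            # box entirely inside the expanded ball: count it in bulk
        if near > r:
            return 0                   # box entirely outside the required ball: discard it in bulk
        split = max(range(d), key=lambda i: hi[i] - lo[i])     # split along the widest box side
        mid = (lo[split] + hi[split]) / 2
        left = [p for p in pts if p[split] <= mid]
        right = [p for p in pts if p[split] > mid]
        if len(pts) <= 4 or not left or not right:             # tiny or degenerate box: count exactly
            return sum(1 for p in pts if dist(p, center) <= r)
        left_hi = tuple(mid if i == split else hi[i] for i in range(d))
        right_lo = tuple(mid if i == split else lo[i] for i in range(d))
        return recurse(left, lo, left_hi) + recurse(right, right_lo, hi)

    lo = tuple(min(p[i] for p in points) for i in range(d))
    hi = tuple(max(p[i] for p in points) for i in range(d))
    return recurse(points, lo, hi)

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(10000)]
print(approx_range_count(pts, (0.5, 0.5), 0.25, 0.05))   # roughly pi * 0.25^2 * 10000, i.e. about 2000
```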
The second project involves efficient data structures for point-set pattern matching in a database, where the point sets may be subject to geometric transformations (such as translation and rotation) or the introduction of spurious or outlying points. The third project involves efficient algorithms for geometric interpolation based on a method called Kriging. This work is being applied to data fusion applications in remote sensing and geostatistics.
The work of Aravind Srinivasan lies generally in the area of randomized algorithms. His research has focused most recently on probabilistic models and algorithms in epidemiology, networking, and combinatorial optimization. Examples include: models for disease-spread in large urban areas, and efficient algorithms to control such spread; information-dissemination in unstructured peer-to-peer networks; estimating the capacity of wireless networks; and approximation algorithms for scheduling problems.
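To give a feel for probabilistic disease-spread models, the sketch below runs a toy SIR ("susceptible-infected-recovered") epidemic on a small contact network. The network, transmission probabilities, and parameters are invented for illustration and are far simpler than the urban-scale models used in this research.
```python
import random

def simulate_sir(neighbors, patient_zero, p_transmit, rounds=50):
    """Toy SIR epidemic on a contact network: each round, every infected node
    infects each susceptible neighbor independently with probability
    p_transmit, then recovers. Returns how many nodes were ever infected."""
    infected = {patient_zero}
    recovered = set()
    for _ in range(rounds):
        if not infected:
            break
        newly = set()
        for u in infected:
            for v in neighbors[u]:
                if v not in infected and v not in recovered and random.random() < p_transmit:
                    newly.add(v)
        recovered |= infected
        infected = newly
    return len(recovered | infected)

# A hypothetical contact network: a ring of 200 people plus a few random "shortcut" contacts.
random.seed(2)
n = 200
neighbors = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
for _ in range(40):
    a, b = random.randrange(n), random.randrange(n)
    if a != b:
        neighbors[a].add(b)
        neighbors[b].add(a)

for p in (0.2, 0.5, 0.8):
    outbreaks = [simulate_sir(neighbors, 0, p) for _ in range(200)]
    print(p, sum(outbreaks) / len(outbreaks))   # higher transmission probability -> larger average outbreak
```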
Uzi Vishkin's research interests center around parallel algorithms and architectures. Facilitating a transition into ubiquitous parallel computing has been a strategic objective for computer science and engineering since the field's inception in the 1940s. A theory enthusiast, he has taken as the overriding theme of his work the use of theory to guide the rest of the field in addressing this strategic objective. Key components in his comprehensive plan include: (i) the very rich PRAM parallel algorithmic theory, and (ii) a PRAM-On-Chip vision of a computer system framework. The latter provides a powerful approach to multi-core architectures and, in particular, to the exponential increase in the number of cores in the roadmaps of most vendors into the late 2010s. Recently, he has focused on the feedback loop between algorithms, their programming and implementation, and architecture, as well as their performance modeling.
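To illustrate what a PRAM algorithm looks like, the sketch below simulates the classic parallel summation: pairs of elements are combined in synchronous rounds, so n numbers are summed in about log2(n) parallel steps instead of n-1 sequential additions. The simulation is ordinary sequential Python standing in for the PRAM's synchronous processors; it is only a toy, not code for the PRAM-On-Chip system.
```python
def pram_sum(values):
    """Simulate the classic PRAM summation algorithm: in each synchronous
    round, position i adds in the element `stride` places to its right,
    halving the number of active elements, so n numbers are summed in
    O(log n) rounds of parallel work."""
    a = list(values)
    stride = 1
    rounds = 0
    while stride < len(a):
        # On a real PRAM, all additions in this inner loop happen in one time step.
        for i in range(0, len(a) - stride, 2 * stride):
            a[i] += a[i + stride]
        stride *= 2
        rounds += 1
    return a[0], rounds

total, rounds = pram_sum(range(1, 1025))    # 1 + 2 + ... + 1024
print(total, rounds)                        # 524800, computed in 10 "parallel" rounds
```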
Most of the funding for theory comes from NSF.

The Human-Computer Interaction Lab (HCIL)

The Human-Computer Interaction Lab (HCIL) at the University of Maryland (directed by Associate Professor Allison Druin) designs, implements, and evaluates new interface technologies that are universally usable, useful, efficient and appealing to a broad cross-section of people. We believe it is critical to understand how the needs and dreams of people can be reflected in our future technologies. To this end, the HCIL develops advanced user interfaces and design methodology. Our primary activities include collaborative research, publication and the sponsorship of open houses, workshops and symposiums.
The HCIL is an interdisciplinary lab comprised of faculty and students from Computer Science, Information Studies, and Psychology. Our current work includes new approaches to information visualization, interfaces for digital libraries, mobile devices and electronic paper. We also explore zoomable user interfaces (ZUIs), technology design methods with and for children, and instruments for evaluating user interface technologies.
Founded in 1983 by Ben Shneiderman, the lab is one of the world's oldest centers for the study of human-centered computing, with a long history of creating innovative interaction designs, understanding human performance, and applying both to the development of applications that serve the community. From the concepts of direct manipulation and dynamic queries, to treemaps and other major advances in visualization, Ben Shneiderman has been a pioneer in the field, and he continues to be an active participant in the lab today.
Projects are described extensively at www.cs.umd.edu/hcil/ - including:
International Children's Digital Library (ICDL) - www.childrenslibrary.org (led by Allison Druin, Ben Bederson & Ann Weeks): Imagine a world where a comprehensive library of international children's literature is available to all children across the globe. Imagine a technology that can cost-effectively digitize massive amounts of information dedicated to the needs of children. The ICDL selects, collects, digitizes, and organizes children's materials in their original languages and creates appropriate technologies for access and use by children 3-13 years old. With over 1,500 books from over 40 countries, a website in 13 languages, and a million users so far, the ICDL is working hard to fulfill this mission.
Treemap Visualization - www.cs.umd.edu/hcil/treemap/ (led by Ben Shneiderman and Catherine Plaisant): Treemap is a space-constrained visualization of hierarchical structures. It is very effective in showing attributes of leaf nodes using size and color coding. Treemaps enable users to compare nodes and sub-trees even at varying depth in the tree, and help them spot patterns and exceptions. They have been applied to a huge range of application domains and have been distributed by many organizations including Microsoft, The Hive Group and even the Marine Corps. Treemaps are one of many visualizations that the HCIL builds to help make sense of the ever-growing masses of multidimensional and multi-typed information.
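The core idea of a treemap layout can be sketched in a few lines. Below is the simple "slice and dice" variant, shown only to convey the idea (production treemap tools typically use more sophisticated layouts, such as squarified treemaps): each node's rectangle is divided among its children in proportion to their sizes, alternating the slicing direction at each level. The file-system hierarchy in the example is made up for illustration.
```python
def slice_and_dice(node, x, y, w, h, depth=0):
    """Minimal slice-and-dice treemap layout: a node's rectangle is split
    among its children in proportion to their sizes, alternating between
    vertical and horizontal slicing at each level of the hierarchy.
    Returns (label, rectangle) pairs for the leaves."""
    label, size, children = node
    if not children:
        return [(label, (round(x, 2), round(y, 2), round(w, 2), round(h, 2)))]
    rects = []
    total = sum(child[1] for child in children)
    offset = 0.0
    for child in children:
        frac = child[1] / total
        if depth % 2 == 0:   # slice vertically: children sit side by side
            rects += slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1)
        else:                # slice horizontally: children are stacked
            rects += slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1)
        offset += frac
    return rects

# A hypothetical hierarchy: (name, size, children); internal sizes are the sums of their leaves.
tree = ("root", 100, [
    ("docs", 60, [("a.pdf", 40, []), ("b.txt", 20, [])]),
    ("src", 40, [("main.c", 30, []), ("util.c", 10, [])]),
])
for label, rect in slice_and_dice(tree, 0, 0, 100, 100):
    print(label, rect)   # each leaf's rectangle area is proportional to its size
```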
People, Paper and Computers - www.cs.umd.edu/~francois/ (led by Francois Guimbretiere): While pen and paper are easy to use, reliable and extremely versatile, they stand on the margins of the digital world. The Paper Augmented Digital Documents (PADD) system uses digital pens to bridge the gap between digital documents and their printouts. PADD is used by PapierCraft, a command selection system designed specifically for paper-based interactions. With PapierCraft, users can copy information from a printout to their notes (either on paper or on a tablet PC), "stitch" paper documents together, or create links between two documents. By promoting cohabitation between digital documents and paper, this system lets users enjoy the benefits of both media - and provides an infrastructure for better understanding the differences between paper and digital media. PapierCraft was built in collaboration with Ken Hinckley at Microsoft and Jim Hollan at UCSD, and has been integrated as part of the classroom presenter system in collaboration with Richard Anderson at University of Washington.
Recent successes include: the completion of a three-year $1 million partnership with Microsoft Research; recent graduates going to work at Google, Microsoft Research, and the Universities of Iowa, Buffalo, and South Carolina; awards to Ben Bederson from IBM, to François Guimbretière from NSF, and to graduate students Amy Karlson from Microsoft and Haixia Zhao from the American Association for University Women; and funding from a wide range of industry and government sponsors.
Finally, the HCIL sponsors an annual Symposium with about 300 attendees that come to learn about HCIL innovations and technologies. It is typically held the first week of June, and all students are welcome for free. See the HCIL website for details.

Computer Systems

Computer Systems provides the foundation upon which all other software applications rely. The goal of systems research is to develop the key abstractions and services that enable software to be efficiently and portably run on hardware. Areas of interest to the systems group include operating systems, computer networks, parallel and distributed computation, and computer security.
The systems group tackles problems using both theoretical and experimental approaches. To support our experimental work, the group maintains several laboratories in the Computer Science department and in UMIACS. The Laboratory for Parallel and Distributed Systems includes a collection of parallel computers and clusters to support systems research; current equipment includes a 24-processor SPARC SMP, an 8-processor IBM Power 4 system, and a 128-processor Myrinet-connected Linux cluster. The Distributed Systems Software Laboratory contains a flexible networking environment in which students can configure networking switches for experimental research, along with about 20 machines to support experiments.
The history of the systems group at Maryland dates back more than 30 years. One early member of the group, Yohan Chu, wrote the book "Computer Organization," which was the first major book on the subject and was used extensively at many universities and colleges. David Mills was an early researcher in computer networks. He set up an IMP for the ARPANet (the predecessor of the current Internet) in his basement, using a PDP 11/45 (an early mini-computer); at the time, this was the only full ARPANet node not located at a university or government facility. In the late 1970s, Chuck Rieger and Mark Weiser built ZMOB, an early parallel computer based on commodity microprocessors; the system consisted of 128 Z-80 processors.
Students and PostDocs from the systems group have gone on to faculty and industry positions around the world. Recent graduate students in faculty positions include: Gagan Agrawal (Ohio State University), Suman Banerjee (University of Wisconsin), Ugur Cetintemel (Brown University), Ibrahim Matta (Boston University), Bongki Moon (University of Arizona), Ron Larsen (Dean of the College of Information Science, University of Pittsburgh), and Sang Son (University of Virginia). Many of our former students have gone on to careers at major research labs, including AT&T Labs (Vijay Gopalakrishnan, Seungjoon Lee), Google (Ruggero Morselli), and IBM T.J. Watson (Henrique Andrade, I-Hsin Chung, Andrzej Kochut, Kyung Ryu). The group also has a rich history of PostDoc researchers who have gone on to successful careers; for example, Anurag Acharya and Guy Edjlali are now at Google.
Current faculty in the systems group include:
Bill Arbaugh's specialty is information security and privacy. In the past, information systems were large, expensive, barely inter-connected, and non-mobile. Today, information systems are small, lightweight, highly connected via wireless technology, and mobile. Tomorrow, they will be even smaller and more mobile. This constant evolution in the design and operation of information systems presents new and increasingly complex challenges for computer security. One technology he developed is Copilot, a security and management monitor that is capable of detecting potential intruders in a high-assurance manner and ensures that an intruder's activities do not compromise the main system. He is also the CEO of a start-up company, Komoku, Inc.
Ashok Agrawala, an AAAS Fellow, researches the basic nature of information and its implications for the design and implementation of computer systems. He developed the Information Dynamics Framework, which distinguishes between information and its representation, recognizing that computers deal only with the representations. In 2004, he won the University of Maryland's "Invention of the Year" award for the Horus technology. Horus, a novel location-determination technology developed with Moustafa Amin Youssef, uses unique algorithms to efficiently process the signal information used to determine position.
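As a rough illustration of signal-strength-based location determination, the sketch below matches an observed WiFi signal-strength vector against calibrated "fingerprints" of known locations using a simple nearest-neighbor rule. Horus itself is considerably more sophisticated (it is probabilistic and models the noise in the measured signals); the access points, rooms, and numbers below are all invented.
```python
def locate(fingerprints, observed):
    """Toy signal-strength fingerprinting: pick the calibration point whose
    recorded per-access-point signal strengths (in dBm) are closest, in
    Euclidean distance, to the currently observed ones. This deterministic
    nearest-neighbor match only conveys the general fingerprinting idea."""
    def distance(profile):
        return sum((profile[ap] - observed[ap]) ** 2 for ap in observed) ** 0.5
    return min(fingerprints, key=lambda loc: distance(fingerprints[loc]))

# Hypothetical calibration data: mean RSSI from three access points at three known spots.
fingerprints = {
    "room 1101": {"ap1": -42, "ap2": -71, "ap3": -80},
    "room 1112": {"ap1": -65, "ap2": -48, "ap3": -77},
    "hallway":   {"ap1": -58, "ap2": -60, "ap3": -55},
}
print(locate(fingerprints, {"ap1": -60, "ap2": -50, "ap3": -75}))   # -> "room 1112"
```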
Bobby Bhattacharjee's research interests are in the design and implementation of wide-area networking, distributed systems, and security protocols. His current focus is on the design of decentralized secure systems for multi-party applications and large-scale data distribution, especially in the context of peer-to-peer and overlay systems. His group has built systems that demonstrate protocols for scalable media streaming, randomized resilience, anonymous communication, bulk data delivery, multicast rekeying, distributed directory service, unstructured lookup, secure lookup, and predicate-based search. He is currently applying techniques from game theory and mechanism design to problems in wireless networking and peer-to-peer systems.
Jeff Hollingsworth's research is in the areas of tools for high-performance computing, program instrumentation, and programmer productivity. His work on high-performance computing includes tools to measure the performance of parallel programs and Active Harmony, a system that allows runtime automatic tuning of parameters and algorithms for applications and middleware. His work on program instrumentation includes the Dyninst tool suite, which provides multi-platform instrumentation of binaries via both offline binary editing and online program modification. His work on programmer productivity seeks to understand how parallel programmers spend their time, and how tools and programming environments can be improved to increase their productivity.
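The flavor of automatic parameter tuning can be conveyed with a toy example: time an application under several candidate values of a tunable parameter and keep the fastest. The sketch below is only that toy (the blocked_sum "application" and the candidate block sizes are invented); Active Harmony itself searches much larger parameter spaces adaptively while the application keeps running.
```python
import time

def autotune(run, candidates):
    """Toy runtime tuner: try each candidate value of a tunable parameter,
    measure how long the application takes with it, and keep the fastest."""
    best, best_time = None, float("inf")
    for value in candidates:
        start = time.perf_counter()
        run(value)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = value, elapsed
    return best

def blocked_sum(block_size, data=list(range(200000))):
    """A stand-in 'application' whose speed depends on a block-size parameter."""
    total = 0
    for start in range(0, len(data), block_size):
        total += sum(data[start:start + block_size])
    return total

print(autotune(blocked_sum, [16, 256, 4096, 65536]))   # whichever block size ran fastest on this machine
```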
Pete Keleher's research is primarily in the field of distributed systems. His work spans shared memory protocols, adaptive grid schedulers, peer-to-peer systems (including distributed searching and ranking algorithms), and gossip-based consensus algorithms. Currently, much of his work is in the context of storage systems, specifically a wide-area file system called MoteFS. MoteFS is a novel system structured around the concepts of lightweight snapshots and principal-free capabilities. The combination allows extremely efficient and fine-grained control over access to the data storage.
Udaya Shankar's research interests are in the design and analysis of distributed systems and network protocols, in both their correctness and performance aspects. His correctness work deals with compositional methods for specification, verification, and testing of concurrent (including distributed) systems, focusing on realistic problems, especially at the transport and routing layers. He is currently working on a "programmer-friendly" compositional framework called SeSF that can be used for system design as well as incorporated into existing concurrent programming languages (Java, C#). His performance work deals with analytical, simulation, and experimental evaluation of queuing models of networking systems, under both steady-state and transient conditions. He is currently working on timestep stochastic simulation (TSS), developing a method to compute sample paths of general queuing networks with state-dependent delayed feedback (e.g., TCP/IP/WLAN networks) with the accuracy of packet-level simulation but at a cost several orders of magnitude lower.
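For a flavor of the kind of queuing models involved, the sketch below simulates the simplest such model, an M/M/1 queue, and compares the measured average time in system against the textbook formula 1/(mu - lambda). The arrival and service rates are arbitrary illustrative values, and this packet-by-packet simulation is exactly the kind of costly baseline that faster analytical and timestep methods aim to match.
```python
import random

def mm1_mean_delay(arrival_rate, service_rate, num_jobs=200000):
    """Discrete-event simulation of an M/M/1 queue: Poisson arrivals, a single
    exponential server, FIFO service. Returns the average time a job spends
    in the system, which for lambda < mu should approach 1 / (mu - lambda)."""
    clock = 0.0            # arrival time of the current job
    depart = 0.0           # time at which the server next becomes free
    total_delay = 0.0
    for _ in range(num_jobs):
        clock += random.expovariate(arrival_rate)           # next arrival
        start = max(clock, depart)                          # wait if the server is busy
        depart = start + random.expovariate(service_rate)   # service completion
        total_delay += depart - clock
    return total_delay / num_jobs

random.seed(3)
print(mm1_mean_delay(0.8, 1.0))   # simulated value; theory predicts 1 / (1.0 - 0.8) = 5.0
```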
Neil Spring's current research focuses on the design of network protocols that are self-diagnostic: protocols that embed features to expose and explain faults directly to users and administrators. Neil uses implementation as a primary means of evaluating his research ideas: he has constructed software to measure ISP network topologies efficiently and accurately, software to support arbitrary but safe network measurement, and software that couples instrumentation probing with data transfer in the nearly ubiquitous TCP. Providing useful conclusions about network structure and behavior requires aggregating various sources of information: routing protocol information from BGP, information embedded in host names in DNS, information returned by routers in packet identifiers, source addresses, and record-route entries, and so on. Each source of information can, by virtue of varied implementation and configuration decisions, provide false data; a current challenge is building the reasoning logic to find the best explanations for measured Internet data.
Alan Sussman's main research area is in software tools for high performance parallel and distributed computing, which is now widely known as Grid computing. Within that broad area, one major research interest relates to interoperability of parallel (and sequential) programs, in particular how that can be applied to complex coupled physical simulations. That interest ties in closely to related interests in software component technologies, in particular how they can be applied to high-end supercomputing applications. Another major research interest involves various types of runtime and compiler support for high performance data intensive applications, and their relation to high performance database systems. A more recent research area is in applying peer-to-peer computing techniques to high-end computing problems, particularly focusing on utilizing desktop computers effectively to perform large-scale computations. All parts of his research program are strongly motivated by high-end applications, with a current emphasis on space science, astronomy, and earth science, driven by collaborations with scientists working in those areas.
The Systems group receives support from the Department of Defense, Department of Energy, NASA, and the National Science Foundation. Additional support is provided by industrial partners including DoCoMo, Fujitsu, IBM, Microsoft, Samsung, and Sun Microsystems.