Research Information - Fall 2004

Trust No One: Building Secure Software

By Philip Fong

As we increasingly rely on computing infrastructures that are networked and thus open to security threats of all sorts, our society has acquired, in recent years, a better appreciation for the need for software security. One emerging challenge arises from the growing popularity of Dynamically Extensible Software Systems, such as mobile code language environments, scriptable applications, and software systems with “hot plug-in” architectures. In such systems, executable extensions can be dynamically linked into the address space of a host software system, either to deliver a short-lived service or to augment the capability of the underlying host in a permanent manner. If adopted unchecked, malicious software extensions could compromise the data confidentiality, system integrity, and service availability of the host.

As a cross-disciplinary subject, the security of Dynamically Extensible Software Systems has been studied by researchers from various fields, including Computer Security, Programming Languages, and Operating Systems. Although the problem is now much better understood, and occasional inroads have been made, the ability to augment a complex software system securely at run time remains a holy grail for both researchers and practitioners. Armed with a sensitivity to the way complex software systems are constructed and configured, and equipped with time-tested methodologies, notations, tools, and design principles, the Software Engineering community has much to offer in addressing the unique security challenges raised by the need for software extensibility. This research agenda belongs to an area that has been dubbed Trusted Software Engineering.

Over the years, researchers have proposed four approaches to protection for software extensions.
  1. Discretion refers to the use of cryptographic techniques such as digital signatures to vouch for the security of untrusted code. This approach is by itself not secure, because the semantics of a signature are not well defined. Researchers have recently attempted to physically bind a digital signature to a formal verification algorithm using trusted hardware.

  2. Arbitration refers to the use of a trusted middleman to handle sensitive resources on behalf of untrusted code. The employment of a virtual machine or an execution monitor is a good example of arbitration. Recent scholarship has turned to studying the inherent computational limitations of execution monitors: how does one characterize the class of security policies that can be enforced by an execution monitor endowed with a specific computational resource?

  3. Verification refers to the use of static program analyses, type checking, theorem proving, or model checking to scan untrusted code for potentially unsafe behavior. Various augmented type systems have been proposed recently to express security policies in programming languages such as Java.

  4. Transformation refers to the use of automatic program transformation to rewrite untrusted code so as to eliminate unsafe behavior. This complements the verification approach: if you cannot show that it is safe, then tweak it till it is. An interesting recent proposal has been to apply program transformation to inline execution monitors into untrusted code.
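
To make the transformation approach concrete, the following is a minimal sketch in Python, assuming an invented policy that untrusted extension code may only open files under /tmp. It rewrites the extension's syntax tree so that every call to open goes through an inlined check; real systems inline reference monitors into compiled code against far richer policies.

```python
import ast

POLICY_PREFIX = "/tmp/"  # hypothetical policy: only files under /tmp may be opened

def guarded_open(path, *args, **kwargs):
    """The inlined check: refuse any open() outside the allowed directory."""
    if not str(path).startswith(POLICY_PREFIX):
        raise PermissionError(f"policy violation: open({path!r})")
    return open(path, *args, **kwargs)

class InlineMonitor(ast.NodeTransformer):
    """Rewrite every call to open(...) into guarded_open(...)."""
    def visit_Call(self, node):
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == "open":
            node.func = ast.Name(id="guarded_open", ctx=ast.Load())
        return node

def harden(source):
    """Parse untrusted source, inline the monitor, and compile the result."""
    tree = InlineMonitor().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return compile(tree, "<untrusted>", "exec")

# An "untrusted extension" that tries to read a sensitive file.
untrusted = "f = open('/etc/passwd')"
try:
    exec(harden(untrusted), {"guarded_open": guarded_open})
except PermissionError as e:
    print("blocked:", e)
```

Note that this toy rewriter is easily circumvented (for instance, by aliasing open under another name); a production inliner must close such evasion channels.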

Software security is a rapidly growing field, with practical applications and challenging open problems. Research in this area involves both mathematical modeling and system building: formalism is involved because it is the only way to certainty, and implementation is a must because there is no room for speculation in security. If you think you are interested in doing research in this area, please contact Philip Fong at

Computer Audio Topics and Applications

By David Gerhard

As with many topics in Computer Science, the motivation behind computer audio is split between technology and applications. Familiar applications of computer audio include speech recognition, music, and game sound. This article will discuss each of these applications, as well as some of the technology behind them, to educate and perhaps to spark interest in research. This work is highly interdisciplinary: several individuals and research groups here on campus, in the departments of music and media production and studies, are also working in this area.

Speech recognition has been a goal of the artificial intelligence community for more than 50 years. When computers first began to show their ability to process information quickly and accurately, it was believed that speech recognition, being simply an information processing task, would be an appropriate and straightforward application for these new machines. Science fiction writers such as Isaac Asimov, Arthur C. Clarke, and Gene Roddenberry envisioned machines which could have fluent conversations with humans, and even wonder at their own existence. It seems that understanding and creativity are the bottlenecks of the intelligent machine, and while those tasks have not yet been realized, a more unexpected difficulty has been the dual task of understanding and generating human speech. As the efforts of large corporations like IBM have made clear, such tasks are much more difficult than they first appeared. It is tempting to believe that since most humans are able to communicate verbally even when they are quite young, verbal communication must be a task that can be encoded into a machine; however, scientists are discovering that the ability of young children to speak is contingent on many deeply coded, evolutionarily designed brain structures which are present from birth. If we can apply some of these new findings from early childhood development to computational speech recognition, it may be possible to create machines that are capable of understanding speech. The question revolves around how to develop the built-in brain structures that allow children to be so proficient at learning languages. Such research may be applicable to all facets of machine learning.

Current speech recognition systems are tasked only with the job of writing down what we say. We must use a special microphone fitted very close to our mouths, we must train the machine to recognize our voice (and, as a consequence of the training, ours alone), and we must expect that should we have a cold, or should there be excessive noise in the room, the machine will perform poorly. This is odd and somewhat frustrating considering that humans can understand conversations held in noisy environments and need no new training when speaking to a new person. Humans can also easily resolve ambiguities in language which are difficult for machines to resolve. For example, the phrases “Youth in Asia” and “euthanasia” sound identical and can be distinguished only by analyzing the context in which they appear.

Typically, speech recognition proceeds in several stages. First, the incoming sound is analyzed to determine the phonemes being used. These individual language sounds are arranged into diphones and triphones, groups of two or three of these sounds. These groupings are then compared against statistical models to judge which are most likely. Language models are then applied to these groups to discover which word, phrase, or sentence is most likely, given the initial acoustic signal. Many researchers are investigating new ways of performing each of these stages, but the overall process remains the same. Maybe YOU have some new ideas that could be applied to speech recognition!
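
To illustrate the language-model stage, here is a minimal sketch in Python with an invented bigram probability table (all numbers are made up), showing how context can resolve the “Youth in Asia” versus “euthanasia” ambiguity mentioned earlier: after a word like “discussed”, the single-word reading scores higher.

```python
import math

# Toy bigram language model: P(word | previous word). All probabilities invented.
BIGRAM = {
    ("discussed", "euthanasia"): 0.01,
    ("discussed", "youth"): 0.0001,
    ("youth", "in"): 0.3,
    ("in", "asia"): 0.05,
    ("euthanasia", "</s>"): 0.2,
    ("asia", "</s>"): 0.2,
}
FLOOR = 1e-6  # small probability for bigrams the model has never seen

def sentence_logprob(words, context="discussed"):
    """Sum log-probabilities of each word given its predecessor."""
    logp, prev = 0.0, context
    for w in words + ["</s>"]:       # </s> marks the end of the utterance
        logp += math.log(BIGRAM.get((prev, w), FLOOR))
        prev = w
    return logp

# Two transcriptions that are acoustically identical:
a = ["euthanasia"]
b = ["youth", "in", "asia"]
best = max([a, b], key=sentence_logprob)
print("chosen:", " ".join(best))  # → chosen: euthanasia
```

A real recognizer combines such language-model scores with the acoustic scores from the earlier stages, searching over many thousands of candidate word sequences at once.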

Since the creation of the digital computer, musicians have seen an opportunity to augment their art. Computers allowed high-quality recording and playback in a way that magnetic tape could not match. Today, the most common interaction method for music, and in fact any recorded sound, is the compact disc, on which music is stored in digital format. With the rise of internet music and MP3-based compression, a new era of music storage and retrieval has arrived which allows individuals to carry hundreds of hours of music with them in a machine the size of a deck of cards. The next problem? How do you find what you are looking for amongst all of that music? Modern research into computer audio includes Music Information Retrieval, where the task is not just to store music but to index and retrieve songs based on the audio content. It is, of course, possible to search by artist, album and genre, but this information must be hand-coded at some point. Much better would be a system which could search through and characterize the music in a collection, so you could say “Play me some o’ that ol’ country music!” and it would generate a playlist based on the acoustic qualities of the music. Or you could find the name of that song that has been driving you mad just by humming it to the computer.
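
One classic, lightweight way to match a hummed query against a collection is the Parsons code, which throws away exact pitches and keeps only the melodic contour: Up, Down, or Repeat. The sketch below, using a tiny invented two-song database of MIDI note numbers, matches a deliberately off-key hum by comparing contours with edit distance; real music information retrieval systems use much richer acoustic features.

```python
def parsons(pitches):
    """Encode a melody's pitch contour as Parsons code: U(p), D(own), R(epeat)."""
    code = ""
    for prev, cur in zip(pitches, pitches[1:]):
        code += "U" if cur > prev else "D" if cur < prev else "R"
    return code

def edit_distance(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

# Tiny invented "database": opening melodies as MIDI note numbers.
SONGS = {
    "Ode to Joy":      [64, 64, 65, 67, 67, 65, 64, 62],
    "Twinkle Twinkle": [60, 60, 67, 67, 69, 69, 67],
}

def query_by_humming(hummed_pitches):
    """Return the song whose contour is closest to the hummed contour."""
    q = parsons(hummed_pitches)
    return min(SONGS, key=lambda name: edit_distance(q, parsons(SONGS[name])))

# A hum of Twinkle Twinkle, every note off-key, still matches by contour alone.
print(query_by_humming([62, 62, 70, 70, 71, 71, 69]))  # → Twinkle Twinkle
```

Because only the up/down shape of the melody is compared, the hummer's key, tuning, and absolute pitch accuracy do not matter at all, which is exactly what a query-by-humming interface needs.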

Computers are used throughout the music making industry as well. Do you think Britney Spears hits all the right notes every time? No way. She uses computer audio technology to “auto-tune” herself while she sings. So do many other artists on stage today. Computers are used to generate perfectly rhythmic drum beats, to play effects and samples through synthesizers, and even some DJs who scratch vinyl have converted to digital turntables which scratch sound files instead of records. Computers are even used to compose new music or to automatically add harmonies to a composer’s melody line.
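
The core arithmetic behind the simplest form of pitch correction is just snapping a detected frequency to the nearest note of the equal-tempered scale. A minimal sketch, assuming an A4 reference of 440 Hz (detecting the pitch in the first place is the hard part, and is omitted here):

```python
import math

def snap_to_semitone(freq_hz, a4=440.0):
    """Snap a frequency to the nearest note of the equal-tempered scale,
    where adjacent semitones differ by a factor of 2**(1/12)."""
    semitones = 12 * math.log2(freq_hz / a4)   # distance from A4 in semitones
    return a4 * 2 ** (round(semitones) / 12)   # round to the nearest semitone

# A note sung 30 cents sharp of A4 is pulled back to 440 Hz.
sung = 440.0 * 2 ** (0.30 / 12)   # about 447.7 Hz
print(round(snap_to_semitone(sung), 1))  # → 440.0
```

A real auto-tuner must also resample or phase-shift the audio to apply this corrected frequency smoothly, without the robotic artifacts that come from snapping too abruptly.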

Video games have often concentrated on graphics, relegating sound to a secondary role. Early video games had a few blips and beeps as chunky red squares moved around the screen. Some innovative game designers added a musical soundtrack to keep interest. Remember Asteroids? The “Boop ... Boop ...” that got faster and faster as the game progressed? Game sound has evolved since then, but sound has often been given much less processing power than it could use. Even modern games resort to playback of wavetables for sound effects, and we humans can tell when we’ve heard the same engine noise or gunshot a million times before.

So what’s new in the world of computer audio? 3-D game sound that simulates a real environment, with reverb and sound fields, played through a 5.1 or 10.2 sound system. (Current game consoles can play 5.1 surround sound, but it is often a stereo sound re-mixed to 5.1.) Music information retrieval which can tell you what you are listening to without ID3 tags. Speech recognition that actually works. Conversational dialog systems that will replace the “press 3 for more options” systems. Audio user interfaces that require no screen or keyboard. Robotic musical instruments that play themselves.
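
As a taste of what 3-D game sound involves, here is a minimal sketch of two of its simplest ingredients: inverse-distance attenuation and constant-power stereo panning for a point source. The model and parameter choices are illustrative only; real engines add reverberation, Doppler shift, occlusion, and full multi-speaker sound fields.

```python
import math

def pan_and_attenuate(sample, source_x, source_z, listener_x=0.0, listener_z=0.0):
    """Very rough point-source model: inverse-distance gain plus
    constant-power stereo panning from the source's horizontal angle."""
    dx, dz = source_x - listener_x, source_z - listener_z
    dist = max(math.hypot(dx, dz), 1.0)   # clamp so gain never blows up near 0
    gain = 1.0 / dist                     # inverse-distance attenuation
    angle = math.atan2(dx, dz)            # -pi..pi, 0 = straight ahead
    pan = (angle / math.pi + 1.0) / 2.0   # 0 = hard left, 1 = hard right
    left = sample * gain * math.cos(pan * math.pi / 2)
    right = sample * gain * math.sin(pan * math.pi / 2)
    return left, right

# A source two metres away, straight ahead: equal power in both channels.
l, r = pan_and_attenuate(1.0, 0.0, 2.0)
print(round(l, 3), round(r, 3))  # → 0.354 0.354
```

Constant-power panning (the cosine/sine pair) keeps the perceived loudness steady as a source sweeps from one side to the other, which a naive linear crossfade does not.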

