A Algorithms and Computational Biology Lab

Home | People | Projects | Directions | Colloquia | Misc

Colloquia

Fall 2006

Wednesday 4:00 pm - 5:00 pm @ EBU II Room 203

The refreshment will be provided by

Date Speaker Topic
Dec 6 Marek Chrobak

Title: "The doubling method to design approximation algorithms"


Abstract: This talk is based on an expository article by Claire Keyon and myself, in which we show that there is a number of approximation algorithms in the literature that use essentially the same (but yet not explicitely identified) technique that we refer to as "doubling". The essence of this approach is to use exponentially increasing estimates on the optimal solution to design approximate solutions. It can be used both in offline and online approximation algorithms. Applications include several clustering, searching, facility location, and scheduling problems.

Nov. 29 Prof. Haixu Tang

Title: "De novo identification of LTR retrotransposons in eukaryotic genomes"


Abstract: LTR retrotransposons are a class of mobile genetic elements containing two similar long terminal repeats (LTRs). Currently, LTR retrotransposons are annotated in eukaryotic genomes mainly through the conventional homology searching approach. Hence, it is limited to annotating known elements. Here, we report a de novo computational method that can identify new LTR retrotransposons without relying on a library of known elements. Our method consists of three steps, which identify young intact, old intact and solo LTR retroelements separately. We tested this method on the two pairs of eukaryotic genomes, C. elegans vs. C. briggsae and D. melanogaster vs. D. pseudoobscura. LTR retrotransposons in C. elegans and D. melanogaster have been intensively studied using conventional annotation methods. Comparing with previous work, we identified many new intact LTR retroelements and new putative families, which may imply that there may still be new retroelements that are left to be discovered even in well-studied organisms. Finally, the phylogenetic and chromosomal distributions of the identified elements are discussed.

Nov. 22 Serdar Bozdag

Title: "The importance of FPC and CORAL in the Barley Project"


Abstract:Barley ranks fourth among the cereals in worldwide production. It is an important crop for direct human consumption and for animal feed. In the Barley project, we aim to access gene-bearing regions of the genome. One of the most important phases of this project is assembling thousands of fingerprinted clones into contigs to build a physical map of barley. For this purpose, we use a software named FPC. FPC (FingerPrinted contigs) is an interactive program for building contigs from fingerprinted clones. FPC uses an algorithm to cluster clones into contigs based on their probability of coincidence score. FPC has editing facilities for the user to refine the coordinates and to remove poorly fingerprinted clones. However, a considerable amount of human intervention is required to improve the quality of the physical map built by FPC. Another group has developed a software named CORAL (Clone ORdering ALgorithm) to automatically order fingerprinted clones within contigs. They test CORAL on maps that have previously been manually edited and on maps derived from in silico simulations. Measurements suggest that CORAL is a significant improvement to FPC. In this talk, I will present the algorithms of FPC and CORAL. I will also talk about why these programs are important in the Barley project.

Nov. 15 Mathilde Hurand




Abstract:A popular approach in combinatorial optimization is to model problems as integer linear programs. Ideally, the relaxed linear program would have only integer solutions, which happens for instance when the constraint matrix is totally unimodular. Still, sometimes it is possible to build an integer solution with same cost from the fractional solution. Examples are two scheduling problems and the single disk prefetching/caching problem. We show that problems such as the three previously mentioned can be separated into two subproblems: (1) finding an optimal feasible set of slots, and (2) assigning the jobs or pages to the slots. It is straigthforward to show that the latter can be solved greedily. We are able to solve the former with a totally unimodular linear program, from which we obtain simple combinatorial algorithms with improved worst case running time. On the presentation I will briefly present total unimodularity, then show how we created 2 totally unimodular linear programs ( one problem of scheduling and a problem of caching), and show the parrallel with minimum flow problems.

Nov. 8 Christos Koufogiannakis

Title: "Bounded Degree Minimum Spanning Trees"

by Michel X. Goemans (MIT)

FOCS 2006

Abstract:We consider the minimum cost spanning tree problem under the restriction that all degrees must be at most a given value k. We show that we can efficiently find a spanning tree of maximum degree at most k +2 whose cost is at most the cost of the optimum spanning tree of maximum degree at most k. This is almost best possible. The approach uses a sequence of simple algebraic, polyhedral and combinatorial arguments. It illustrates many techniques and ideas in combinatorial optimization as it involves polyhedral characterizations, uncrossing, matroid intersection, and graph orientations (or packing of spanning trees). The result generalizes to the setting where every vertex has both upper and lower bounds and gives then a spanning tree which violates the bounds by at most two units and whose cost is at most the cost of the optimum tree.

Nov. 1 Yonghui Wu

Title: "Kernel-Based data fusion and its application to protein function prediction in yeast"

by G. R. G. Lanckriet, M. Deng, N. Cristianini, M. I. Jordan, and W. S. Noble

Pacific Symposium on Biocomputing 9:300-311(2004)

Abstract: Kernel methods provide a principled framework in which to represent many types of data, including vectors, strings, trees and graphs. As such, these methods are useful for drawing inferences about biological phenomena. We describe a method for combining multiple kernel representations in an optimal fashion, by formulating the problem as a convex optimization problem that can be solved using semide¯nite programming techniques. The method is applied to the problem of predicting yeast protein functional classi¯cations using a support vector machine (SVM) trained on ¯ve types of data. For this problem, the new method performs better than a previously-described Markov random ¯eld method, and better than the SVM trained on any single type of data.

Oct. 25 Qi Fu

Title: "Algorithmic approaches to selecting control clones in DNA array hybridization experiments"

Abstract: We study the problem of selecting control clones in DNA array hybridization experiments. The problem arises in the OFRG method for analyzing microbial communities. The OFRG method performs classification of rRNA gene clones using binary fingerprints created from a series of hybridization experiments, where each experiment consists of hybridizing a collection of arrayed clones with a single oligonucleotide probe. This experiment produces analog signals, one for each clone, which then need to be classified, that is, converted into binary values 1 and 0 that represent hybridization and non-hybridization events. Besides the sample clones, the array contains a number of control clones needed to calibrate the classification procedure of the hybridization signals. These control clones must be selected with care to optimize the classification process. We formulate this as a combinatorial optimization problem called Balanced Covering. We prove that the problem is NP-hard and we show some results on hardness of approximation. We propose an approximation algorithm based on randomized rounding and we show that, with high probability, it approximates well the optimum. The experimental results confirm that the algorithm finds high quality control clones. The algorithm has been implemented and is publicly available as part of the software package called CloneTools.

Oct. 25 Zheng Fu

Title: "Computing the Breakpoint Distance between Partially Ordered Genomes"

Abstract: The total order of the genes or markers on a chromosome is crucial for most comparative genomics studies. However, the current gene mapping efforts might only suffice to provide a partial order of the genes on a chromosome. Several different genes or markers might be mapped at the same position due to the low resolution of gene mapping or missing data. Moreover, conflicting datasets might give rise to the ambiguity of gene order. In this paper, we consider the reversal distance and breakpoint distance problems for partially ordered genomes. We first prove that these problems are NP-hard, and then give an efficient heuristic algorithm to compute the breakpoint distance between partially ordered genomes. The algorithm is based on an efficient approximation algorithm for a natural generalization of the well-known feedback vertex set problem, and has been tested on both simulated and real biological datasets. The experimental results demonstrate that our algorithm is quite effective for estimating the breakpoint distance between partially ordered genomes and for inferring the gene (total) order.

Oct. 18 Vladimir Vacic

Title: "An Important Connection Between Network Motifs and Parsimony Models"

by Teresa M. Przytycka (RECOMB 2006)

Abstract: We demonstrate an important connection between network motifs in certain biological networks and validity of evolutionary trees constructed using parsimony methods. Parsimony methods assume that taxa are described by a set of characters and infer phylogenetic trees by minimizing number of character changes required to explain observed character states. From the perspective of applicability of parsimony methods, it is important to assess whether the characters used to infer phylogeny are likely to provide a correct tree. We introduce a graph the- oretical characterization that helps to select correct characters. Given a set of characters and a set of taxa, we construct a network called character overlap graph. We show that the character overlap graph for characters that are appropriate to use in parsimony methods is char- acterized by significant under-representation of subnetworks known as holes, and provide a mathematical validation for this observation. This characterization explains success in constructing evolutionary trees using parsimony method for some characters (e.g. protein domains) and lack of such success for other characters (e.g. introns). In the latter case, the un- derstanding of mathematical obstacles to applying parsimony methods in a direct way has lead us to a new approach for dealing with inconsistent and/or noisy data. Namely, we introduce the concept of persistent char- acters which is similar but less restrictive than the well known concept of pairwise compatible characters. Application of this approach to introns produces the evolutionary tree consistent with the Coelomata hypoth- esis. In contrast, the direct application of a parsimony method, using introns as characters, produces a tree which is inconsistent with any of the two competing evolutionary hypotheses. Similarly, replacing persis- tence with pairwise compatibility does not lead to a correct tree. This indicates that the concept of persistence provides an important addition to the parsimony metohds.

Oct. 11
Lan Liu

Title: "Fast Elimination of Redundant Linear Equations and Reconstruction of Recombination-Free Mendelian Inheritance on a Pedigree"

by J. Xiao, L. Liu, L. Xia, T. Jiang. (SODA 2007)

Abstract: Computational inference of haplotypes from genotypes has attracted a great deal of attention in the computational biology community recently, partially driven by the international HapMap project. In this paper, we study the question of how to efficiently infer haplotypes from genotypes of individuals related by a pedigree, assuming that the hereditary process was free of mutations ( i.e. the Mendelian law of inheritance) and recombinants. The problem has recently been formulated as a system of linear equations over the finite field of $F(2)$ and solved in $O(m^3n3)$ time by using standard Gaussian elimination, where $m$ is the number of loci (or markers) in a genotype and $n$ the number of individuals in the pedigree. We give a much faster algorithm with running time $O(mn2 + n3\log2 n \log\log n)$. The key ingredients of our construction are (i) a new system of linear equations based on some spanning tree of the pedigree graph and (ii) an efficient method for eliminating redundant equations in a system of $O(mn)$ linear equations over $O(n)$ variables. Although such a fast elimination method is not known for general systems of linear equations, we take advantage of the underlying pedigree graph structure and recent progress on low-stretch spanning trees.

Click here for the paper

Winter 2006

Date Speaker Topic
Mar. 13
Zheng Liu

Title: "Combinatorial and Statistical Approaches for Oligonucleotide Fingerprinting of Ribosomal RNA Genes (OFRG)

Abstract: Oligonucleotide Fingerprint of Ribosomal RNA Genes (OFRG) is a high-throughput, microarray-based approach for identifying microorganisms. This method was developed by an interdisciplinary team of researchers at
the University of California at Riverside, and has been successfully applied to the analysis of bacterial and fungal communities. In this talk, I will present an overview of the OFRG approach and I will discuss two of the numerous computational problems that arise in OFRG, along with proposed solutions.

The first problem is a variant of approximate string matching. In genomic databases, gene sequences within a particular gene family are usually sandwiched between a specific pair of primers. In order to extract desired gene sequences with a given pair of primers, we have designed a fast algorithm for approximate string matching, called FAAST. FAAST generalizes the Tarhio-Ukkonen algorithm by requiring two or more matches when calculating shift distances. Both theoretical analysis and experimental results demonstrate a significant speed-up without loss of accuracy achieved by the algorithm.

he second challenge arises in the analysis of microarray data. In OFRG, the presence of specific nucleotides is determined based on intensity values of hybridization signals. Due to noise and technological limitations, these signals are sometimes too ambiguous for a reliable classification. In such situation, the traditional Bayes classification method would often lead to invalid predictions, affecting the accuracy of the OFRG results. To address this problem, we introduce a Modified Bayes Rule (MBR) which allows a “no prediction” result. MBR uses a cost structure to weigh the penalty for not making a definite prediction against the penalty for making an incorrect definite prediction. Experiments show that our method outperforms a neutral-zone rule that has been routinely used in this type of applications.

Mar. 20 Zheng Fu

Title: "A parsimony approach to genome-wide ortholog assignment."

by Zheng Fu, Xin Chen, Vladimir Vacic, Peng Nan, Yang Zhong, and Tao Jiang (RECOMB 2006)

Click here for the paper

Fall 2004

Date Speaker Topic
Oct. 15 Xin Chen

Title: "Predicting gene expression from sequence"

by Michael A. Beer, Saeed Tavazoie (Cell, Vol. 117, 185-198, April 16, 2004)

Abstract: We describe a systematic genome-wide approach for learning the complex combinatorial code underlying gene expression. Our probabilistic approach identifies local DNA-sequence elements and the positional and combinatorial constrains that determine their context-dependent role in transcriptional regulation. The inferred regulatory rules correctly predict expression patterns for 73% of genes in Saccharomyces cerevisiae, utilizing microarray expression data and sequences in the 800 bp upstream of genes. Application to Caenorhabditis elegans identifies predictive regulatory elements and combinatorial rules that control the phased temporal expression of transcription factors, histones, and germline specific genes. Successful prediction requires diverse and complex rules utilizing AND, OR, and NOT logic, with significant constraints on motif strength, orientation, and relative position. This system generates a large number of mechanistic hypotheses for focused experimental validation, and establishes a predictive dynamical framework for understanding cellular behavior from genomic sequence.

Oct. 22 Christoph Durr

Title: "Quantum algorithms for graph connectivity"

Abstract: Some intuition about quantum mechanics is given, definition of the model of quantum computation, explanation of Grover's search algorithm and application to graph connectivity.

Nov. 05 Haifeng Li

Title: "Minimum Entropy Clustering and Applications to Gene Expression Analysis"

Abstract: Clustering is a common methodology for analyzing the gene expression data. In this paper, we present a new clustering algorithm from an information-theoretic point of view. First, we propose the minimum entropy (measured on {\it a posteriori} probabilities) criterion, which is the conditional entropy of clusters given the observations. Fano's inequality indicates that it could be a good criterion for clustering. We generalize the criterion by replacing Shannon's entropy with Havrda-Charvat's structural $\alpha$-entropy. Interestingly, the minimum entropy criterion based on structural $\alpha$ -entropy is equal to the probability error of the nearest neighbor method when $\alpha=2$. This is another evidence that the proposed criterion is good for clustering. With a nonparametric approach for estimating {\it a posteriori} probabilities, an efficient iterative algorithm is then established to minimize the entropy. The experimental results show that the clustering algorithm performs significantly better than $k$-means/medians, hierarchical clustering, SOM, and EM in terms of adjusted Rand index. Particularly, our algorithm performs very well even when the correct number of clusters is unknown. In addition, most clustering algorithms produce poor partitions in presence of outliers while our method can correctly reveal the structure of data and effectively identify outliers simultaneously.

Nov. 12 Marek Chrobak Title: "Online Scheduling of Equal-Length Jobs:Randomization and Restarts Help"

           Abstract:We consider the following scheduling problem. The input is a set of jobs with equal processing times, where each job is specified by its release, time and deadline. The goal is to determine a single-processor, non-preemptive schedule that maximizes the number of completed jobs. In the online version, each job arrives at its release time. We give two online algorithms with competitive ratios below 2 and show several lower bounds on the competitive ratios. First, we give a barely random 5/3-competitive algorithm that uses only one random bit. We also show a lower bound of 3/2 for barely random algorithms that choose one of two deterministic algorithms. Second, we give a deterministic 3/2-competitive algorithm in the model that allows restarts, and we show that in this model the ratio 3/2 is optimal. For randomized algorithms with restarts we show a lower bound of 6/5. This is joint work with W.Jawor, J.Sgall, and T.Tichy.

Nov. 19 Petr Kolman Title: "Duality of multiroute flows and cuts"

           Abstract:A classical flow is a nonnegative linear combination of unit flows along simple paths. A~multiroute flow, first considered by Kishimoto and Takeuchi, generalizes this concept. The basic building blocks are not single paths with unit flows but rather tuples consisting of $k$ edge disjoint paths, each path with a unit flow. A multiroute flow is a nonnegative linear combination of such tuples. We present a simple combinatorial proof of the duality theorem for multiroute flows and cuts and its corollary which characterizes multiroute flows in terms of classical flows. Specifically, we show that a (classical) flow of size $F$ is a $k$-flow if and only if the flow through every edge is at most $F/k$. This duality then immediately yields an efficient algorithm. This is joint work with Amitabha Bagchi, Amitabh Chaudhary and Ji\v{r}\'\i\ Sgall.

Spring 2004

Date Speaker Topic
Apr. 12 Jie Zheng Title: "De novo Repeat Classification and Fragment Assembly."
Authors: Pavel A. Pevzner, Haixu Tang, Glenn Tesler.
RECOMB 2004.

           Abstract: "Repetitive sequences make up a significant fraction of almost any genome and an important and still open question in bioinformatics is how to represent all repeats in DNA sequences. We propose a radically new approach to repeat classficiation that is motivated by the fundamental topological notion of quotient spaces. A torus or Kein bottle are examples of quotient spaces that can be obtained from a square by gluing some points. Our new repeat classification algorithm is based on the observation that the alignment-induced quotient space of a DNA sequence compactly represents all sequence repeats. This observation leads to a simple and efficient solution of the repeat classification problem as well as new approaches to fragment assembly and multiple alignment."

Apr. 19 No Meeting  
Apr. 26 Qiaofeng Yang Title: "Discovering temporal relations in molecular pathways using protein-protein interactions."
Authors: Martin Farach-Colton, Yang Huang, John L. Woolford.
RECOMB 2004.

           Abstract: The availability of large-scale protein-protein interactin data provides us with many opportunities to study molecular pathways involving proteins. In this paper they propose to mine temporal relations in molecular pathways by protein-protein interaction data. In particular, they model the assembly pathways of protein complexes with interval graphs and determine the temporal order of joining the pathway for protein by ordering the vertices in the interaction graph. They develop a tool called XRONOS to perform such a computation. They then apply XRONOS to the ribosome assembly pathway and present validation results for the obtained ordering. The results are promising and show the potential usage for XRONOS in the study of molecular pathways.

May 3 Qi Fu Title: "ALIGNING ALIGNMENTS EXACTLY"
From: RECOMB 2004.

           Abstract: A basic computational problem that arises in both the construction and local-search phases of the best heuristics for multiple sequence alignment is that of aligning the columns of two multiple alignments. When the scoring function is the sum-of-pairs objecive and induced pairwise alignments are evaluated using linear gap-costs, this problem is called Aligning Alignments. While seemingly a straightforward extension of two-sequence alignment, it is actually proved to be NP-complete. As explained in the paper, this provides the first demonstration that minimizing linear gap-costs, in the context of multiple sequence alignment, is inherently hard.

           The paper also develop an exact algorithm for Aligning Alignments that is remarkably efficient in practice, both in time and space. Even though the problem is NP-complete, computational experiments on both biological and simulated data show we can compute optimal alignemtns for all benchmark instances in two standard datasets, and solve very large random instances with highly-gapped sequences.

May 10 Zheng Liu Title: "Disigning Multiple Simultaneous Seeds for DNA Similarity Search"
Authors: Yanni Sun, Jeremy Buhler
From: RECOMB 2004.

           Abstract: The challegne of similarity search in massive DNA sequence databases has inspired major changes in BLAST-style alignment tools, which accelerate serach by inspecting only pairs of sequences sharing a common short "seed", or pattern of matching residues. Some of these changes raise the possibility of improving search performance by probing sequence pairs with several distinct seeds, any one of which is sufficient for a seed match. However, designing a set of seeds to maximize their combined sensitivity to biologically meaningful sequence alignments is computationally difficult, even given recent advances in designing single seeds.

           This work describes algorithmic improvements to seed design that address the problem of designing a set of n seeds to be used simultaneously. We give a new local search method to optimize the sensitivity of seed sets. The method relies on efficient incremental computation of the probability that an alignment contains a match to a seed, given that it has already faied to match any of the seeeds in a set. We demonstrate experimentally that multi-seed designs, even with relatively few seeds, can be significantly more sensitive than even optimized single-seed designs.

Winter 2004

Date Speaker Topic
Jan. 07 Jing Li Title: "Efficient Haplotype Inference on Pedigrees and Applications"

           Ref: Extended Abstract

Jan. 14 Marek Chrobak Title: "The Wake-Up Problem in Multi-Hop Radio Networks"

           Ref: M. Chrobak, L.Gasieniec, D.Kowalski from Proc. SODA'2004

Jan. 21 Marek Chrobak Title: "The Analysis of the Greedy Algorithm for Minimum Common String Patrition"

           Ref: Work in Progress

Jan. 28 Neal Young Title: "Lagrangian relaxation via randomized rounding"

           Abstract: Lagrangian relaxation algorithms have a long history (dating back at least to von Neumann). These algorithms are used mainly to find approximate solutions to packing and covering problems -- linear programs and integer linear programs, with non-negative coefficients. The algorithms build a solution in small steps, guided by a smooth penalty function that serves as a proxy for some of the underlying constraints of the original problem. In comparison to other methods of solving linear programs, Lagrangian relaxation algorithms can be easier to implement and faster, and they handle sparse problems gracefully. These algorithms have been extensively studied over the last 15 years. Their design and analysis is fairly technical. For example, designing the appropriate penalty functions is something of a black art.
In the talk I will introduce a particular approach to the design of one class of Lagrangian relaxation algorithms (corresponding roughly to algorithms developed in the computer science literature over the last fifteen years). I will introduce techniques that help both to understand existing algorithms (including the underlying techniques and the relationships between the algorithms), and to design and analyze new algorithms The framework for this is randomized rounding -- an elegant technique, based on the probabilistic method, for designing approximation algorithms for combinatorial problems. This is a novel approach to Lagrangian relaxation algorithms and a novel use of randomized rounding, using a new kind of randomized rounding scheme based on random sampling.
More information available at http://www.cs.ucr.edu/~neal/wiki/wiki.pl?LRviaRR

Feb. 4 Jie Zheng Title: "Complexity and Approximation of Minimum Common Partition"

           Ref: Work in Progress

Feb. 11 Keith Humphreys Title: "Approximate Quantiles in One Pass with Limited Memory"

           Ref: Approximate Medians and other Quantiles in One Pass and with Limited Memory,
by G S Manku, S Rajagopalan and B G Lindsay,
Proc. 1998 ACM SIGMOD, Vol 27, No 2, p 426-35, June 1998.

           Slides from talk here

Feb. 18 Haifeng Li Title: "A Class of Edit Kernels for SVMs to Predict Translation Initiation Sites in Eukaryotic mRNAs"

           Ref:

Feb. 25 Qiaofeng Yang Title: "Finding biclusters by random projections"

           Ref: Work in Progress

Mar. 3 Wojciech Jawor Title: " Online Competitive Algorithms for Maximizing Weighted Throughput of Unit Jobs"

           (joint work with Yair Bartal, Francis Y. L. Chin, Marek Chrobak, Stanley P. Y. Fung, Ron Lavi, Jiri Sgall and Tomas Tichy)

           Abstract: We will consider an online problem of scheduling unit length jobs to maximize the weighted throughput. Each job will be specified by the release time, deadline, and weight. The goal of the algorithm is to schedule the maxium weighted number of jobs. Each job needs to be scheduled between its release time and deadline, otherwise it brings no profit. In the problem jobs arrive over time and the algorithm needs to make the decision of which job to schedule without the knowledge of the future. I will show upper and lower bounds for the problem, as well as some known results on some special cases of the problem.

Mar. 10 John Noga Title: "Online List Batching"

           Ref:

Fall 2003

Date Speaker Topic
Oct. 9 Jie Zheng Title: "Algorithmic Techniques of Sorting by Reversals"

           Ref: S. Hannenhalli, Pavel A. Pevzner.
Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals. ACM STOC 1995.

Oct. 16 Xin Chen Title: "Methods for the assignment of orthologous genes"

           Ref: Tatusov RL, Koonin EV, Lipman DJ.
A genomic perspective on protein families. Science, 278(5338): 631-7, 1997.

Oct. 23 Petr Kolman Title: " Algorithms for the Maximum Disjoint Paths Problem" (Joint work with Christian Scheideler, JHU)

           Abstract: The {\em disjoint paths problem} is defined as follows. Given an undirected graph $G=(V,E)$ and a set $T$ of $k$ pairs of nodes $(s_i, t_i)$, $1\leq i\leq k$, decide whether there exist $k$ edge disjoint paths $P_1,\cdots,P_k$ such that the path $P_i$ connects $s_i$ and $t_i$. This is an NP-complete problem. The optimization variant of this problem is called the {\em maximum disjoint paths problem} (MDPP), which is simply to find the maximum subset of $T$ for which there exist edge-disjoint paths. In the talk a simple approximation (and online) algorithm for this problem will be presented.

Oct. 30 Avraham Goldstein (Yeshiva Univ) Title: "Some algorithmic and complexity issues in clustering fingerprint vectors with missing values."

          

Nov. 6 Swastik Kopparty Title: "Linearity Testing"

          

Nov. 13 Wojciech Jawor Title: "How to Guarantee Quality of Service on the Internet"

          

Nov. 13 Jiri Sgall
Mathematical Institute,
Academy of Sciences of the Czech Republic, Prague
Title: "Online preemptive scheduling" (based on joint work with Leah Epstein and Tomas Ebenlendr)

           Abstract: We survey the results on online preemptive scheduling on identical and uniformly related machines. (Preemptive means that a job can change machines or be idle for some time, related machines means that different machine may have different speeds.) While the problem for identical machines is completely solved and the optimal competitive ratio is about 1.58, for related machines there is still a gap between current lower bound of 2 and the upper bounds of 4 and 2.71 for deterministic and randomized algorithms, respectively. We present a semi-online algorithm which, if the optimal makespan is given in advance, produces an optimal schedule. We use it to build the currently best algorithms mentioned above.

Fall 2002

Date Speaker Topic
Oct 17 Wojciech Jawor E.Lawler, "Knapsack-like scheduling problems..."
Oct 24 Jing Li Title: "Haplotyping as Perfect Phylogeny: Conceptual Framework and Efficient Solutions" by Dan Gusfield
Ref: RECOMB'02 166-175
Oct 31 No Meeting Happy Halloween
Nov 7 Jiri Sgall Cake-cutting algorithms
Nov 14 Andres Figeuroa Statistical methods for assesing confidence in phylogenies with binary fingerprint vectors.
Nov 21 Glenn Tesler

Title: Genome Rearrangements in Mammalian Evolution: Lessons from Human and Mouse Genomes

Abstract: Although analysis of genome rearrangements was pioneered by Dobzhansky and Sturtevant 65 years ago, we still know very little about the rearrangement events that produced the existing varieties of genomic architectures. The genomic sequences of human and mouse provide evidence for a larger number of rearrangements than previously thought. We describe a new algorithm for constructing synteny blocks, study arrangements of synteny blocks in human and mouse, derive a most parsimonious human-mouse rearrangement scenario, and provide evidence that intrachromosomal rearrangements are more frequent than interchromosomal. Our analysis is based on the human-mouse breakpoint graph, which reveals related breakpoints and allows one to find a most parsimonious scenario.

Nov 28 Thankgiving
Dec 5 Keith Humphreys

Title: Primes in P,   The AKS Algorithm

Given n > 2, we would like determine if n is prime or composite in deterministic polynomial time. This talk covers the recient algorithm by Agrawal, Kayal, and Saxena, using subsequent insights by Daniel Bernstein.