Colloquia
Fall 2006
Wednesday 4:00 pm - 5:00 pm @ EBU II Room
203
The refreshment will be provided by
| Date |
Speaker |
Topic |
| Dec 6 |
Marek Chrobak |
Title: "The doubling method to design approximation algorithms"
Abstract:
This talk is based on an expository article by Claire Keyon and myself, in
which
we show that there is a number of approximation algorithms in the literature
that use essentially the same (but yet not explicitely identified)
technique that
we refer to as "doubling". The essence of this approach is to use
exponentially
increasing estimates on the optimal solution to design approximate solutions.
It can be used both in offline and online approximation algorithms.
Applications
include several clustering, searching, facility location, and scheduling
problems.
|
| Nov. 29 |
Prof. Haixu Tang |
Title: "De novo identification of LTR retrotransposons in eukaryotic genomes"
Abstract:
LTR retrotransposons are a class of mobile genetic elements containing two
similar long terminal repeats (LTRs). Currently, LTR retrotransposons are
annotated in eukaryotic genomes mainly through the conventional homology
searching approach. Hence, it is limited to annotating known elements. Here, we
report a de novo computational method that can identify new LTR
retrotransposons without relying on a library of known elements. Our method
consists of three steps, which identify young intact, old intact and solo LTR
retroelements separately. We tested this method on the two pairs of eukaryotic
genomes, C. elegans vs. C. briggsae and D. melanogaster vs. D. pseudoobscura.
LTR retrotransposons in C. elegans and D. melanogaster have been intensively
studied using conventional annotation methods. Comparing with previous work, we
identified many new intact LTR retroelements and new putative families, which
may imply that there may still be new retroelements that are left to be
discovered even in well-studied organisms. Finally, the phylogenetic and
chromosomal distributions of the identified elements are discussed.
|
| Nov. 22 |
Serdar Bozdag |
Title: "The importance of FPC and CORAL in the Barley Project"
Abstract:Barley ranks fourth among the cereals in worldwide production. It is an important crop for direct human consumption and for animal feed. In the Barley project, we aim to access gene-bearing regions of the genome. One of the most important phases of this project is assembling thousands of fingerprinted clones into contigs to build a physical map of barley. For this purpose, we use a software named FPC. FPC (FingerPrinted contigs) is an interactive program for building contigs from fingerprinted clones. FPC uses an algorithm to cluster clones into contigs based on their probability of coincidence score. FPC has editing facilities for the user to refine the coordinates and to remove poorly fingerprinted clones. However, a considerable amount of human intervention is required to improve the quality of the physical map built by FPC. Another group has developed a software named CORAL (Clone ORdering ALgorithm) to automatically order fingerprinted clones within contigs. They test CORAL on maps that have previously been manually edited and on maps derived from in silico simulations. Measurements suggest that CORAL is a significant improvement to FPC.
In this talk, I will present the algorithms of FPC and CORAL. I will also talk about why these programs are important in the Barley project.
|
| Nov. 15 |
Mathilde Hurand |
Abstract:A popular approach in combinatorial optimization is to model problems
as integer linear programs. Ideally, the relaxed linear program would
have only integer solutions, which happens for instance when the
constraint matrix is totally unimodular. Still, sometimes it is
possible to build an integer solution with same cost from the
fractional solution. Examples are two scheduling problems and the
single disk prefetching/caching problem. We show that problems such as
the three previously mentioned can be separated into two subproblems:
(1) finding an optimal feasible set of slots, and (2) assigning the
jobs or pages to the slots. It is straigthforward to show that the
latter can be solved greedily. We are able to solve the former with a
totally unimodular linear program, from which we obtain simple
combinatorial algorithms with improved worst case running time.
On the presentation I will briefly present total unimodularity, then
show how we created 2 totally unimodular linear programs ( one problem
of scheduling and a problem of caching), and show the parrallel with
minimum flow problems.
|
| Nov. 8 |
Christos Koufogiannakis |
Title: "Bounded Degree Minimum Spanning Trees"
by Michel X. Goemans (MIT)
FOCS 2006
Abstract:We consider the minimum cost spanning tree problem under
the restriction that all degrees must be at most a given
value k. We show that we can efficiently find a spanning
tree of maximum degree at most k +2 whose cost is at most
the cost of the optimum spanning tree of maximum degree
at most k. This is almost best possible.
The approach uses a sequence of simple algebraic, polyhedral
and combinatorial arguments. It illustrates many
techniques and ideas in combinatorial optimization as it
involves polyhedral characterizations, uncrossing, matroid
intersection, and graph orientations (or packing of spanning
trees). The result generalizes to the setting where every
vertex has both upper and lower bounds and gives then
a spanning tree which violates the bounds by at most two
units and whose cost is at most the cost of the optimum
tree.
|
| Nov. 1 |
Yonghui
Wu |
Title: "Kernel-Based data fusion and its
application to protein function prediction in yeast"
by G. R. G. Lanckriet, M. Deng, N. Cristianini, M. I. Jordan, and
W. S. Noble
Pacific Symposium on Biocomputing 9:300-311(2004)
Abstract: Kernel methods provide a principled
framework in which to represent many types of data, including vectors,
strings, trees and graphs. As such, these methods are useful for
drawing inferences about biological phenomena. We describe a method
for combining multiple kernel representations in an optimal fashion,
by formulating the problem as a convex optimization problem that
can be solved using semide¯nite programming techniques. The
method is applied to the problem of predicting yeast protein functional
classi¯cations using a support vector machine (SVM) trained
on ¯ve types of data. For this problem, the new method performs
better than a previously-described Markov random ¯eld method,
and better than the SVM trained on any single type of data.
|
| Oct. 25 |
Qi
Fu |
Title: "Algorithmic approaches to selecting
control clones in DNA array hybridization experiments"
Abstract: We study the problem of selecting
control clones in DNA array hybridization experiments. The problem
arises in the OFRG method for analyzing microbial communities. The
OFRG method performs classification of rRNA gene clones using binary
fingerprints created from a series of hybridization experiments,
where each experiment consists of hybridizing a collection of arrayed
clones with a single oligonucleotide probe. This experiment produces
analog signals, one for each clone, which then need to be classified,
that is, converted into binary values 1 and 0 that represent hybridization
and non-hybridization events. Besides the sample clones, the array
contains a number of control clones needed to calibrate the classification
procedure of the hybridization signals. These control clones must
be selected with care to optimize the classification process. We
formulate this as a combinatorial optimization problem called Balanced
Covering. We prove that the problem is NP-hard and we show some
results on hardness of approximation. We propose an approximation
algorithm based on randomized rounding and we show that, with high
probability, it approximates well the optimum. The experimental
results confirm that the algorithm finds high quality control clones.
The algorithm has been implemented and is publicly available as
part of the software package called CloneTools. |
| Oct. 25 |
Zheng
Fu |
Title: "Computing the Breakpoint Distance
between Partially Ordered Genomes"
Abstract: The total order of the genes
or markers on a chromosome is crucial for most comparative genomics
studies. However, the current gene mapping efforts might only suffice
to provide a partial order of the genes on a chromosome. Several
different genes or markers might be mapped at the same position
due to the low resolution of gene mapping or missing data. Moreover,
conflicting datasets might give rise to the ambiguity of gene order.
In this paper, we consider the reversal distance and breakpoint
distance problems for partially ordered genomes. We first prove
that these problems are NP-hard, and then give an efficient heuristic
algorithm to compute the breakpoint distance between partially ordered
genomes. The algorithm is based on an efficient approximation algorithm
for a natural generalization of the well-known feedback vertex set
problem, and has been tested on both simulated and real biological
datasets. The experimental results demonstrate that our algorithm
is quite effective for estimating the breakpoint distance between
partially ordered genomes and for inferring the gene (total) order.
|
| Oct. 18 |
Vladimir
Vacic |
Title: "An Important Connection Between
Network Motifs and Parsimony Models"
by Teresa M. Przytycka (RECOMB 2006)
Abstract: We demonstrate an important connection
between network motifs in certain biological networks and validity
of evolutionary trees constructed using parsimony methods. Parsimony
methods assume that taxa are described by a set of characters and
infer phylogenetic trees by minimizing number of character changes
required to explain observed character states. From the perspective
of applicability of parsimony methods, it is important to assess
whether the characters used to infer phylogeny are likely to provide
a correct tree. We introduce a graph the- oretical characterization
that helps to select correct characters. Given a set of characters
and a set of taxa, we construct a network called character overlap
graph. We show that the character overlap graph for characters that
are appropriate to use in parsimony methods is char- acterized by
significant under-representation of subnetworks known as holes,
and provide a mathematical validation for this observation. This
characterization explains success in constructing evolutionary trees
using parsimony method for some characters (e.g. protein domains)
and lack of such success for other characters (e.g. introns). In
the latter case, the un- derstanding of mathematical obstacles to
applying parsimony methods in a direct way has lead us to a new
approach for dealing with inconsistent and/or noisy data. Namely,
we introduce the concept of persistent char- acters which is similar
but less restrictive than the well known concept of pairwise compatible
characters. Application of this approach to introns produces the
evolutionary tree consistent with the Coelomata hypoth- esis. In
contrast, the direct application of a parsimony method, using introns
as characters, produces a tree which is inconsistent with any of
the two competing evolutionary hypotheses. Similarly, replacing
persis- tence with pairwise compatibility does not lead to a correct
tree. This indicates that the concept of persistence provides an
important addition to the parsimony metohds. |
Oct. 11 |
Lan
Liu |
Title: "Fast Elimination of Redundant Linear
Equations and Reconstruction of Recombination-Free Mendelian Inheritance
on a Pedigree"
by J. Xiao, L. Liu, L. Xia, T. Jiang. (SODA 2007)
Abstract: Computational inference of haplotypes
from genotypes has attracted a great deal of attention in the computational
biology community recently, partially driven by the international
HapMap project. In this paper, we study the question of how to efficiently
infer haplotypes from genotypes of individuals related by a pedigree,
assuming that the hereditary process was free of mutations ( i.e.
the Mendelian law of inheritance) and recombinants. The problem
has recently been formulated as a system of linear equations over
the finite field of $F(2)$ and solved in $O(m^3n3)$ time by using
standard Gaussian elimination, where $m$ is the number of loci (or
markers) in a genotype and $n$ the number of individuals in the
pedigree. We give a much faster algorithm with running time $O(mn2
+ n3\log2 n \log\log n)$. The key ingredients of our construction
are (i) a new system of linear equations based on some spanning
tree of the pedigree graph and (ii) an efficient method for eliminating
redundant equations in a system of $O(mn)$ linear equations over
$O(n)$ variables. Although such a fast elimination method is not
known for general systems of linear equations, we take advantage
of the underlying pedigree graph structure and recent progress on
low-stretch spanning trees.
Click here for
the paper |
Winter 2006
| Date |
Speaker |
Topic |
Mar. 13 |
Zheng Liu |
Title: "Combinatorial and Statistical Approaches for Oligonucleotide Fingerprinting of Ribosomal RNA Genes (OFRG)
Abstract:
Oligonucleotide Fingerprint of Ribosomal RNA Genes (OFRG) is a
high-throughput, microarray-based approach for identifying microorganisms.
This method was developed by an interdisciplinary team of researchers
at
the University of California at Riverside, and has been successfully
applied to the analysis of bacterial and fungal communities. In
this talk,
I will present an overview of the OFRG approach and I will discuss
two of
the numerous computational problems that arise in OFRG, along with
proposed solutions.
The first problem is a variant of approximate
string matching. In genomic
databases, gene sequences within a particular gene family are usually
sandwiched between a specific pair of primers. In order to extract
desired
gene sequences with a given pair of primers, we have designed a
fast
algorithm for approximate string matching, called FAAST. FAAST generalizes
the Tarhio-Ukkonen algorithm by requiring two or more matches when
calculating shift distances. Both theoretical analysis and experimental
results demonstrate a significant speed-up without loss of accuracy
achieved by the algorithm.
he second challenge arises in the analysis of microarray data.
In OFRG,
the presence of specific nucleotides is determined based on intensity
values of hybridization signals. Due to noise and technological
limitations, these signals are sometimes too ambiguous for a reliable
classification. In such situation, the traditional Bayes classification
method would often lead to invalid predictions, affecting the accuracy
of the OFRG results. To address this problem, we introduce a Modified
Bayes Rule (MBR) which allows a “no prediction” result. MBR
uses a cost
structure to weigh the penalty for not making a definite prediction
against the penalty for making an incorrect definite prediction.
Experiments show that our method outperforms a neutral-zone rule
that has been routinely used in this type of applications.
|
| Mar. 20 |
Zheng Fu |
Title: "A parsimony approach to genome-wide ortholog assignment."
by Zheng Fu, Xin Chen, Vladimir Vacic, Peng Nan, Yang Zhong, and Tao Jiang (RECOMB 2006)
Click here for the paper
|
Fall 2004
| Date |
Speaker |
Topic |
| Oct. 15 |
Xin Chen |
Title: "Predicting gene expression from sequence"
by Michael A. Beer, Saeed Tavazoie (Cell, Vol. 117, 185-198, April 16, 2004)
Abstract: We describe a systematic genome-wide approach for learning the complex
combinatorial code underlying gene expression. Our probabilistic approach
identifies local DNA-sequence elements and the positional and
combinatorial constrains that determine their context-dependent role in
transcriptional regulation. The inferred regulatory rules correctly
predict expression patterns for 73% of genes in Saccharomyces cerevisiae,
utilizing microarray expression data and sequences in the 800 bp upstream
of genes. Application to Caenorhabditis elegans identifies predictive
regulatory elements and combinatorial rules that control the phased
temporal expression of transcription factors, histones, and germline
specific genes. Successful prediction requires diverse and complex rules
utilizing AND, OR, and NOT logic, with significant constraints on motif
strength, orientation, and relative position. This system generates a
large number of mechanistic hypotheses for focused experimental
validation, and establishes a predictive dynamical framework for
understanding cellular behavior from genomic sequence.
|
| Oct. 22 |
Christoph Durr |
Title: "Quantum algorithms for graph connectivity"
Abstract: Some intuition about quantum mechanics is given, definition
of the model of quantum computation, explanation of Grover's search algorithm
and application to graph connectivity.
|
| Nov. 05 |
Haifeng Li |
Title: "Minimum Entropy Clustering and Applications to Gene Expression Analysis"
Abstract: Clustering is a common methodology for analyzing the gene expression data.
In this paper, we present a new clustering algorithm from an information-theoretic
point of view. First, we propose the minimum entropy (measured on {\it a posteriori}
probabilities) criterion, which is the conditional entropy of clusters given the
observations. Fano's inequality indicates that it could be a good criterion for clustering.
We generalize the criterion by replacing Shannon's entropy with Havrda-Charvat's structural
$\alpha$-entropy. Interestingly, the minimum entropy criterion based on structural $\alpha$
-entropy is equal to the probability error of the nearest neighbor method when $\alpha=2$.
This is another evidence that the proposed criterion is good for clustering. With a
nonparametric approach for estimating {\it a posteriori} probabilities, an efficient
iterative algorithm is then established to minimize the entropy. The experimental results
show that the clustering algorithm performs significantly better than $k$-means/medians,
hierarchical clustering, SOM, and EM in terms of adjusted Rand index. Particularly, our
algorithm performs very well even when the correct number of clusters is unknown. In addition,
most clustering algorithms produce poor partitions in presence of outliers while our method can
correctly reveal the structure of data and effectively identify outliers simultaneously.
|
| Nov. 12 |
Marek Chrobak |
Title:
"Online Scheduling of Equal-Length Jobs:Randomization and Restarts Help"
Abstract:We consider the following scheduling problem. The input is a set of jobs
with equal processing times, where each job is specified by its release,
time and deadline. The goal is to determine a single-processor,
non-preemptive schedule that maximizes the number of completed
jobs. In the online version, each job arrives at its release time.
We give two online algorithms with competitive ratios below 2 and
show several lower bounds on the competitive ratios.
First, we give a barely random 5/3-competitive algorithm
that uses only one random bit. We also show a lower bound of 3/2 for
barely random algorithms that choose one of two deterministic algorithms.
Second, we give a deterministic 3/2-competitive algorithm in the model
that allows restarts, and we show that in this model the ratio
3/2 is optimal. For randomized algorithms with restarts we show a lower
bound of 6/5.
This is joint work with W.Jawor, J.Sgall, and T.Tichy.
|
| Nov. 19 |
Petr Kolman |
Title:
"Duality of multiroute flows and cuts"
Abstract:A classical flow is a nonnegative linear combination of unit flows along
simple paths. A~multiroute flow, first considered by Kishimoto and
Takeuchi, generalizes this concept. The basic building blocks are not
single paths with unit flows but rather tuples consisting of $k$ edge
disjoint paths, each path with a unit flow. A multiroute flow is a
nonnegative linear combination of such tuples.
We present a simple combinatorial proof of the duality theorem for
multiroute flows and cuts and its corollary which characterizes multiroute
flows in terms of classical flows. Specifically, we show that a
(classical) flow of size $F$ is a $k$-flow if and only if the flow through
every edge is at most $F/k$. This duality then immediately yields an
efficient algorithm.
This is joint work with Amitabha Bagchi, Amitabh Chaudhary and Ji\v{r}\'\i\
Sgall.
|
Spring 2004
| Date |
Speaker |
Topic |
| Apr. 12 |
Jie Zheng |
Title:
"De novo Repeat Classification and Fragment Assembly."
Authors: Pavel A. Pevzner, Haixu Tang, Glenn Tesler.
RECOMB 2004.
Abstract:
"Repetitive sequences make up a significant fraction of
almost any genome and an important and still open question in
bioinformatics is how to represent all repeats in DNA sequences.
We propose a radically new approach to repeat classficiation
that is motivated by the fundamental topological notion of
quotient spaces. A torus or Kein bottle are examples of quotient
spaces that can be obtained from a square by gluing some
points. Our new repeat classification algorithm is based on
the observation that the alignment-induced quotient space of
a DNA sequence compactly represents all sequence repeats. This
observation leads to a simple and efficient solution of the
repeat classification problem as well as new approaches to
fragment assembly and multiple alignment."
|
| Apr. 19 |
No Meeting |
|
| Apr. 26 |
Qiaofeng Yang |
Title:
"Discovering temporal relations in molecular pathways using protein-protein interactions."
Authors: Martin Farach-Colton, Yang Huang, John L. Woolford.
RECOMB 2004.
Abstract:
The availability of large-scale protein-protein interactin data provides us
with many opportunities to study molecular pathways involving proteins. In
this paper they propose to mine temporal relations in molecular pathways by
protein-protein interaction data. In particular, they model the assembly
pathways of protein complexes with interval graphs and determine the temporal
order of joining the pathway for protein by ordering the vertices in the
interaction graph. They develop a tool called XRONOS to perform such a
computation. They then apply XRONOS to the ribosome assembly pathway and
present validation results for the obtained ordering. The results are
promising and show the potential usage for XRONOS in the study of molecular
pathways.
|
| May 3 |
Qi Fu |
Title:
"ALIGNING ALIGNMENTS EXACTLY"
From: RECOMB 2004.
Abstract:
A basic computational problem that arises in both the construction and
local-search phases of the best heuristics for multiple sequence alignment
is that of aligning the columns of two multiple alignments. When the
scoring function is the sum-of-pairs objecive and induced pairwise
alignments are evaluated using linear gap-costs, this problem is called
Aligning Alignments. While seemingly a straightforward extension of
two-sequence alignment, it is actually proved to be NP-complete. As
explained in the paper, this provides the first demonstration that
minimizing linear gap-costs, in the context of multiple sequence
alignment, is inherently hard.
The paper also develop an exact algorithm for Aligning Alignments that
is remarkably efficient in practice, both in time and space. Even though
the problem is NP-complete, computational experiments on both biological
and simulated data show we can compute optimal alignemtns for all
benchmark instances in two standard datasets, and solve very large random
instances with highly-gapped sequences.
|
| May 10 |
Zheng Liu |
Title:
"Disigning Multiple Simultaneous Seeds for DNA Similarity Search"
Authors:
Yanni Sun, Jeremy Buhler
From: RECOMB 2004.
Abstract:
The challegne of similarity search in massive DNA sequence databases has
inspired major changes in BLAST-style alignment tools, which accelerate
serach by inspecting only pairs of sequences sharing a common short
"seed", or pattern of matching residues. Some of these changes raise the
possibility of improving search performance by probing sequence pairs with
several distinct seeds, any one of which is sufficient for a seed match.
However, designing a set of seeds to maximize their combined sensitivity
to biologically meaningful sequence alignments is computationally
difficult, even given recent advances in designing single seeds.
This work describes algorithmic improvements to seed design that address
the problem of designing a set of n seeds to be used simultaneously. We
give a new local search method to optimize the sensitivity of seed sets.
The method relies on efficient incremental computation of the probability
that an alignment contains a match to a seed, given that it has already
faied to match any of the seeeds in a set. We demonstrate experimentally
that multi-seed designs, even with relatively few seeds, can
be significantly more sensitive than even optimized single-seed designs.
|
Winter 2004
| Date |
Speaker |
Topic |
| Jan. 07 |
Jing Li |
Title:
"Efficient Haplotype Inference on Pedigrees and Applications"
Ref: Extended Abstract
|
| Jan. 14 |
Marek Chrobak |
Title:
"The Wake-Up Problem in Multi-Hop Radio Networks"
Ref: M. Chrobak, L.Gasieniec, D.Kowalski
from Proc. SODA'2004
|
| Jan. 21 |
Marek Chrobak |
Title:
"The Analysis of the Greedy Algorithm for Minimum Common String Patrition"
Ref: Work in Progress
|
| Jan. 28 |
Neal Young |
Title:
"Lagrangian relaxation via randomized rounding"
Abstract:
Lagrangian relaxation algorithms have a long history (dating back at
least to von Neumann). These algorithms are used mainly to find
approximate solutions to packing and covering problems -- linear
programs and integer linear programs, with non-negative coefficients.
The algorithms build a solution in small steps, guided by a smooth
penalty function that serves as a proxy for some of the underlying
constraints of the original problem. In comparison to other methods of
solving linear programs, Lagrangian relaxation algorithms can be easier
to implement and faster, and they handle sparse problems gracefully.
These algorithms have been extensively studied over the last 15 years.
Their design and analysis is fairly technical. For example, designing
the appropriate penalty functions is something of a black art.
In the talk I will introduce a particular approach to the design of one
class of Lagrangian relaxation algorithms (corresponding roughly to
algorithms developed in the computer science literature over the last
fifteen years). I will introduce techniques that help both to
understand existing algorithms (including the underlying techniques and
the relationships between the algorithms), and to design and analyze
new algorithms
The framework for this is randomized rounding -- an elegant technique,
based on the probabilistic method, for designing approximation
algorithms for combinatorial problems. This is a novel approach to
Lagrangian relaxation algorithms and a novel use of randomized
rounding, using a new kind of randomized rounding scheme based on
random sampling.
More information available at
http://www.cs.ucr.edu/~neal/wiki/wiki.pl?LRviaRR
|
| Feb. 4 |
Jie Zheng |
Title:
"Complexity and Approximation of Minimum Common Partition"
Ref: Work in Progress
|
| Feb. 11 |
Keith Humphreys |
Title:
"Approximate Quantiles in One Pass with Limited Memory"
Ref:
Approximate Medians and other Quantiles in One Pass and with Limited Memory,
by G S Manku, S Rajagopalan and B G Lindsay,
Proc. 1998 ACM SIGMOD, Vol 27, No 2, p 426-35, June 1998.
Slides from talk here
|
| Feb. 18 |
Haifeng Li |
Title:
"A Class of Edit Kernels for SVMs to Predict Translation Initiation Sites in Eukaryotic mRNAs"
Ref:
|
| Feb. 25 |
Qiaofeng Yang |
Title:
"Finding biclusters by random projections"
Ref: Work in Progress
|
| Mar. 3 |
Wojciech Jawor |
Title:
" Online Competitive Algorithms for Maximizing Weighted Throughput of Unit Jobs"
(joint work with Yair Bartal, Francis Y. L. Chin, Marek Chrobak, Stanley P. Y. Fung, Ron Lavi, Jiri Sgall and Tomas Tichy)
Abstract:
We will consider an online problem of scheduling unit length
jobs to maximize the weighted throughput. Each job will be specified by
the release time, deadline, and weight. The goal of the algorithm is to
schedule the maxium weighted number of jobs. Each job needs to be
scheduled between its release time and deadline, otherwise it brings no
profit. In the problem jobs arrive over time and the algorithm needs to
make the decision of which job to schedule without the knowledge of the
future. I will show upper and lower bounds for the problem, as well as
some known results on some special cases of the problem.
|
| Mar. 10 |
John Noga |
Title:
"Online List Batching"
Ref:
|
Fall 2003
| Date |
Speaker |
Topic |
| Oct. 9 |
Jie Zheng |
Title:
"Algorithmic Techniques of Sorting by Reversals"
Ref:
S. Hannenhalli, Pavel A. Pevzner.
Transforming Cabbage into Turnip:
Polynomial Algorithm for Sorting Signed Permutations
by Reversals.
ACM STOC 1995.
|
| Oct. 16 |
Xin Chen |
Title:
"Methods for the assignment of orthologous genes"
Ref:
Tatusov RL, Koonin EV, Lipman DJ.
A genomic perspective on protein
families. Science, 278(5338): 631-7, 1997.
|
| Oct. 23 |
Petr Kolman |
Title:
" Algorithms for the Maximum Disjoint Paths Problem"
(Joint work with Christian Scheideler, JHU)
Abstract: The {\em disjoint paths problem} is defined as follows. Given an
undirected graph $G=(V,E)$ and a set $T$ of $k$ pairs of nodes $(s_i,
t_i)$, $1\leq i\leq k$, decide whether there exist $k$ edge disjoint paths
$P_1,\cdots,P_k$ such that the path $P_i$ connects $s_i$ and $t_i$. This
is an NP-complete problem. The optimization variant of this problem is
called the {\em maximum disjoint paths problem} (MDPP), which is simply to
find the maximum subset of $T$ for which there exist edge-disjoint paths.
In the talk a simple approximation (and online) algorithm for this problem
will be presented.
|
| Oct. 30 |
Avraham Goldstein (Yeshiva Univ) |
Title:
"Some algorithmic and complexity issues in clustering
fingerprint vectors with missing values."
|
| Nov. 6 |
Swastik Kopparty |
Title:
"Linearity Testing"
|
| Nov. 13 |
Wojciech Jawor |
Title:
"How to Guarantee Quality of Service on the Internet"
|
| Nov. 13 |
Jiri Sgall
Mathematical Institute,
Academy of Sciences of the Czech Republic, Prague
|
Title:
"Online preemptive scheduling"
(based on joint work with Leah Epstein and Tomas Ebenlendr)
Abstract:
We survey the results on online preemptive scheduling on identical and
uniformly related machines. (Preemptive means that a job can change
machines or be idle for some time, related machines means that different
machine may have different speeds.)
While the problem for identical machines is completely solved and the
optimal competitive ratio is about 1.58, for related machines there is
still a gap between current lower bound of 2 and the upper bounds of 4 and
2.71 for deterministic and randomized algorithms, respectively.
We present a semi-online algorithm which, if the optimal makespan is given
in advance, produces an optimal schedule. We use it to build the currently
best algorithms mentioned above.
|
Fall 2002
| Date |
Speaker |
Topic |
| Oct 17 |
Wojciech Jawor |
E.Lawler, "Knapsack-like scheduling problems..."
|
| Oct 24 |
Jing Li |
Title: "Haplotyping as Perfect Phylogeny: Conceptual Framework and
Efficient Solutions" by Dan Gusfield
Ref: RECOMB'02 166-175
|
| Oct 31 |
No Meeting |
Happy Halloween |
| Nov 7 |
Jiri Sgall |
Cake-cutting algorithms
|
| Nov 14 |
Andres Figeuroa |
Statistical methods for assesing confidence in phylogenies with binary fingerprint vectors.
|
| Nov 21 |
Glenn Tesler |
Title:
Genome Rearrangements in Mammalian Evolution:
Lessons from Human and Mouse Genomes
Abstract:
Although analysis of genome rearrangements was pioneered by Dobzhansky
and Sturtevant 65 years ago, we still know very little about the
rearrangement events that produced the existing varieties of genomic
architectures. The genomic sequences of human and mouse provide
evidence for a larger number of rearrangements than previously
thought. We describe a new algorithm for constructing synteny blocks,
study arrangements of synteny blocks in human and mouse, derive a most
parsimonious human-mouse rearrangement scenario, and provide evidence
that intrachromosomal rearrangements are more frequent than
interchromosomal. Our analysis is based on the human-mouse breakpoint
graph, which reveals related breakpoints and allows one to find a most
parsimonious scenario.
|
| Nov 28 |
Thankgiving |
|
| Dec 5 |
Keith Humphreys |
Title: Primes in P, The AKS Algorithm
Given n > 2, we would like determine if n is prime or composite in
deterministic polynomial time.
This talk covers the recient algorithm by
Agrawal, Kayal, and Saxena, using subsequent insights by Daniel Bernstein.
|
|