Modeling Biological Networks



IV.6.i.d Clustering:

As we argued in Background and Significance (IV.4), the study of molecular building blocks alone cannot reveal emergent biological properties. Our attempt to understand the collective properties of cells therefore employs a top-down, or "reverse engineering," approach. On the other hand, many researchers believe that the organization of cellular function is modular, with distinct modules, each containing many species of interacting molecules, carrying out cellular functions such as metabolism (Hartwell et al., 1999). In this model, chemical or spatial isolation separates functions into discrete modules. Indeed, the biological tradition is to group reactions into pathways, which we may view as modules. In addition, these pathways group into a hierarchy of larger entities representing the cell's known functions. For example, we characterize reactions as pertaining to metabolism, information transfer, DNA repair, etc., and then break these categories into smaller and smaller units, eventually arriving at the molecular level. The problem with this approach is that few functions within the cell are completely independent. Most functions share pathways and molecules (e.g., cell cycle vs. DNA repair) at both the substrate and the protein level, so components often participate in more than one module. Thus, if modules exist, they are highly coupled and interconnected, which makes them difficult to identify.

Computer scientists have investigated the network clustering of the World Wide Web (WWW), which, like metabolism, is a directed network (Gibson et al., 1998; Adamic and Adar, 1999; Adamic, 1999; Larson, 1996). Network clustering is an NP-complete problem (Flake et al., 2000; Garey and Johnson, 1979), so finding the optimal clusters for large systems is computationally prohibitive. Cellular networks, however, are small enough, with several thousand nodes at most, that we can compute their clustering with our current computing resources.
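To make the idea of topology-based clustering concrete, the following is a minimal sketch and not the method proposed here. It assumes a hypothetical six-node toy network (`EDGES`), ignores edge direction for simplicity, and uses a greedy heuristic (average-linkage agglomerative merging on the Jaccard similarity of node neighborhoods) rather than a search for optimal clusters. All names in it are illustrative.

```python
from itertools import combinations

# Hypothetical toy network: directed edges (substrate -> product).
EDGES = [
    ("A", "B"), ("B", "C"), ("C", "A"),  # one tightly knit group
    ("D", "E"), ("E", "F"), ("F", "D"),  # a second tightly knit group
    ("C", "D"),                          # a single link coupling the two
]


def neighborhoods(edges):
    """Map each node to the set holding itself and all its neighbors.

    Simplifying assumption: edge direction is ignored.
    """
    nbrs = {}
    for src, dst in edges:
        nbrs.setdefault(src, {src}).add(dst)
        nbrs.setdefault(dst, {dst}).add(src)
    return nbrs


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def cluster(edges, n_clusters):
    """Greedy average-linkage clustering on neighborhood similarity.

    This is a heuristic; it does not guarantee optimal clusters.
    """
    nbrs = neighborhoods(edges)
    clusters = [frozenset([node]) for node in sorted(nbrs)]

    def linkage(c1, c2):
        # Mean pairwise similarity between the members of two clusters.
        pairs = [(u, v) for u in c1 for v in c2]
        return sum(jaccard(nbrs[u], nbrs[v]) for u, v in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the two most similar clusters.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


print(cluster(EDGES, 2))  # [['A', 'B', 'C'], ['D', 'E', 'F']]
```

On this toy input the two triangles are recovered as separate clusters even though the edge from C to D couples them. A network of several thousand nodes would call for a more efficient implementation than this cubic-time sketch.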

We expect that clustering algorithms will reproduce, to a certain degree, known biological modules and pathways. However, the project could offer some surprises as well: an automated program could uncover relationships between clusters that are not intuitively obvious. We will accompany our development of clustering methods with detailed analysis of the biological significance of the clusters obtained. Together with dynamic approaches, network-topology-based identification of functional modules using objective, automated algorithms could offer an unbiased classification of the different metabolic pathways and could further our understanding of the organization of cellular function.
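One simple way to quantify how well computed clusters reproduce known pathways is to score each annotated pathway against its best-matching cluster. The sketch below is an illustration only: the pathway annotations, the cluster output, and the choice of the Jaccard index as the overlap measure are all assumptions made for the example, not results or methods from this project.

```python
# Hypothetical annotations: pathway name -> set of member metabolites.
KNOWN_PATHWAYS = {
    "pathway_1": {"A", "B", "C"},
    "pathway_2": {"D", "E", "F", "G"},
}

# Hypothetical output of a clustering algorithm.
FOUND_CLUSTERS = [{"A", "B", "C"}, {"D", "E", "F"}, {"G", "H"}]


def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(pathways, clusters):
    """For each known pathway, find the cluster that overlaps it most.

    Returns a dict: pathway name -> (cluster index, Jaccard score).
    """
    result = {}
    for name, members in pathways.items():
        scores = [jaccard(members, c) for c in clusters]
        best = max(range(len(clusters)), key=scores.__getitem__)
        result[name] = (best, scores[best])
    return result


for name, (idx, score) in best_matches(KNOWN_PATHWAYS, FOUND_CLUSTERS).items():
    print(f"{name}: cluster {idx}, overlap {score:.2f}")
```

In this made-up example, `pathway_1` is recovered exactly (score 1.0), while `pathway_2` is only partly recovered (score 0.75) because one of its members was assigned to a different cluster. Cases of the second kind are where closer biological analysis would be needed.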