Polynome: Parameter Estimation for Boolean Network Models

Description

Input

Supporting Algorithms/Procedures

Output

Credits


Description

Polynome is a web-based tool for parameter estimation and simulation of Boolean models of biological networks.

Given a set of input data, Polynome generates a Boolean network which fits the (Booleanized) data and simulates the model. We represent Boolean networks as time discrete dynamical systems, i.e., a Boolean network with n nodes is represented as an n-tuple of polynomial functions with coefficients in a field with two elements, where the i-th polynomial function describes the state transitions of the i-th node in the network.

There are two types of models that can be generated, deterministic or stochastic. For a network of n nodes, a deterministic Boolean network will have n polynomial functions, whereas a stochastic Boolean network will have n sets of polynomial functions. There are two ways to simulate a model, in parallel or sequentially. Parallel simulation updates all nodes simultaneously, whereas sequential simulation updates nodes according to an update schedule (provided by the user). Each of these options is described below.

There are a number of options for viewing the model and its simulation. If the user chooses to view the simulation, some statistics about the state space are returned to the screen, including the number of components and their sizes.

Return to the top

Input

Number of nodes

The number of nodes is the number of agents or constituents in the biological network. Currently the maximum value for n is 20; however certain procedures may require a smaller value for n (see Supporting Algorithms).
Return to the top

Format of the input data

Polynome accepts numerical data (floating point or multi-state) in table format: columns correspond to nodes and rows to time-series measurements. In the example below, the data represent 4 time-series measurements of 3 nodes.
1.1 2.8 3.4
2.5 1.2 3.3
3.2 2.1 2.4
0.1 2.9 2.1

Polynome is capable of handling multiple data sets. Each data set is preceded by a line beginning with a hash mark (#). Any characters following the hash mark in the same line are treated as user comments and are therefore ignored by the software.

#First time series
1.1 2.8 3.4
2.5 1.2 3.3
3.2 2.1 2.4
0.1 2.9 2.1
#Second time series
5.8 12.8 7.2
6.5 11.2 3.3
9.2 12.7 8.4
10.1 12.0 9.5

Data sets can be entered in the Input Data box or uploaded in a plain-text file.

Return to the top

Supporting Algorithms/Procedures

Data

Discretization

Polynome is designed to work with multi-state data, that is, data whose entries come from a finite set. The current implementation is limited to two (2) states. Booleanizing the data is often done in modeling and it is a reasonable procedure considering the typically small amounts of data available, as well as its low precision and noise content. If the input data are not Boolean, they are Booleanized via a procedure based on an existing discretization method.

Inconsistency

We view a time series (s1, s2, ..., sm) as a sequence of input-output pairs ((s1,s2),(s2,s3),...,(sm-1,sm)). Discretization may generate inconsistencies; this refers to discretized measurements where one input returns more than one output. Inconsistencies can occur when time delays, hidden variables or noise in the data are present.

Consider for instance the next discretized time course with 4 time steps (s1, …,s4):
s1 1 1 1
s2 1 1 1
s3 0 0 1
s4 0 0 0
Here, the state (1 1 1) has two outputs: (1 1 1) and (0 0 1). Because the model class underlying Polynome is that of polynomial functions, inconsistencies are handled as follows. All transitions from a state with multiple outputs are removed. In our example, we are left with the transition s3 -> s4.

An exception is the case where the number of nodes is small enough so that the minimal-model estimation method is used.

Return to the top

Modeling

The set of possible models that fits given data is typically very large. We have implemented three procedures for searching the model space.

Minimal model sampling

This procedure samples the subspace of minimal models and returns a collection of most likely PDSs. Our implementation is based on a Groebner-fan method which computes the model space using Groebner fans and samples using relative sizes of the cones in the fan. This method returns PDSs that fit the data exactly. It is used for networks with n <= 10. If a wiring diagram is requested, then it returns a labelled graph, where the weight on an incoming edge of a node x indicates the probability of the associated source node being a true input to x. If a stochastic model is requested, then it returns a set of weighted functions per node.

Minimal model estimation

This procedure computes an ``approximate'' polynomial dynamical system (PDS), that is, one that almost fits the data. It is based on the evolutionary algorithm REACT, which optimizes an internal fitness function which depends on model complexity and data fit. Because of its computational complexity, it is only used for networks with n <= 5. It is called if a wiring diagram is requested and the data are inconsistent or if a deterministic model is requested.

Minimal model selection

This procedure selects from the subspace of minimal models a PDS which fits the data. Our implementation is based on the Minimal Sets Algorithm, which restricts the data to essential coordinates as identified by the algorithm and explores the minimal model space involving only those corresponding variables. This procedure is used for networks with n > 10. It returns a minimal PDS and wiring diagram when a deterministic model is requested.

Wiring diagrams

There are two main

Polynomial dynamical systems

Simulation

Output

Type of model

This input determines the type of polynomial model to be constructed.
Return to the top

Type of simulation

This will determine update schedule of the polynomial functions in a model, that is, the order in which to evaluate the functions.
Return to the top

Viewing Options

Booleanized data

If the user selects this option, the discretization into 2 states of the input raw data, will be displayed.
Return to the top

Model

Wiring diagram

The wiring diagram of a network of size n is the directed graph with n nodes where the nodes represent the elements of the network and arrows denoted causal relationships between nodes. If the user selects this option, Polynome will display wiring diagram corresponding to the inferred network. The user can select the format of the image file from the list of formats (gif, jpg, png and ps).
Return to the top

Polynomial dynamical system

In this option, Polynome displays the polynomial model inferred through the input data.
Return to the top

State space simulation

The State Space of a network of size n is the directed graph with 2n nodes where the nodes represent the different states si of the network and an arrow from si to sj indicates the transition from input si to sj output; in other words, if f is the polynomial model of a given network, then f(si)= sj. If the user selects this option, Polynome will display the state space corresponding to the inferred network. In a stochastic network, the probability for each transition can be displayed in the state space. The user can select the format of the image file from the list of formats (gif, jpg, png and ps).
Return to the top

Credits