Utils

statalign.base
Class Utils

java.lang.Object
  statalign.base.Utils

public class Utils
extends java.lang.Object

This class contains multi-purpose static functions.

Author:: miklos, novak, herman

Field Summary
`static boolean`	`DEBUG` Debugging mode (various consistency checks done if on)
`static boolean`	`DOWNWEIGHT_INDEL_LIKELIHOOD` If true then we downweight the indel contribution to the overall likelihood.
`static org.apache.commons.math3.random.RandomGenerator`	`generator` The random number generator used throughout the program.
`static double`	`LEAF_COUNT_POW` Power determining how much we favour realigning the larger subtree first when doing a nearest-neighbour interchange move.
`static double`	`log0` log(0) is set to Double.NEGATIVE_INFINITY.
`static double`	`LOW_COUNT_MULTIPLIER`
`static int`	`LOW_COUNT_THRESHOLD`
`static double`	`MAX_ACCEPTANCE` During the burnin, the proposalWidthControlVariable for all continuous parameters is adjusted in order to ensure that the average acceptance rate is between MIN_ACCEPTANCE and MAX_ACCEPTANCE where possible.
`static int`	`MAX_SILENT_LENGTH`
`static double`	`MIN_ACCEPTANCE` During the burnin, the proposalWidthControlVariable for all McmcMove objects is adjusted (if `McmcMove.autoTune=true`) in order to ensure that the average acceptance rate is between MIN_ACCEPTANCE and MAX_ACCEPTANCE where possible.
`static double`	`MIN_EDGE_LENGTH`
`static double`	`MIN_SAMPLES_FOR_ACC_ESTIMATE` Number of samples during burnin used to get a rough estimate of the current acceptance rate, for the purposes of tuning the proposal variance control parameters.
`static int`	`MIN_SEQ_LENGTH` Minimum length for internal node sequence.
`static boolean`	`SHAKE_IF_STUCK` If `true` then during the first half of the burnin if a particular McmcMove has been below its minimum acceptance rate for at least (LOW_COUNT_THRESHOLD * MIN_SAMPLES_FOR_ACC_ESTIMATE) iterations, then for the purposes of computing the acceptance ratio, we multiply the new log likelihood by LOW_COUNT_MULTIPLIER raised to a power that increases with the number of iterations beyond the threshold.
`static double`	`SILENT_INSERT_PROB`
`static double`	`SPAN_MULTIPLIER` During the burnin, the proposalWidthControlVariable for all continuous parameters is adjusted in order to ensure that the average acceptance rate is between MIN_ACCEPTANCE and MAX_ACCEPTANCE where possible.
`static boolean`	`USE_FULL_WINDOWS` If this is set to `true` then the alignment moves operate on the whole alignment rather than selecting subwindows.
`static boolean`	`USE_INDEL_CORRECTION_FACTOR` If true then we divide out the stationary probability of the internal nodes from the indel likelihood, as per Redelings and Suchard (2005), using the TKF92 stationary distribution defined in Thorne et al. (1992).
`static boolean`	`USE_MODEXT_EM` If true, then ModelExtensions are allowed to offer a contribution to the emission probability used to compute the dynamic programming matrices for alignment proposals.
`static boolean`	`USE_MODEXT_UPP` If true, then ModelExtensions are allowed to offer an upper contribution to the emission probability used to compute the dynamic programming matrices for alignment proposals.
`static boolean`	`USE_UPPER` Whether to use information from the upper parts of the tree in order to fill out the `hmm2` and `hmm3` matrices.
`static boolean`	`VERBOSE`
`static double`	`WINDOW_MULTIPLIER` Initial value for the alignment proposal window length multiplier.

Method Summary

static java.lang.String[] alignmentTransformation(java.lang.String[] s, java.lang.String[] names, java.lang.String type, InputData input)
Transforms an alignment into the prescribed format

static double calcEmProb(double[] fel, double[] aaEquDist)
Calculates emission probability from Felsenstein likelihoods

static int chooseOne(double prob, statalign.base.MuDouble selectLogLike)
Behaves exactly like weightedChoose(new double[]{1-prob,prob}, selectLogLike), but faster

static java.util.List<java.lang.String> classesInPackage(java.lang.String packageName)
Finds all classes in a given package and all of its subpackages by walking through class path.

static java.lang.String convertTime(long x)
Takes a time in milliseconds and converts to a string to be printed.

static char[] copyOf(char[] array)

static double[] copyOf(double[] array)

static int[] copyOf(int[] array)

static <T> java.util.List<T> findPlugins(java.lang.Class<T> superClass)
Locates all plugins that are descendants of the specified plugin superclass.

static boolean isValidHistory(boolean p, boolean g, boolean[] neighb)
For a tree in which g has parent gg and children p and u, and p has children t and b, this function determines valid possible indel states for p and g given fixed states for the neighbouring nodes.

static boolean isValidHistory(boolean p, boolean g, boolean[] neighb, boolean gIsRoot)
For a tree in which g has children p and u, p has children t and b, and g has parent gg (or, if gIsRoot = true, g is the root and there is no gg), this function determines valid possible indel states for p and g given fixed states for the neighbouring nodes.

static <T> java.lang.Iterable<T> iterate(java.util.Enumeration<T> en)
Makes Enumeration iterable.

static java.lang.String joinStrings(java.lang.Object[] strs, java.lang.String separator)
Joins strings using a separator string.

static java.lang.String joinStrings(java.lang.Object[] strs, java.lang.String prefix, java.lang.String separator)
Joins strings using a prefix and a separator string.

static int linearizerWeight(int length, statalign.base.MuDouble selectLike, double expectedLength)
This function selects a random integer with expected value given by expectedLength.

static double linearizerWeightProb(int length, int index, double expectedLength)
This function returns the probability of choosing a particular index with linearizerWeight.

static double logAdd(double a, double b)
Adds two numbers in log space: given log(x) and log(y), returns log(x+y).

static double logBetaDensity(double x, double alpha, double beta)

static double logGammaDensity(double x, double shape, double rate)

static int logWeightedChoose(double[] logWeights)

static int logWeightedChoose(double[] logWeights, statalign.base.MuDouble selectLogLike)
Equivalent to weightedChoose(weights, selectLogLike) where logWeights[i] = Math.log(weights[i]), but avoids overflows that might result from exponentiation.

static int minMax(int value, int min, int max)

static java.lang.String repeatedString(java.lang.String s, int n)

static int weightedChoose(double[] weights)

static int weightedChoose(double[] weights, statalign.base.MuDouble selectLogLike)
Similar to weightedChoose(weights), but the log-probability of the selection is subtracted from the mutable double object selectLogLike, because the proposal probability appears in the denominator of the acceptance ratio. MuDouble is used to allow for a second return value; in C++ a double pointer or reference could be used instead.

static int weightedChoose(int[] weights)
This function returns a random index, weighted by the weights in the array `weights`.

static int weightedChoose(java.util.List<java.lang.Double> weights, statalign.base.MuDouble selectLogLike)

static int weightedChoose(java.util.List<java.lang.Integer> weights)
This function returns a random index, weighted by the weights in the list `weights`.

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

DEBUG

public static boolean DEBUG

Debugging mode (various consistency checks done if on)

USE_FULL_WINDOWS

public static boolean USE_FULL_WINDOWS

If this is set to true then the alignment moves operate on the whole alignment rather than selecting subwindows. This is usually much slower.

USE_MODEXT_EM

public static boolean USE_MODEXT_EM

If true, then ModelExtensions are allowed to offer a contribution to the emission probability used to compute the dynamic programming matrices for alignment proposals. NB this will be switched on automatically when a suitable ModelExtension is activated. Setting to true here will render this variable constitutively active, which is unlikely to be useful.

USE_MODEXT_UPP

public static boolean USE_MODEXT_UPP

If true, then ModelExtensions are allowed to offer an upper contribution to the emission probability used to compute the dynamic programming matrices for alignment proposals. The upper contribution involves information about all vertices outside of the current subtree. NB this will be switched on automatically when a suitable ModelExtension is activated. Setting to true here will render this variable constitutively active, which is unlikely to be useful.

USE_UPPER

public static boolean USE_UPPER

Whether to use information from the upper parts of the tree in order to fill out the hmm2 and hmm3 matrices.

LEAF_COUNT_POW

public static double LEAF_COUNT_POW

Power determining how much we favour realigning the larger subtree first when doing a nearest-neighbour interchange move.

generator

public static org.apache.commons.math3.random.RandomGenerator generator

The random number generator used throughout the program. A new generator is constructed at each MCMC run using the seed in the corresponding MCMCPars object.

SPAN_MULTIPLIER

public static final double SPAN_MULTIPLIER

During the burnin, the proposalWidthControlVariable for all continuous parameters is adjusted in order to ensure that the average acceptance rate is between MIN_ACCEPTANCE and MAX_ACCEPTANCE where possible. This is done by repeatedly multiplying the proposalWidthControlVariable by SPAN_MULTIPLIER until the acceptance falls within the desired range.
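
The tuning rule described above can be sketched as a small loop. This is only an illustration under stated assumptions: the class name, the `acceptanceRate` callback, the `maxSteps` guard and the numeric values are all hypothetical, and the choice to divide the width when acceptance is too low (and multiply when it is too high) is a guess at the direction of the adjustment, not something this page confirms.

```java
import java.util.function.DoubleUnaryOperator;

final class ProposalTuningSketch {
    /**
     * Rescales a proposal width by spanMultiplier until the acceptance rate
     * reported by acceptanceRate lies in [minAcc, maxAcc], or maxSteps is hit.
     * Assumption: wider proposals are accepted less often.
     */
    static double tune(double width, DoubleUnaryOperator acceptanceRate,
                       double minAcc, double maxAcc,
                       double spanMultiplier, int maxSteps) {
        for (int step = 0; step < maxSteps; step++) {
            double acc = acceptanceRate.applyAsDouble(width);
            if (acc < minAcc) {
                width /= spanMultiplier;   // too few acceptances: narrow the proposal
            } else if (acc > maxAcc) {
                width *= spanMultiplier;   // too many acceptances: widen the proposal
            } else {
                break;                     // acceptance rate is within the target range
            }
        }
        return width;
    }

    public static void main(String[] args) {
        // Toy acceptance model (hypothetical): acceptance falls as the width grows.
        DoubleUnaryOperator toy = w -> 1.0 / (1.0 + w);
        double tuned = tune(0.01, toy, 0.1, 0.4, 2.0, 100);
        System.out.println("tuned width = " + tuned + ", acceptance = " + toy.applyAsDouble(tuned));
    }
}
```

In a real sampler the acceptance rate would be estimated from recent burnin samples rather than supplied by a closed-form function.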

See Also:: Constant Field Values

MIN_ACCEPTANCE

public static final double MIN_ACCEPTANCE

During the burnin, the proposalWidthControlVariable for all McmcMove objects is adjusted (if McmcMove.autoTune=true) in order to ensure that the average acceptance rate is between MIN_ACCEPTANCE and MAX_ACCEPTANCE where possible. This is done by repeatedly multiplying the proposalWidthControlVariable by SPAN_MULTIPLIER until the acceptance falls within the desired range.

See Also:: Constant Field Values

MAX_ACCEPTANCE

public static final double MAX_ACCEPTANCE

See Also:: Constant Field Values

WINDOW_MULTIPLIER

public static double WINDOW_MULTIPLIER

Initial value for the alignment proposal window length multiplier.

MIN_SAMPLES_FOR_ACC_ESTIMATE

public static final double MIN_SAMPLES_FOR_ACC_ESTIMATE

Number of samples during burnin used to get a rough estimate of the current acceptance rate, for the purposes of tuning the proposal variance control parameters.

See Also:: Constant Field Values

log0

public static final double log0

log(0) is set to Double.NEGATIVE_INFINITY. This is used in logarithmic adding. The logarithm of an empty sum is set to this value.

See Also:: Constant Field Values

MIN_EDGE_LENGTH

public static final double MIN_EDGE_LENGTH

See Also:: Constant Field Values

MIN_SEQ_LENGTH

public static final int MIN_SEQ_LENGTH

Minimum length for internal node sequence.

See Also:: Constant Field Values

DOWNWEIGHT_INDEL_LIKELIHOOD

public static final boolean DOWNWEIGHT_INDEL_LIKELIHOOD

If true then we downweight the indel contribution to the overall likelihood.

See Also:: Constant Field Values

USE_INDEL_CORRECTION_FACTOR

public static final boolean USE_INDEL_CORRECTION_FACTOR

If true then we divide out the stationary probability of the internal nodes from the indel likelihood, as per Redelings and Suchard (2005), using the TKF92 stationary distribution defined in Thorne et al. (1992).

See Also:: Constant Field Values

LOW_COUNT_THRESHOLD

public static final int LOW_COUNT_THRESHOLD

See Also:: Constant Field Values

LOW_COUNT_MULTIPLIER

public static final double LOW_COUNT_MULTIPLIER

See Also:: Constant Field Values

SHAKE_IF_STUCK

public static final boolean SHAKE_IF_STUCK

If true then during the first half of the burnin if a particular McmcMove has been below its minimum acceptance rate for at least (LOW_COUNT_THRESHOLD * MIN_SAMPLES_FOR_ACC_ESTIMATE) iterations, then for the purposes of computing the acceptance ratio, we multiply the new log likelihood by LOW_COUNT_MULTIPLIER raised to a power that increases with the number of iterations beyond the threshold. This gradually favours the state jumping, which may be useful to avoid getting stuck in local modes during the burnin. Ideally such a scheme should not be needed, however.

See Also:: Constant Field Values

MAX_SILENT_LENGTH

public static final int MAX_SILENT_LENGTH

See Also:: Constant Field Values

SILENT_INSERT_PROB

public static final double SILENT_INSERT_PROB

See Also:: Constant Field Values

VERBOSE

public static boolean VERBOSE

Method Detail

logGammaDensity

public static double logGammaDensity(double x,
                                     double shape,
                                     double rate)

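Parameters:: x - the value at which the density is evaluated; shape - the shape parameter of the Gamma distribution; rate - the rate parameter of the Gamma distribution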
Returns:: The unnormalised log density of Gamma(x | shape, rate)

logBetaDensity

public static double logBetaDensity(double x,
                                    double alpha,
                                    double beta)

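Parameters:: x - the value at which the density is evaluated; alpha - the first shape parameter of the Beta distribution; beta - the second shape parameter of the Beta distribution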
Returns:: The unnormalised log density of Beta(x | alpha, beta)
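
As an illustration of what "unnormalised" could mean for the two densities above, the sketch below drops every term that does not depend on x. That reading is an assumption, as is the handling of values outside the support; the class name is hypothetical and this is not the StatAlign source.

```java
final class DensitySketch {
    /** Unnormalised Gamma log density: (shape - 1) * ln(x) - rate * x. */
    static double logGammaDensity(double x, double shape, double rate) {
        if (x <= 0.0) {
            return Double.NEGATIVE_INFINITY; // assumption: treat x <= 0 as outside the support
        }
        return (shape - 1.0) * Math.log(x) - rate * x;
    }

    /** Unnormalised Beta log density: (alpha - 1) * ln(x) + (beta - 1) * ln(1 - x). */
    static double logBetaDensity(double x, double alpha, double beta) {
        if (x <= 0.0 || x >= 1.0) {
            return Double.NEGATIVE_INFINITY; // assumption: treat the endpoints as outside the support
        }
        return (alpha - 1.0) * Math.log(x) + (beta - 1.0) * Math.log(1.0 - x);
    }

    public static void main(String[] args) {
        System.out.println(logGammaDensity(2.0, 3.0, 1.5));
        System.out.println(logBetaDensity(0.5, 2.0, 2.0));
    }
}
```

Dropping the normalising constant is harmless in a Metropolis-Hastings ratio when shape, rate, alpha and beta are the same in the numerator and denominator, since the constant then cancels.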

linearizerWeight

public static int linearizerWeight(int length,
                                   statalign.base.MuDouble selectLike,
                                   double expectedLength)

This function selects a random integer with expected value given by expectedLength. The probability of the selection of that particular index is returned in selectLike. (MuDouble is used to allow for a second return value; in C++ a double pointer or reference could be used instead.)

Parameters:: length - The length of the array we need.; selectLike - A mutable double object to return the selection probability; expectedLength - The expected window length.
Returns:: A random integer as described above

linearizerWeightProb

public static double linearizerWeightProb(int length,
                                          int index,
                                          double expectedLength)

This function returns the probability of choosing a particular index with linearizerWeight. The value returned is equal to mu.value when linearizerWeight(length, mu, expectedLength) returns index.

Parameters:: length - Distribution parameter as in linearizerWeight; index - Selected index; expectedLength - The expected window length, as in linearizerWeight
Returns:: Probability of the selection

weightedChoose

public static int weightedChoose(int[] weights)

This function returns a random index, weighted by the weights in the array `weights`.

weightedChoose

public static int weightedChoose(java.util.List<java.lang.Integer> weights)

This function returns a random index, weighted by the weights in the list `weights`.

weightedChoose

public static int weightedChoose(double[] weights,
                                 statalign.base.MuDouble selectLogLike)

Similar to weightedChoose(weights), but the log-probability of the selection is subtracted from the mutable double object selectLogLike, because the proposal probability appears in the denominator of the acceptance ratio. MuDouble is used to allow for a second return value; in C++ a double pointer or reference could be used instead.
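
The contract described here (draw an index in proportion to its weight, then subtract the log of the selection probability from a mutable holder) can be sketched as follows. `MutableDouble` is a hypothetical stand-in for statalign.base.MuDouble, the explicit `Random` argument replaces the shared `generator` field, and the exception on all-zero weights is an assumption; none of this is taken from the StatAlign source.

```java
import java.util.Random;

final class WeightedChooseSketch {
    /** Hypothetical stand-in for statalign.base.MuDouble. */
    static final class MutableDouble {
        double value;
        MutableDouble(double value) { this.value = value; }
    }

    /**
     * Returns index i with probability weights[i] / sum(weights) and subtracts
     * the log of that probability from selectLogLike.
     */
    static int weightedChoose(double[] weights, MutableDouble selectLogLike, Random rng) {
        double sum = 0.0;
        for (double w : weights) sum += w;

        double u = rng.nextDouble() * sum;  // uniform point in [0, sum)
        double acc = 0.0;
        int chosen = -1;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0.0) continue; // zero-weight entries can never be chosen
            chosen = i;                      // last positive entry doubles as a rounding fallback
            acc += weights[i];
            if (u < acc) break;
        }
        if (chosen < 0) {
            throw new IllegalArgumentException("no positive weight");
        }
        selectLogLike.value -= Math.log(weights[chosen] / sum);
        return chosen;
    }

    public static void main(String[] args) {
        MutableDouble logProposal = new MutableDouble(0.0);
        int k = weightedChoose(new double[] {1.0, 3.0}, logProposal, new Random(42));
        System.out.println("chose " + k + ", accumulated -log p = " + logProposal.value);
    }
}
```

Because the holder is updated in place, a caller can pass the same object through several selection steps and end up with the negated log-probability of the whole proposal.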

weightedChoose

public static int weightedChoose(double[] weights)

weightedChoose

public static int weightedChoose(java.util.List<java.lang.Double> weights,
                                 statalign.base.MuDouble selectLogLike)

chooseOne

public static int chooseOne(double prob,
                            statalign.base.MuDouble selectLogLike)

Behaves exactly like weightedChoose(new double[]{1-prob,prob}, selectLogLike), but faster
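
A two-outcome draw needs no cumulative sum, which is presumably where the speed-up comes from. The sketch below shows that shortcut under the same assumptions as the weightedChoose contract: `MutableDouble` is a hypothetical stand-in for statalign.base.MuDouble and the `Random` argument replaces the shared generator.

```java
import java.util.Random;

final class ChooseOneSketch {
    /** Hypothetical stand-in for statalign.base.MuDouble. */
    static final class MutableDouble {
        double value;
    }

    /** Returns 1 with probability prob, otherwise 0, and subtracts the log-probability of the outcome. */
    static int chooseOne(double prob, MutableDouble selectLogLike, Random rng) {
        if (rng.nextDouble() < prob) {
            selectLogLike.value -= Math.log(prob);
            return 1;
        }
        selectLogLike.value -= Math.log(1.0 - prob);
        return 0;
    }

    public static void main(String[] args) {
        MutableDouble logProposal = new MutableDouble();
        int outcome = chooseOne(0.25, logProposal, new Random(3));
        System.out.println("outcome " + outcome + ", accumulated -log p = " + logProposal.value);
    }
}
```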

logWeightedChoose

public static int logWeightedChoose(double[] logWeights,
                                    statalign.base.MuDouble selectLogLike)

Equivalent to weightedChoose(weights, selectLogLike) where logWeights[i] = Math.log(weights[i]), but avoids overflows that might result from exponentiation. (MuDouble is used to allow for a second return value; in C++ a double pointer or reference could be used instead.)
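
One common way to avoid the overflow mentioned above is to subtract the largest log weight before exponentiating. Whether StatAlign does exactly this is not stated here, so the following is a sketch of that standard technique with hypothetical names, assuming at least one log weight is finite.

```java
import java.util.Random;

final class LogWeightedChooseSketch {
    /** Turns log weights into normalised probabilities without overflow. */
    static double[] normalise(double[] logWeights) {
        double max = Double.NEGATIVE_INFINITY;
        for (double lw : logWeights) max = Math.max(max, lw);

        double[] p = new double[logWeights.length];
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            p[i] = Math.exp(logWeights[i] - max); // the largest term becomes exp(0) = 1
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= sum;
        return p;
    }

    /** Draws an index in proportion to exp(logWeights[i]). */
    static int logWeightedChoose(double[] logWeights, Random rng) {
        double[] p = normalise(logWeights);
        double u = rng.nextDouble();
        double acc = 0.0;
        int chosen = 0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] <= 0.0) continue; // entries with log weight -infinity are never chosen
            chosen = i;                // last positive entry doubles as a rounding fallback
            acc += p[i];
            if (u < acc) break;
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Math.exp(1000) overflows to Infinity, but the shifted version stays finite.
        double[] p = normalise(new double[] {1000.0, 1000.0});
        System.out.println(p[0] + " " + p[1]);
    }
}
```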

logWeightedChoose

public static int logWeightedChoose(double[] logWeights)

isValidHistory

public static boolean isValidHistory(boolean p,
                                     boolean g,
                                     boolean[] neighb)

For a tree of the form:

        gg
       /
      g
     / \
    p   u
  /  \
 t    b

this function determines valid possible indel states for p and g given fixed states for the neighbouring nodes.

Parameters:: p - The presence/absence of node p.; g - The presence/absence of node g.; neighb - An array indicating the state of the neighbouring nodes, in the order {t,b,u,gg}.
Returns:: A boolean value indicating whether the specified values of p and g are compatible with the neighbouring states.

isValidHistory

public static boolean isValidHistory(boolean p,
                                     boolean g,
                                     boolean[] neighb,
                                     boolean gIsRoot)

For a tree of the form:

        gg
       /
      g
     / \
    p   u
  /  \
 t    b

or, if gIsRoot = true, then for a tree of the form

      g
     / \
    p   u
  /  \
 t    b

this function determines valid possible indel states for p and g given fixed states for the neighbouring nodes.

Parameters:: gIsRoot - This is true if g is the root of the tree.; p - The presence/absence of node p.; g - The presence/absence of node g.; neighb - An array indicating the state of the neighbouring nodes, in the order {t,b,u,gg} (if gIsRoot=false), or {t,b,u} (if gIsRoot=true).
Returns:: A boolean value indicating whether the specified values of p and g are compatible with the neighbouring states.

convertTime

public static java.lang.String convertTime(long x)

Takes a time in milliseconds and converts to a string to be printed.

Parameters:: x - The time to be formatted, in milliseconds (as a long).
Returns:: A string to be printed.

logAdd

public static double logAdd(double a,
                            double b)

Adds two numbers in log space: given log(x) and log(y), returns log(x+y).

Parameters:: a - log(x); b - log(y)
Returns:: log(x+y)
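
A numerically stable way to compute log(x+y) from log(x) and log(y) is to factor out the larger argument. The sketch below shows that standard trick and how Double.NEGATIVE_INFINITY (the value of log0) acts as the identity for an empty sum; the class name and the `logSum` helper are hypothetical and the StatAlign implementation may differ.

```java
final class LogAddSketch {
    /** log(0), the identity element of log-space addition. */
    static final double LOG0 = Double.NEGATIVE_INFINITY;

    /** Returns log(exp(a) + exp(b)) without exponentiating large values. */
    static double logAdd(double a, double b) {
        if (a == LOG0) return b;
        if (b == LOG0) return a;
        double hi = Math.max(a, b);
        double lo = Math.min(a, b);
        return hi + Math.log1p(Math.exp(lo - hi)); // lo - hi <= 0, so exp cannot overflow
    }

    /** Sums any number of log-space terms; an empty array gives LOG0. */
    static double logSum(double[] logTerms) {
        double total = LOG0;
        for (double t : logTerms) total = logAdd(total, t);
        return total;
    }

    public static void main(String[] args) {
        double result = logAdd(Math.log(2.0), Math.log(3.0));
        System.out.println(Math.exp(result)); // close to 5.0, up to rounding
    }
}
```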

calcEmProb

public static double calcEmProb(double[] fel,
                                double[] aaEquDist)

Calculates emission probability from Felsenstein likelihoods

repeatedString

public static java.lang.String repeatedString(java.lang.String s,
                                              int n)

iterate

public static <T> java.lang.Iterable<T> iterate(java.util.Enumeration<T> en)

Makes Enumeration iterable.

Type Parameters:: T - Enumeration element type
Parameters:: en - the Enumeration
Returns:: an Iterable that can iterate through the elements of the Enumeration
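
The purpose of such an adapter is to let a legacy Enumeration be used in a for-each loop. A minimal sketch of one way to write it is shown below; the class name is hypothetical, and because every iterator it hands out reads from the same Enumeration, the result can only be traversed once.

```java
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.Vector;

final class IterateSketch {
    /** Wraps an Enumeration so that it can be used in a for-each loop (single pass only). */
    static <T> Iterable<T> iterate(final Enumeration<T> en) {
        return () -> new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return en.hasMoreElements();
            }

            @Override
            public T next() {
                return en.nextElement();
            }
        };
    }

    public static void main(String[] args) {
        Vector<String> names = new Vector<>(Arrays.asList("seq1", "seq2", "seq3"));
        for (String name : iterate(names.elements())) {
            System.out.println(name);
        }
    }
}
```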

joinStrings

public static java.lang.String joinStrings(java.lang.Object[] strs,
                                           java.lang.String separator)

Joins strings using a separator string. Accepts any Objects converting them to strings using their toString method.

Parameters:: strs - strings to join; separator - the separator string
Returns:: a string made up of the strings separated by the separator

joinStrings

public static java.lang.String joinStrings(java.lang.Object[] strs,
                                           java.lang.String prefix,
                                           java.lang.String separator)

Joins strings using a prefix and a separator string. Accepts any Objects converting them to strings using their toString method.

Parameters:: strs - strings to join; prefix - prefix for each string; separator - the separator string
Returns:: a string made up of the strings with the given prefix and separated by the separator
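
To make the two joinStrings descriptions concrete, here is a small sketch of the behaviour they suggest. It assumes the prefix is placed before every element, including the first, and that the separator appears only between elements; the class name is hypothetical and the actual output format of the StatAlign methods is not confirmed by this page.

```java
final class JoinStringsSketch {
    /** Joins the elements with the separator between them. */
    static String joinStrings(Object[] strs, String separator) {
        return joinStrings(strs, "", separator);
    }

    /** Joins the elements, putting the prefix before each one and the separator between them. */
    static String joinStrings(Object[] strs, String prefix, String separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < strs.length; i++) {
            if (i > 0) sb.append(separator);
            sb.append(prefix).append(strs[i]); // append(Object) converts via String.valueOf
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinStrings(new Object[] {"A", 2, 3.5}, ", "));
        System.out.println(joinStrings(new Object[] {"x", "y"}, "-", " "));
    }
}
```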

classesInPackage

public static java.util.List<java.lang.String> classesInPackage(java.lang.String packageName)

Finds all classes in a given package and all of its subpackages by walking through class path. Handles both directories and jar files.

Parameters:: packageName - the package in which the classes are searched for
Returns:: list of found class names (with full package prefixes)

findPlugins

public static <T> java.util.List<T> findPlugins(java.lang.Class<T> superClass)

Locates all plugins that are descendants of the specified plugin superclass. The plugins are expected to be in the package root.plugins where root refers to the package of the superclass.

Parameters:: superClass - the ancestral plugin class
Returns:: list of plugins found

alignmentTransformation

public static java.lang.String[] alignmentTransformation(java.lang.String[] s,
                                                         java.lang.String[] names,
                                                         java.lang.String type,
                                                         InputData input)

Transforms an alignment into the prescribed format

Parameters:: s - String array containing the alignment in StatAlign format; type - The name of the format, which may be "StatAlign", "Clustal", "Fasta", "Phylip" or "Nexus"; input - The input data. This is needed for the Nexus format, which requires a name for the alignment (set to input.title), the type of the alignment (either nucleotide or protein, read from input.model), and the list of characters in the substitution model (also read from input.model).
Returns:: String array containing the alignment in the prescribed format

copyOf

public static char[] copyOf(char[] array)

copyOf

public static int[] copyOf(int[] array)

copyOf

public static double[] copyOf(double[] array)

minMax

public static int minMax(int value,
                         int min,
                         int max)
