mems Namespace Reference


Classes

struct  _aFILE
struct  _aIORec
class  AbsolutComparator
class  AbstractGappedAlignment
class  AbstractMatch
 AbstractMatch is a pure virtual base class that defines an interface for both gapped and ungapped alignments among several sequences or several regions of the same sequence. More...

class  AbstractMatchSingleStartComparator
class  AbstractMatchStartComparator
class  Aligner
 Used to find locally collinear blocks (LCBs) and do recursive alignments on the blocks. To create an alignment one need only use the align method. More...

class  AlignmentTreeNode
 A class that stores alignment-related information as a node in a phylogenetic tree. More...

struct  AlnProgressTracker
struct  bb_entry_s
struct  bin_s
struct  bmer
struct  buffer_list_s
struct  buffer_s
class  CgaBsComp
class  ClustalInterface
class  CompactGappedAlignment
 The CompactGappedAlignment stores a gapped alignment as a bit-vector. Rather than using one byte per aligned position, this class uses one bit, making it particularly space efficient. More...

class  DenseAbstractMatch
 The DenseAbstractMatch implements the AbstractMatch interface in a way that is most efficient when Multiplicity and SeqCount are identical or nearly so. More...

struct  device_s
class  DNAFileSML
 The seed pattern for DNA SMLs must be palindromic. More...

class  DNAMemorySML
 The DNAMemorySML is an implementation of sorted mer lists which creates and stores the sorted mer list entirely in memory. More...

class  EvenFasterSumOfPairsBreakpointScorer
class  FileSML
class  GappedAligner
class  GappedAlignment
class  GenericInterval
class  GenericIntervalList
 This class represents a set of Intervals, each of which is a collinear aligned region. There are functions to read and write a GenericIntervalList. More...

class  GenericMatchList
class  GenericMatchSeqManipulator
class  gnAlignedSequences
 gnAlignedSequences allows for the manipulation of aligned sequence data. More...

class  GreedyRemovalScorer
class  HssCols
class  HybridAbstractMatch
 The HybridAbstractMatch implements the AbstractMatch interface in a way that allows matches with a large SeqCount and low Multiplicity to be stored efficiently. More...

struct  idmer
struct  iodevice_s
class  Island
 A class to represent an island in an alignment. More...

class  IvTrackerComp
class  LabeledMem
 A mem labeled with a number. More...

class  LabeledMemComparator
 Compares Matches labeled with a number. More...

class  LabelSort
class  LCB
 This class is used to track relationships between LCBs during the LCB determination process. More...

class  LCBIDComparator
class  LCBLeftComparator
 Compares LCBs. More...

class  LCBLeftEndComp
class  LcbTrackingMatch
 A wrapper that maps a match among extant sequences to a match among ancestral and extant seqs. More...

class  MaskedMemHash
 Finds matches that meet a particular sequence mask, e.g. More...

class  MatchFinder
 This pure virtual class implements a general framework for finding exactly matching mers. More...

class  MatchHashEntry
 The Match class stores the location of an equal-size (inexactly or exactly) matching region between several sequences. More...

class  MatchLeftEndComparator
class  MatchProjectionAdapter
 MatchProjectionAdapter is a wrapper around an AbstractMatch that effectively projects a multi-match to a subset match. More...

class  MatchStartComparator
class  MemHash
 MemHash implements an algorithm for finding exact matches of a certain minimal length in several sequences. More...

class  MemorySML
 The MemorySML is an implementation of sorted mer lists which creates and stores the sorted mer list entirely in memory. More...

class  MerCompare
class  MheCompare
class  MoveScoreHeapComparator
class  MuscleInterface
class  PairwiseMatchAdapter
 PairwiseMatchAdapter is a wrapper around an AbstractMatch that effectively projects a multi-match to a pairwise match. More...

class  PairwiseMatchFinder
 Finds all pairwise matches with unique seeds among a group of sequences. More...

class  PairwiseScoringScheme
class  ParallelMemHash
 When built without OpenMP, the ParallelMemHash is just a stub wrapper around MemHash. More...

class  PlacementMatch
 A match with an associated list iterator. More...

class  PlacementMatchComparator
 Compares Matches. More...

class  ProgressiveAligner
 Computes multiple genome alignments using a progressive alignment algorithm. More...

struct  record_s
class  RepeatHash
 Finds repeats within a single sequence. More...

class  RepeatMatch
 The Match class stores the location of an equal-size (inexactly or exactly) matching region between several sequences. More...

class  RepeatMatchList
class  SearchCacheComparator
 used for search cache lookups More...

class  SeedOccurrenceList
struct  seqbuf_s
class  SimpleBreakpointScorer
 A breakpoint scorer that applies a fixed penalty for each breakpoint that exists in a set of two or more sequences. More...

class  SingleStartComparator
class  SlotAllocator
 This class allocates memory according to the slot allocation scheme for fixed size objects. More...

struct  sml_s
struct  SMLHeader
struct  SMLHeader_s
struct  sort_buf_s
class  SortedMerList
 A base class which defines an interface common to all sorted mer lists. More...

class  SparseAbstractMatch
 The SparseAbstractMatch implements the AbstractMatch interface in a way that allows matches with a large SeqCount and low Multiplicity to be stored efficiently. More...

class  SSC
class  SuperInterval
 A class that stores an alignment and coordinate mapping between collinear segments of an ancestral genome and two descendant genomes. More...

class  SuperIntervalManipulator
class  SuperIntervalPtrComp
class  TrackingLCB
 This class is used to track relationships between LCBs during the LCB determination process. More...

class  UngappedLocalAlignment
 The UngappedLocalAlignment class stores the location of an equal-size (inexactly or exactly) matching region between several sequences. More...

struct  working_set_s
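
The bit-vector idea described for CompactGappedAlignment can be illustrated with a minimal sketch. The type BitRowAlignment and its member residuesBefore are hypothetical names invented for this illustration and are not part of the mems namespace; the sketch assumes only that each alignment row records, per column, whether a residue or a gap is present. It uses std::vector<bool>, which implementations typically pack to one bit per element.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration only; this is NOT the libMems CompactGappedAlignment API.
// Each alignment row is stored as one flag per column:
// true = a residue is present, false = a gap.
struct BitRowAlignment {
    std::vector<std::vector<bool>> rows;

    // Build the bit rows from gapped strings such as "AC--GT".
    explicit BitRowAlignment(const std::vector<std::string>& gapped) {
        for (const std::string& s : gapped) {
            std::vector<bool> bits(s.size());
            for (std::size_t i = 0; i < s.size(); ++i)
                bits[i] = (s[i] != '-');
            rows.push_back(bits);
        }
    }

    // Count the residues in row `seq` that lie strictly before column `col`.
    // This maps an alignment column back to an offset within the sequence.
    std::size_t residuesBefore(std::size_t seq, std::size_t col) const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < col && i < rows[seq].size(); ++i)
            n += rows[seq][i] ? 1 : 0;
        return n;
    }
};
```

In this sketch the residues themselves are not stored; a real implementation would presumably recover them from the underlying sequence using the start coordinate of each row together with a count such as residuesBefore.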

Typedefs

typedef boost::tuple< CompactGappedAlignment<> *, vector< bitset_t > *, AbstractMatch * > _sort_tracker_type
typedef GenericMatchSeqManipulator AbstractMatchSeqManipulator
typedef mems::_aFILE aFILE
typedef mems::_aIORec aIORec
typedef std::vector< std::vector< ULA * > > backbone_list_t
typedef mems::bb_entry_s bb_entry_t
typedef std::vector< std::pair< int64, int64 > > bb_seqentry_t
typedef mems::bin_s bin_t
typedef boost::dynamic_bitset bitset_t
typedef mems::buffer_list_s buffer_list_t
typedef buffer_s buffer_t
typedef DenseAbstractMatch< 128 > DenseAbstractMatch128
typedef DenseAbstractMatch< 16 > DenseAbstractMatch16
typedef DenseAbstractMatch< 2 > DenseAbstractMatch2
typedef DenseAbstractMatch< 32 > DenseAbstractMatch32
typedef DenseAbstractMatch< 4 > DenseAbstractMatch4
typedef DenseAbstractMatch< 64 > DenseAbstractMatch64
typedef DenseAbstractMatch< 8 > DenseAbstractMatch8
typedef mems::device_s device_t
typedef dmtimer_s dmtimer_t
typedef boost::multi_array< hss_list_t, 3 > hss_array_t
typedef std::vector< HssCols > hss_list_t
typedef std::list< idmer > IdmerList
typedef GenericInterval Interval
typedef GenericIntervalList IntervalList
typedef mems::iodevice_s iodevice_t
typedef HssCols IslandCols
typedef pair< size_t, Interval * > iv_tracker_t
typedef unsigned long long mask_t
typedef UngappedLocalAlignment< HybridAbstractMatch<> > Match
typedef void * MatchID_t
typedef GenericMatchList< Match * > MatchList
typedef vector< size_t > neighbor_t
typedef unsigned long long offset_t
typedef boost::multi_array< vector< pair< size_t, size_t > >, 3 > pairwise_genome_hss_t
typedef boost::multi_array< std::vector< TrackingLCB< TrackingMatch * > >, 2 > PairwiseLCBMatrix
typedef unsigned position_t
typedef mems::record_s record_t
typedef signed short sarID_t
typedef int score_t
typedef std::pair< mems::Match *, mems::Match * > search_cache_t
 a cache type to remember which intervals have already been searched

typedef mems::seqbuf_s seqbuf_t
typedef mems::sml_s sml_t
typedef mems::SMLHeader_s SMLHeader_t
typedef uint32 smlSeqI_t
typedef mems::sort_buf_s sort_buf_t
typedef LcbTrackingMatch< mems::AbstractMatch * > TrackingMatch
typedef unsigned uint32
typedef unsigned long long uint64
typedef unsigned char uint8
typedef mems::UngappedLocalAlignment< mems::HybridAbstractMatch<> > ULA
typedef mems::working_set_s working_set_t

Enumerations

enum  { A_READ, A_WRITE }
enum  { DEV_FREE, DEV_BUSY }
enum  { OP_PENDING = -2, OP_FINISHED = -1, OP_NONE = 0 }
enum  { CalculateHistogram = 0, CopyPointers = 1, QsortPointers = 2, CopyData = 3 }
enum  dm_errors {
  SUCCESS, TOO_FEW_BINS, TOO_MANY_BINS, INPUT_NOT_OPENED,
  INVALID_WS_SIZE, SEQUENCE_TOO_SHORT, OUTPUT_NOT_OPENED, INVALID_NUMRECS,
  NO_FREE_BUFFERS, BIN_NOT_OPENED
}

Functions

void AaronsLCB (MatchList &mlist, set< uint > &breakpoints)
 Function to determine the breakpoints in a set of matches.

int aClose (aFILE *file)
template<typename ListType, typename Iter> void AddGapMatches (ListType &the_list, const Iter &first, const Iter &last, uint seqI, int64 left_end, int64 right_end, AbstractMatch::orientation seq_orient, uint seq_count)
void addGuy (uint seqI, AbstractMatch::orientation orient, std::vector< AbstractMatch * > &new_ivs, vector< Interval * > &new_list)
void AddTo64 (unsigned int amt, unsigned int *hi, unsigned int *lo)
template<class AbstractMatchType> void AddToMatchIdentityMatrix (const AbstractMatchType &amt, const std::vector< genome::gnSequence * > &seq_table, NumericMatrix< double > &identity)
void addUnalignedIntervals (IntervalList &iv_list, set< uint > seq_set, vector< gnSeqI > seq_lengths)
 Find regions in each sequence that do not belong to any LCB, add them to their own Interval (LCB) in the IntervalList.

template<class MatchVectorType> void addUnalignedIntervals_v2 (MatchVectorType &iv_list, std::set< uint > seq_set, std::vector< gnSeqI > seq_lengths)
template<class IntervalListType> void addUnalignedRegions (IntervalListType &iv_list)
int aFileBusy (aFILE *file)
void aFlush (aFILE *file)
template<typename MatchVector> void alignedNtCountFilter (MatchVector &matches, uint length)
void AlignLCBInParallel (bool collinear_genomes, mems::GappedAligner *gal, MatchList &mlist, Interval &iv, AlnProgressTracker &apt)
buffer_t * AllocateFree (void)
aFILE * aOpen (const char *path, int mode)
int aOperationComplete (aFILE *file, int operation)
template<class T, class Manipulator> void applyAncestralBreakpoints (const vector< SuperInterval > &siv_list, vector< T > &ord, uint seqI, Manipulator &m)
template<class T, class Maniplator> void applyBreakpoints (std::vector< gnSeqI > &bp_list, std::vector< T > &iv_list, Maniplator &manip)
int aRead (void *buffer, offset_t size, offset_t count, aFILE *file, offset_t pos)
unsigned long long aStatFileSize (const char *path)
unsigned long aStatSize (const char *path)
void aUpdateOperations (aFILE *file)
void aWaitComplete (aFILE *file, int operation)
void aWaitNotBusy (aFILE *file)
int aWrite (void *buffer, offset_t size, offset_t count, aFILE *file, offset_t pos)
template<class AbstractMatchVectorType> void BackboneIdentityMatrix (const AbstractMatchVectorType &matches, const std::vector< genome::gnSequence * > &seq_table, NumericMatrix< double > &identity)
void BinningPhase (void)
int bmer_compare (const void *a_v, const void *m_v)
bool bmer_id_lessthan (const bmer &a_v, const bmer &m_v)
bool bmer_lessthan (const bmer &a_v, const bmer &m_v)
int cachecomp (const void *e1, const void *e2)
offset_t CalculateDataReadSize (buffer_t *b)
int CalculateSortWriteSize (int sortI)
boolean checkCollinearity (Match *m1, Match *m2)
bool checkConsistent (const AbstractMatch *a, const AbstractMatch *b)
void checkForAllGapColumns (IntervalList &iv_list)
 sanity checks for alignment columns that contain only gaps

void chooseNextAlignmentPair (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t &node1, node_id_t &node2, node_id_t &ancestor)
 Select the next pair of nodes to align. The chosen pair will either be unaligned extant sequences or unaligned ancestral sequences whose descendants have all been aligned.

void collapseCollinear (IntervalList &iv_list)
 collapse Intervals that are trivially collinear with each other

int comp_keys (record_t a, record_t b)
int CompareKeys (const record_t *r1, const record_t *r2)
int CompareKeys_qsort_wrapper (const void *r1, const void *r2)
void ComplementHss (const size_t alignment_length, hss_list_t &hss_list, hss_list_t &island_col_list, uint seqI=0, uint seqJ=0)
int ComputeAsciiBinNumber (const unsigned char key[10])
int ComputeBinNumber (const unsigned char key[10])
void computeConsensusScore (const std::vector< std::string > &alignment, const PairwiseScoringScheme &pss, std::vector< score_t > &scores, std::string &consensus, score_t &score)
void computeGapScores (const std::string &seq1, const std::string &seq2, const PairwiseScoringScheme &scoring, std::vector< score_t > &scores)
double computeGC (std::vector< genome::gnSequence * > &seq_table)
 compute the GC content of a set of sequences

double computeGC (std::vector< gnSequence * > &seq_table)
double computeID (GappedAlignment &gal, size_t seqI, size_t seqJ)
void computeLCBAdjacencies_v2 (IntervalList &iv_list, vector< int64 > &weights, vector< LCB > &adjacencies)
 Redesign to be more intuitive.

void computeLCBAdjacencies_v2 (vector< MatchList > &lcb_list, vector< int64 > &weights, vector< LCB > &adjacencies)
void computeLCBAdjacencies_v3 (mems::IntervalList &iv_list, std::vector< double > &weights, std::vector< mems::LCB > &adjacencies)
 Redesign to be more intuitive.

template<class MatchVector> void computeLCBAdjacencies_v3 (const std::vector< MatchVector > &lcb_list, std::vector< double > &weights, std::vector< mems::LCB > &adjacencies)
void ComputeLCBs (MatchList &meml, set< uint > &breakpoints, vector< MatchList > &lcb_list, vector< int64 > &weights)
template<class MatchVector> void ComputeLCBs_v2 (const MatchVector &meml, const std::vector< gnSeqI > &breakpoints, std::vector< MatchVector > &lcb_list)
void computeMatchScores (const std::string &seq1, const std::string &seq2, const PairwiseScoringScheme &scoring, std::vector< score_t > &scores)
int ComputeNNNNNBinNumber (const unsigned char key[10])
void computeSPScore (const std::vector< string > &alignment, const PairwiseScoringScheme &pss, std::vector< score_t > &scores, score_t &score)
void computeSPScore (const std::vector< std::string > &alignment, const PairwiseScoringScheme &pss, std::vector< score_t > &scores, score_t &score)
template<class Tree> bool containsNode (Tree &t, node_id_t subtree_nodeI, node_id_t query_nodeI)
 Depth first search to check whether a subtree contains a given node.

uint64 countGaps (string &seq)
size_t countUnrefined (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t ancestor)
 CREATE_EXCEPTION (SMLMergeError)
 Thrown when there is an error merging two sorted mer lists.

 CREATE_EXCEPTION (SMLCreateError)
 Thrown when there is an error creating a sorted mer list.

 CREATE_EXCEPTION (InvalidArgument)
 CREATE_EXCEPTION (InvalidData)
 InvalidData exceptions are thrown when the input to an algorithm is invalid.

 CREATE_EXCEPTION (AlignerError)
 Thrown if some error occurs during alignment.

node_id_t createAlignmentTreeRoot (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t node1, node_id_t node2)
void createBackboneList (const IntervalList &iv_list, backbone_list_t &ula_list)
uint8 * CreateBasicDNATable ()
void CreateGapSearchList (std::vector< LCB > &adjacencies, const std::vector< genome::gnSequence * > &seq_table, std::vector< std::vector< int64 > > &iv_regions, boolean entire_genome)
void CreateGapSearchList (vector< LCB > &adjacencies, const vector< gnSequence * > &seq_table, vector< vector< int64 > > &iv_regions, boolean entire_genome)
 iv_regions -- lists of intervening regions between LCBs in each sequence. Start positions are organized as iv_regions[ seqI ][ lcbI * 2 ]; end positions are organized as iv_regions[ seqI ][ lcbI * 2 + 1 ].

template<typename MatchVector> void createMap (const MatchVector &mv_from, const MatchVector &mv_to, vector< size_t > &map)
void detectAndApplyBackbone (AbstractMatch *m, std::vector< genome::gnSequence * > &seq_table, CompactGappedAlignment<> *&result, backbone_list_t &bb_list, const Params &hmm_params, boolean left_homologous=false, boolean right_homologous=false)
 Applies pairwise transitive homology statistics to detect backbone in a single collinear alignment. Unaligns any regions found to be non-homologous and returns coordinates of the homologous segments in bb_list.

void detectAndApplyBackbone (IntervalList &iv_list, backbone_list_t &bb_list, const Params &hmm_params)
 Applies pairwise transitive homology statistics to detect backbone in a genome alignment. Unaligns any regions found to be non-homologous and returns coordinates of the homologous segments in bb_list.

void detectAndApplyBackbone (AbstractMatch *m, vector< gnSequence * > &seq_table, CompactGappedAlignment<> *&result, backbone_list_t &bb_list, const Params &hmm_params, boolean left_homologous, boolean right_homologous)
void DisplayStatus (void)
void DisplayStatusHeader (void)
void DistanceMatrix (IntervalList &iv_list, NumericMatrix< double > &distmat)
void DistanceMatrix (uint seq_count, const std::vector< std::pair< uint64, uint64 > > &detail_list, NumericMatrix< double > &distance)
void DistanceMatrix (const MatchList &mlist, NumericMatrix< double > &identity)
int dmSML (const char *input_file, const char *output_file, const char *const *scratch_paths, uint64 seed)
int dmsort (void)
void DoBinning (void)
void DoReading (void)
void EliminateOverlaps (MatchList &ml)
 Deletes overlapping regions in a set of matches.

template<class MatchVector> void EliminateOverlaps_v2 (MatchVector &ml, bool eliminate_both=false)
template<class MatchVector> void EliminateOverlaps_v2 (MatchVector &ml, const std::vector< uint > &seq_ids, bool eliminate_both=false)
 Delete overlapping regions in favor of the larger match.

void EnsureAllOperationsComplete (void)
void extendRootBranches (PhyloTree< AlignmentTreeNode > &alignment_tree)
void filterMatches (vector< LCB > &adjacencies, vector< MatchList > &lcb_list, vector< int64 > &weights)
 Takes a set of filtered LCB adjacencies and an unfiltered set of matches as input; returns a filtered set of matches that reflects the LCBs found.

template<class MatchVector> void filterMatches_v2 (std::vector< mems::LCB > &adjacencies, std::vector< MatchVector > &lcb_list, std::vector< double > &weights, MatchVector &deleted_matches)
 Takes a set of filtered LCB adjacencies and an unfiltered set of matches as input; returns a filtered set of matches that reflects the LCBs found.

template<class T, class S> void findAndErase (T &container, S &item)
template<class MatchVector> void FindBoundaries (const MatchVector &matches, std::vector< gnSeqI > &left_ends, std::vector< gnSeqI > &lengths, std::vector< bool > &orientations)
void findHssExcursions (std::vector< score_t > scores, score_t significance_threshold, hss_list_t &hss_list, uint seqI, uint seqJ, boolean left_hss, boolean right_hss)
template<typename MatchVector> void findHssHomologyHMM (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, hss_array_t &hss_array, const Params &hmm_params, boolean left_homologous, boolean right_homologous)
void findHssHomologyHMM (std::vector< std::string > &aln_table, hss_list_t &hss_list, uint seqI, uint seqJ, const Params &hmm_params, boolean left_homologous, boolean right_homologous)
template<typename MatchVector> void findHssRandomWalk (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, const PairwiseScoringScheme &scoring, score_t significance_threshold, hss_array_t &hss_array, boolean left_homologous=false, boolean right_homologous=false)
template<typename MatchVector> void findHssRandomWalk_v2 (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, const PairwiseScoringScheme &scoring, score_t significance_threshold, hss_array_t &hss_array, boolean left_homologous, boolean right_homologous)
template<typename MatchVector> void findHssRandomWalkCga (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, const PairwiseScoringScheme &scoring, score_t significance_threshold, std::vector< CompactGappedAlignment<> * > &hss_list)
void findHssRandomWalkScoreVector (std::vector< score_t > scores, score_t significance_threshold, hss_list_t &hss_list, uint seqI=0, uint seqJ=0, boolean left_homologous=false, boolean right_homologous=false)
void findIslandsBetweenLCBs (IntervalList &iv_list, uint island_size, ostream &island_out)
template<typename MatchVector> void findIslandsRandomWalk (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, const PairwiseScoringScheme &scoring, score_t significance_threshold, std::vector< Island > &island_list)
template<typename MatchVector> void findIslandsRandomWalkCga (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, const PairwiseScoringScheme &scoring, score_t significance_threshold, std::vector< CompactGappedAlignment<> * > &island_list)
void findMidpoint (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t &n1, node_id_t &n2)
 finds the midpoint of a phylogenetic tree, returns the ids of the surrounding nodes in n1 and n2

void findMscFromExcursions (std::vector< score_t > scores, score_t significance_threshold, hss_list_t &hss_list, hss_list_t &msc_list, uint seqI, uint seqJ, boolean left_hss, boolean right_hss)
void findRightEndpoint (size_t seqI, size_t seqJ, score_t significance_threshold, std::vector< score_t > &scores, hss_list_t &hss_list)
void FinishBinning ()
const char * Fmt (const char *fmt,...)
lint get_aln_score (void)
const std::vector< std::string > & GetAlignment (const GappedAlignment &ga, const std::vector< genome::gnSequence * > &seq_table)
void GetAlignment (const AbstractMatch &ga, const std::vector< genome::gnSequence * > &seq_table, std::vector< std::string > &alignment)
template<typename MatchVector> void getBpList (MatchVector &mvect, uint seq, vector< gnSeqI > &bp_list)
char * getCharmap ()
double getDefaultBpDistEstimateMinScore (std::vector< gnSequence * > &sequences)
unsigned getDefaultBreakpointMax (const std::vector< genome::gnSequence * > &seq_table)
double getDefaultBreakpointPenalty (std::vector< genome::gnSequence * > &sequences)
double getDefaultBreakpointPenalty (std::vector< gnSequence * > &sequences)
PairwiseScoringScheme & getDefaultScoringScheme ()
void getGapBounds (vector< gnSeqI > &seq_lengths, vector< LCB > &adjacencies, uint seqJ, int leftI, int rightI, int64 &left_start, int64 &right_start)
boolean getInterveningCoordinates (const AbstractMatch *iv, uint oseqI, Match *r_begin, Match *r_end, uint seqI, int64 &gap_lend, int64 &gap_rend)
boolean getInterveningCoordinates (std::vector< genome::gnSequence * > &seq_table, Match *r_begin, Match *r_end, uint seqI, int64 &gap_lend, int64 &gap_rend)
void GetLCBCoverage (MatchList &lcb, uint64 &coverage)
neighbor_t & getNeighbor (pair< neighbor_t, neighbor_t > &entry, int direction)
template<class MatchVector> double GetPairwiseAnchorScore (MatchVector &lcb, std::vector< genome::gnSequence * > &seq_table, const mems::PairwiseScoringScheme &subst_scoring, mems::SeedOccurrenceList &sol_1, mems::SeedOccurrenceList &sol_2, bool penalize_gaps=false)
 computes an anchoring score for the matches contained inside an LCB

void getPairwiseLCBs (uint nI, uint nJ, uint dI, uint dJ, vector< TrackingMatch * > &tracking_matches, vector< TrackingLCB< TrackingMatch * > > &t_lcbs, boost::multi_array< double, 3 > &tm_score_array, boost::multi_array< size_t, 3 > &tm_lcb_id_array)
 Computes all pairwise LCBs from a set of tracking matches.

int64 greedyBreakpointElimination (gnSeqI minimum_weight, vector< LCB > &adjacencies, vector< int64 > &weights, ostream *status_out)
 Version 2 of this algorithm: each time two LCBs coalesce, repeatedly search their intervening region until either a single LCB exists or all LCBs meet the current minimum_weight.

template<class BreakpointScorerType> int64 greedyBreakpointElimination_v4 (std::vector< mems::LCB > &adjacencies, std::vector< double > &scores, BreakpointScorerType &bp_scorer, std::ostream *status_out, size_t g1_tag=0, size_t g2_tag=0)
template<class SearchScorer> double greedySearch (SearchScorer &spbs)
 finds the best anchoring, returns the anchoring score

void HandleBinWriteCompletions (void)
void HandleReadingCompletions (void)
void HandleSeqbufWriteCompletions (void)
template<typename MatchVector> void HssArrayToCga (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, hss_array_t &hss_array, std::vector< CompactGappedAlignment<> * > &cga_list)
template<typename MatchVector> void HssColsToIslandCols (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, hss_array_t &hss_array, hss_array_t &island_col_array)
template<typename MatchVector> void hssColsToIslandCols (const MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, std::vector< HssCols > &hss_list, std::vector< IslandCols > &island_col_list)
template<class MatchVector> void IdentifyBreakpoints (MatchVector &mlist, std::vector< gnSeqI > &breakpoints)
void IdentityMatrix (const IntervalList &iv_list, NumericMatrix< double > &identity)
template<class AbstractMatchVectorType> void IdentityMatrix (const AbstractMatchVectorType &matches, const std::vector< genome::gnSequence * > &seq_table, NumericMatrix< double > &identity)
bool idmer_greaterthan (idmer &a_v, idmer &m_v)
bool idmer_id_lessthan (idmer &a_v, idmer &m_v)
bool idmer_lessthan (idmer &a_v, idmer &m_v)
bool idmer_position_lessthan (idmer &a_v, idmer &m_v)
int InitdmSML (long working_mb, long buffer_size, const char *input_filename, const char *output_filename, const char *const *scratch_paths, uint64 seed)
buffer_list_t * InitList (buffer_list_t *list)
void InitRadixSort (sort_buf_t *sortbuf, buffer_t *scratch_buffer)
SMLHeader_t InitSML (aFILE *file, uint64 file_size, uint64 seed)
void InitTime ()
void initTrackingMatchLCBTracking (const std::vector< TrackingMatch > &tracking_matches, size_t n1_count, size_t n2_count, boost::multi_array< size_t, 3 > &tm_lcb_id_array)
 creates an appropriately sized matrix for mapping individual TrackingMatches to their containing LCBs

template<typename ListType, typename Ty> void insert (ListType &the_list, const typename ListType::iterator &iter, Ty &val)
template<typename ListType, typename RanIt, typename Ty> void insert (ListType &the_list, std::reverse_iterator< RanIt > &riter, Ty &val)
int IsDenseEnough (GappedAlignment *gal_iter)
template<typename MatchListType> void LoadAndCreateRawSequences (MatchListType &mlist, std::ostream *log_stream)
 Loads the sequences designated by the elements of the seq_filename vector and creates temporary RAW sequence files.

template<typename MatchListType> void LoadMFASequences (MatchListType &mlist, const std::string &mfa_filename, std::ostream *log_stream)
 Loads sequences to align from a Multi-FastA file. The genome::gnSequence and SortedMerList objects are created on the heap and are not deallocated when this class is destroyed.

template<typename MatchListType> void LoadSequences (MatchListType &mlist, std::ostream *log_stream)
 Attempts to load the sequences designated by the elements of the seq_filename vector.

template<typename T> T * m_allocateAndCopy (const T &t)
template<typename T> void m_free (T *t)
template<typename MatchListType> void makeAlignmentTree (PhyloTree< AlignmentTreeNode > &alignment_tree, MatchListType &mlist, vector< uint > &node_sequence_map)
void makeAllPairwiseGenomeHSS (IntervalList &iv_list, vector< CompactGappedAlignment<> * > &iv_ptrs, vector< CompactGappedAlignment<> * > &iv_orig_ptrs, pairwise_genome_hss_t &hss_cols, const Params &hmm_params)
int MakeWorkingSet (working_set_t *ws, offset_t goalsize, offset_t minrecs, offset_t maxrecs)
void markAligned (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t subject_node, node_id_t neighbor)
void markAsRefined (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t ancestor)
void maskNNNNN (const genome::gnSequence &in_seq, genome::gnSequence &out_seq, std::vector< int64 > &seq_coords, int mask_n_length)
void maskNNNNN (const gnSequence &in_seq, gnSequence &out_seq, vector< int64 > &seq_coords, int mask_n_length)
template<class AbstractMatchType> void MatchIdentityMatrix (const AbstractMatchType &amt, const std::vector< genome::gnSequence * > &seq_table, NumericMatrix< double > &identity)
void mergePairwiseHomologyPredictions (vector< CompactGappedAlignment<> * > &iv_orig_ptrs, pairwise_genome_hss_t &hss_cols, vector< vector< ULA * > > &ula_list)
void mergeUnalignedIntervals (uint seqI, vector< Interval * > &iv_list, vector< Interval * > &new_list)
void msaFromSeqTable (MSA &msa, const vector< string > &seq_table, unsigned id_base=0)
template<typename MatchVector> void multFilter (MatchVector &matches, uint mult=2)
boolean my_validateLCB (MatchList &lcb)
 Test code to ensure that an individual LCB is truly collinear.

template<typename AbstractMatchImpl> std::ostream & operator<< (std::ostream &os, const UngappedLocalAlignment< AbstractMatchImpl > &ula)
std::ostream & operator<< (std::ostream &os, const RepeatMatch &mhe)
template<class GappedBaseImpl> std::ostream & operator<< (std::ostream &os, const GenericInterval< GappedBaseImpl > &cr)
 Writes this GenericInterval to the specified output stream (e.g.

std::ostream & operator<< (std::ostream &os, const GappedAlignment &ga)
std::istream & operator>> (std::istream &is, GappedAlignment &ga)
void outputBackbone (const std::vector< GappedAlignment > &backbone_regions, std::ostream &backbone_out)
 writes out a list of backbone regions

void outputBackbone (const vector< GappedAlignment > &backbone_regions, ostream &backbone_out)
char ** parseCommand (const string &cmd)
bool pipeExec (char **cmd_argv, const string &command, const string &input, string &output, string &error)
buffer_t * PopHead (buffer_list_t *list)
buffer_t * PopTail (buffer_list_t *list)
void prepareAlignmentTree (PhyloTree< AlignmentTreeNode > &alignment_tree)
template<class BoostMatType> void print2d_matrix (BoostMatType &mat, std::ostream &os)
void print_usage (const char *pname)
void printBbSeq (std::ostream &os, const bb_seqentry_t &bbseq)
void printMatch (mems::AbstractMatch *m, std::ostream &os)
void printProgress (uint prev_prog, uint cur_prog, ostream &os)
template<class MatchVector> void processNewMatch (uint seqI, MatchVector &new_matches, typename MatchVector::value_type &new_match)
void projectIntervalList (mems::IntervalList &iv_list, std::vector< uint > &projection, std::vector< std::vector< mems::MatchProjectionAdapter * > > &LCB_list, std::vector< mems::LCB > &projected_adjs)
void propagateInvert (PhyloTree< AlignmentTreeNode > &alignment_tree, node_id_t ancestor, size_t ans_siv)
 propagates an inversion of an ancestral SuperInterval to SuperIntervals in descendant nodes

void PushHead (buffer_list_t *list, buffer_t *item)
void PushTail (buffer_list_t *list, buffer_t *item)
void QBrute (record_t a[], int lo, int hi)
void QSort (record_t a[], int lo0, int hi0)
void RadixSort (sort_buf_t *sortbuffer)
void readBackboneColsFile (std::istream &bbcol_input, std::vector< std::pair< size_t, ULA > > &bb_list)
void readBackboneSeqFile (std::istream &bbseq_input, std::vector< bb_seqentry_t > &backbone)
void ReadBuffer (buffer_t *buffer, offset_t num_recs, iodevice_t *dev)
void ReadList (MatchList &mlist, std::istream &match_stream)
 Reads a GenericMatchList from an input stream. Sequence and SML file names are read into the seq_filename and sml_filename vectors, but the actual files are not opened.

void readSubstitutionMatrix (std::istream &is, score_t matrix[4][4])
unsigned int ReadTimer (dmtimer_t *t)
void RecSort (record_t a[], int nelems)
template<class FromType, class ToType, class MatchListType> void RemapSubsetMatchAddresses (std::map< FromType, ToType > &old_to_new_map, MatchListType &match_list)
 Use this to update linkage pointers after copying an entire set of Matches.

int removeFile (const char *filename, int verbose)
 cross-platform file deletion

buffer_t * RemoveItem (buffer_list_t *list, buffer_t *item)
void removeLargeGapsPP (GappedAlignment &gal, list< GappedAlignment * > &gal_list, vector< bool > &gap_iv, const vector< size_t > &group1, const vector< size_t > &group2)
template<class LcbVector> uint RemoveLCBandCoalesce (size_t lcbI, uint seq_count, LcbVector &adjacencies, std::vector< double > &scores, std::vector< std::pair< uint, uint > > &id_remaps, std::vector< uint > &impact_list)
 removes an LCB from an LCB list and coalesces surrounding LCBs.

void ReorganizeWorkingSet (working_set_t *ws, offset_t minrecs, offset_t maxrecs)
template<class Tree> void rerootTree (Tree &t, node_id_t new_root)
 place a root on the branch with endpoints root_left and root_right

void RestructureReadSMLBins (void)
void RestructureSMLBinsForWrite (void)
void scanFit (list< LabeledMem > &pair_list, list< LabeledMem >::iterator &list_iter, Match *new_match, uint sort_seq)
void scanLabels (set< uint > &no_match_labels, uint &start_label, boolean forward)
void scanLeft (int &left_recurseI, vector< LCB > &adjacencies, int min_weight, int seqI)
void scanRight (int &right_recurseI, vector< LCB > &adjacencies, int min_weight, int seqI)
void SearchLCBGaps (MatchList &new_matches, const std::vector< std::vector< int64 > > &iv_regions, MaskedMemHash &nway_mh)
void Shift64 (int amt, int *hi, int *lo)
 shifts a 64-bit value (in two 32 bit parts) either right or left.

void simpleFindBackbone (IntervalList &iv_list, uint backbone_size, uint max_gap_size, vector< GappedAlignment > &backbone_regions)
 Identifies stretches of alignment existing in all sequences that don't contain a gap larger than a particular size.

void simpleFindIslands (IntervalList &iv_list, uint island_size, vector< Island > &island_list)
void simpleFindIslands (IntervalList &iv_list, uint island_size, ostream &island_out)
 Identifies gaps in the alignment between pairs of sequences that are longer than a given number of base pairs.

template<class MatchVector> uint64 SimpleGetLCBCoverage (MatchVector &lcb)
template<typename MatchVector> void SingleCopyDistanceMatrix (MatchVector &iv_list, std::vector< genome::gnSequence * > &seq_table, NumericMatrix< double > &distance)
int SortBuffer (buffer_t *buf)
void SortHandleCompletions (void)
void SortingEnsureAllOperationsComplete ()
void SortingPhase (void)
void SortReading (void)
void SortSorting (void)
void SortUpdateIOState ()
void SortWriting (void)
void splitGappedAlignment (const GappedAlignment &ga, GappedAlignment &ga1, GappedAlignment &ga2, std::vector< size_t > &seqs1, std::vector< size_t > &seqs2)
dmtimer_t * StartTimer ()
void StopTimer (dmtimer_t *t)
void stripGapColumns (std::vector< std::string > &aln)
void stripGaps (std::string &str)
void TransformDistanceIdentity (NumericMatrix< double > &identity)
void Translate32 (uint32 *dest, const char *src, const unsigned len)
void translateToPairwiseGenomeHSS (const hss_array_t &hss_array, pairwise_genome_hss_t &hss_cols)
void transposeMatches (MatchList &mlist, uint seqI, const std::vector< int64 > &seq_regions)
void transposeMatches (MatchList &mlist, uint seqI, const vector< int64 > &seq_regions)
 Transposes the coordinates of matches in mlist to correspond to the original set of source sequence regions described by seq_regions, splitting matches if necessary.

void unalignIslands (IntervalList &iv_list, vector< CompactGappedAlignment<> * > &iv_orig_ptrs, vector< vector< ULA * > > &ula_list)
template<class LcbVector> void undoLcbRemoval (uint seq_count, LcbVector &adjs, std::vector< std::pair< uint, uint > > &id_remaps)
void UpdateDeviceIOExecuteState (working_set_t *ws, iodevice_t *dev)
void UpdateIOState (void)
void UpdateWSIOFinishedState (working_set_t *ws)
boolean validateLCB (MatchList &lcb)
 Test code to ensure that an individual LCB is truly collinear.

void validateRangeIntersections (vector< MatchList > &lcb_list)
const char * VFmt (const char *fmt, va_list args)
void writeBackboneColumns (ostream &bb_out, backbone_list_t &bb_list)
 Writes a backbone column file.

void writeBackboneSeqCoordinates (backbone_list_t &bb_list, IntervalList &iv_list, ostream &bb_out)
 Writes a backbone sequence coordinate file.

void WriteBuffer (buffer_t *buffer, offset_t num_recs, iodevice_t *dev)
void WriteList (const MatchList &mlist, std::ostream &match_stream)
 Writes a GenericMatchList to the designated output stream.

void WritePermutationCoordinates (IntervalList &perm_iv_list, std::string out_filename)
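
The brief description of simpleFindBackbone in the list above can be illustrated with a minimal sketch. This is not the library's implementation: it works on a plain column-wise alignment (one string per sequence, '-' for a gap) rather than on an IntervalList, and the name find_backbone_sketch, the half-open column ranges it returns, and the reading of backbone_size as a minimum region length in columns are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration, not the libMems code. Returns half-open column
// ranges [first, second) that start and end on a gap-free column, contain no
// run of gapped columns longer than max_gap_size, and span at least
// backbone_size columns.
std::vector<std::pair<std::size_t, std::size_t>>
find_backbone_sketch(const std::vector<std::string>& aln,
                     std::size_t backbone_size, std::size_t max_gap_size) {
    std::vector<std::pair<std::size_t, std::size_t>> regions;
    if (aln.empty()) return regions;
    const std::size_t cols = aln[0].size();
    for (const std::string& row : aln) assert(row.size() == cols);

    const std::size_t none = cols;  // sentinel: no region is currently open
    std::size_t start = none;       // first gap-free column of the open region
    std::size_t last_full = 0;      // most recent gap-free column in the region
    auto flush_region = [&]() {
        if (start != none && last_full + 1 - start >= backbone_size)
            regions.emplace_back(start, last_full + 1);
        start = none;
    };
    for (std::size_t c = 0; c < cols; ++c) {
        bool gapped = false;
        for (const std::string& row : aln) gapped = gapped || row[c] == '-';
        if (!gapped) {
            if (start == none) start = c;
            last_full = c;
        } else if (start != none && c - last_full > max_gap_size) {
            flush_region();         // the gap run grew too long: end the region
        }
    }
    flush_region();
    return regions;
}
```

Trailing gapped columns are trimmed from each region, so a reported range always begins and ends where every sequence is aligned.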

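Shift64 is described above as shifting a 64-bit value held in two 32-bit parts. A minimal sketch of that idea follows. The sign convention (positive amt shifts left, negative shifts right), the use of unsigned 32-bit halves instead of the int pointers in the declared signature, and the name shift64_sketch are assumptions; the library's own behaviour may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the libMems code.
// Assumption: amt > 0 shifts left, amt < 0 shifts right, and |amt| < 64.
void shift64_sketch(int amt, std::uint32_t* hi, std::uint32_t* lo) {
    assert(hi != nullptr && lo != nullptr);
    assert(amt > -64 && amt < 64);
    // Recombine the halves, shift once, then split the result again.
    std::uint64_t v = (static_cast<std::uint64_t>(*hi) << 32) | *lo;
    if (amt >= 0) {
        v <<= amt;
    } else {
        v >>= -amt;
    }
    *hi = static_cast<std::uint32_t>(v >> 32);
    *lo = static_cast<std::uint32_t>(v & 0xFFFFFFFFu);
}
```

Recombining into a single 64-bit integer avoids hand-written carry logic between the halves; code written for compilers without a 64-bit type would have to move the carried bits explicitly.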
Variables

SearchCacheComparator cache_comparator
char charmap [128]
char colmap [5][5]
int dbg_count = 0
bool debug_aligner = false
 controls whether copious debugging tests and output get written to screen

bool debug_cga = false
bool debug_interval = false
bool debug_me = false
bool debug_muscle = false
bool debug_shite = false
bool debugging_cltm = false
const uint32 DEFAULT_ENUMERATION_TOLERANCE = 1
const score_t default_gap_extend = -30
const score_t default_gap_open = -400
const mems::score_t DEFAULT_ISLAND_SCORE_THRESHOLD = 2727
const uint32 DEFAULT_MEM_TABLE_SIZE = 40000
const gnSeqI default_min_r_gap_size = 200
const uint32 DEFAULT_REPEAT_TOLERANCE = 0
uint8 * DNA_TABLE
int failure_count = 0
const score_t hoxd_matrix [4][4]
const unsigned INDEX_INTERVAL = 512
const score_t INV_SCORE = (std::numeric_limits<score_t>::max)()
const score_t INVALID_SCORE = (std::numeric_limits<score_t>::max)()
size_t lcb_hangover = 300
const uint LCB_UNASSIGNED = (std::numeric_limits<uint>::max)()
 indicates an LCB identifier hasn't been assigned or is unknown

const int LEFT_NEIGHBOR = -1
int mask_length = 31
int mask_weight = 31
double max_density = .9
size_t max_gap_length = 3000
double max_window_size = 20000
const int MEGA_ALIGN_COLUMNS = 60
const uint MIN_ANCHOR_LENGTH = 9
double min_density = .5
double min_window_size = 200
const uint NO_ADJACENCY = (std::numeric_limits<uint>::max)()
const gnSeqI NO_MATCH = 0
int NumBins
bool penalize_repeats = false
const double POOL_GROWTH_RATE = 1.6
 When more space is needed to store a datatype, the memory pool will grow by this factor.

boolean print_sp = false
const unsigned int PROGRESS_GRANULARITY = 100
bool progress_msgs = false
const int RIGHT_NEIGHBOR = 1
mask_t seed_mask = 0x7FFFFFFF
uint seq_compare_start
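
POOL_GROWTH_RATE above is documented as the factor by which a memory pool grows when more space is needed. The sketch below shows one way such geometric growth can be computed. The function name grow_capacity_sketch and the rule of always growing by at least one slot are assumptions for illustration, not the library's allocator.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not the libMems allocator: repeatedly grow `capacity`
// by `growth_rate` until it can hold `needed` items.
std::size_t grow_capacity_sketch(std::size_t capacity, std::size_t needed,
                                 double growth_rate = 1.6) {
    assert(growth_rate > 1.0);
    if (capacity == 0) capacity = 1;
    while (capacity < needed) {
        const std::size_t next =
            static_cast<std::size_t>(static_cast<double>(capacity) * growth_rate);
        // Truncation can leave small capacities unchanged, so force progress.
        capacity = next > capacity ? next : capacity + 1;
    }
    return capacity;
}
```

Growing by a constant factor keeps the number of reallocations logarithmic in the final size, which is the usual reason for choosing a multiplicative rather than additive growth rule.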


Typedef Documentation

typedef boost::tuple<CompactGappedAlignment<>*, vector< bitset_t >*, AbstractMatch* > mems::_sort_tracker_type
 

Definition at line 1485 of file ProgressiveAligner.cpp.

Referenced by mems::CgaBsComp< CompType >::operator()().

typedef GenericMatchSeqManipulator mems::AbstractMatchSeqManipulator
 

Definition at line 543 of file ProgressiveAligner.h.

Referenced by mems::ProgressiveAligner::translateGappedCoordinates().

typedef struct mems::_aFILE mems::aFILE
 

typedef struct mems::_aIORec mems::aIORec
 

typedef std::vector< std::vector< ULA* > > mems::backbone_list_t
 

Definition at line 37 of file Backbone.h.

Referenced by createBackboneList(), detectAndApplyBackbone(), writeBackboneColumns(), and writeBackboneSeqCoordinates().

typedef struct mems::bb_entry_s mems::bb_entry_t
 

typedef std::vector< std::pair< int64, int64 > > mems::bb_seqentry_t
 

Definition at line 86 of file Backbone.h.

Referenced by printBbSeq(), and readBackboneSeqFile().

typedef struct mems::bin_s mems::bin_t
 

typedef boost::dynamic_bitset mems::bitset_t
 

Definition at line 40 of file AbstractMatch.h.

Referenced by mems::ProgressiveAligner::constructLcbTrackingMatches(), mems::ProgressiveAligner::ConstructSuperIntervalFromMSA(), mems::CompactGappedAlignment< BaseType >::copyRange(), mems::CompactGappedAlignment< BaseType >::create_bitcount(), findMidpoint(), mems::UngappedLocalAlignment< AbstractMatchImpl >::GetAlignment(), mems::GenericInterval< GappedBaseImpl >::GetAlignment(), mems::GappedAlignment::GetAlignment(), mems::CompactGappedAlignment< BaseType >::Invert(), mems::CompactGappedAlignment< BaseType >::SeqPosToColumn(), mems::CompactGappedAlignment< BaseType >::SetAlignment(), and mems::CompactGappedAlignment< BaseType >::translate().

typedef struct mems::buffer_list_s mems::buffer_list_t
 

typedef struct buffer_s mems::buffer_t
 

Definition at line 580 of file FileSML.cpp.

typedef DenseAbstractMatch<128> mems::DenseAbstractMatch128
 

Definition at line 165 of file DenseAbstractMatch.h.

typedef DenseAbstractMatch<16> mems::DenseAbstractMatch16
 

Definition at line 162 of file DenseAbstractMatch.h.

typedef DenseAbstractMatch<2> mems::DenseAbstractMatch2
 

Definition at line 159 of file DenseAbstractMatch.h.

typedef DenseAbstractMatch<32> mems::DenseAbstractMatch32
 

Definition at line 163 of file DenseAbstractMatch.h.

typedef DenseAbstractMatch<4> mems::DenseAbstractMatch4
 

Definition at line 160 of file DenseAbstractMatch.h.

typedef DenseAbstractMatch<64> mems::DenseAbstractMatch64
 

Definition at line 164 of file DenseAbstractMatch.h.

typedef DenseAbstractMatch<8> mems::DenseAbstractMatch8
 

Definition at line 161 of file DenseAbstractMatch.h.

typedef struct mems::device_s mems::device_t
 

typedef struct dmtimer_s mems::dmtimer_t
 

Definition at line 222 of file FileSML.cpp.

typedef boost::multi_array< hss_list_t, 3 > mems::hss_array_t
 

Definition at line 62 of file Islands.h.

Referenced by detectAndApplyBackbone(), findHssHomologyHMM(), findHssRandomWalk(), findHssRandomWalk_v2(), findHssRandomWalkCga(), findIslandsRandomWalk(), findIslandsRandomWalkCga(), HssArrayToCga(), HssColsToIslandCols(), makeAllPairwiseGenomeHSS(), and translateToPairwiseGenomeHSS().

typedef std::vector< HssCols > mems::hss_list_t
 

Definition at line 61 of file Islands.h.

Referenced by ComplementHss(), findHssExcursions(), findHssHomologyHMM(), findHssRandomWalk(), findHssRandomWalk_v2(), findHssRandomWalkScoreVector(), findIslandsRandomWalk(), findMscFromExcursions(), findRightEndpoint(), HssArrayToCga(), HssColsToIslandCols(), and translateToPairwiseGenomeHSS().

typedef std::list<idmer> mems::IdmerList
 

Definition at line 35 of file MatchFinder.h.

Referenced by mems::RepeatHash::EnumerateMatches(), mems::PairwiseMatchFinder::EnumerateMatches(), mems::MemHash::EnumerateMatches(), mems::MatchFinder::EnumerateMatches(), mems::RepeatHash::HashMatch(), mems::MemHash::HashMatch(), and mems::MatchFinder::SearchRange().

typedef GenericInterval mems::Interval
 

Definition at line 185 of file Interval.h.

Referenced by addGuy(), addUnalignedIntervals(), addUnalignedIntervals_v2(), addUnalignedRegions(), mems::Aligner::align(), mems::Aligner::AlignLCB(), AlignLCBInParallel(), mems::ProgressiveAligner::alignProfileToProfile(), computeLCBAdjacencies_v2(), mems::ProgressiveAligner::ConstructSuperIntervalFromMSA(), mems::ProgressiveAligner::getAlignment(), makeAlignmentTree(), mergeUnalignedIntervals(), propagateInvert(), mems::GenericIntervalList< MatchType >::ReadList(), mems::GenericIntervalList< MatchType >::ReadStandardAlignment(), mems::GenericIntervalList< MatchType >::ReadStandardAlignmentCompact(), mems::Aligner::RecursiveAnchorSearch(), simpleFindBackbone(), simpleFindIslands(), mems::SuperInterval::SuperInterval(), std::swap(), and unalignIslands().

typedef GenericIntervalList mems::IntervalList
 

Definition at line 98 of file IntervalList.h.

Referenced by addUnalignedIntervals(), mems::Aligner::align(), checkForAllGapColumns(), collapseCollinear(), computeLCBAdjacencies_v2(), computeLCBAdjacencies_v3(), createBackboneList(), detectAndApplyBackbone(), findIslandsBetweenLCBs(), IdentityMatrix(), makeAllPairwiseGenomeHSS(), projectIntervalList(), mems::Aligner::RecursiveAnchorSearch(), simpleFindBackbone(), simpleFindIslands(), unalignIslands(), writeBackboneSeqCoordinates(), and WritePermutationCoordinates().

typedef struct mems::iodevice_s mems::iodevice_t
 

typedef HssCols mems::IslandCols
 

Definition at line 64 of file Islands.h.

Referenced by ComplementHss().

typedef pair< size_t, Interval* > mems::iv_tracker_t
 

Definition at line 62 of file Backbone.cpp.

Referenced by mems::IvTrackerComp::operator()().

typedef unsigned long long mems::mask_t
 

Definition at line 1262 of file FileSML.cpp.

typedef UngappedLocalAlignment< HybridAbstractMatch<> > mems::Match
 

Definition at line 26 of file Match.h.

Referenced by AddGapMatches(), addUnalignedIntervals(), addUnalignedIntervals_v2(), mems::MuscleInterface::Align(), mems::ClustalInterface::Align(), mems::Aligner::AlignLCB(), checkCollinearity(), EliminateOverlaps(), mems::ProgressiveAligner::extractAlignment(), mems::MaskedMemHash::FindSubsets(), mems::ProgressiveAligner::getAncestralMatches(), getInterveningCoordinates(), mems::ProgressiveAligner::getPairwiseMatches(), mems::ProgressiveAligner::getRepresentativeAncestralMatches(), makeAlignmentTree(), mems::MatchHashEntry::MatchHashEntry(), operator<<(), mems::ProgressiveAligner::pairwiseAnchorSearch(), mems::RepeatMatchList::ReadList(), ReadList(), mems::GenericIntervalList< MatchType >::ReadList(), mems::ProgressiveAligner::recurseOnPairs(), mems::Aligner::Recursion(), scanFit(), mems::Aligner::SearchWithinLCB(), mems::GenericInterval< GappedBaseImpl >::SetMatches(), transposeMatches(), mems::RepeatMatchList::WriteList(), WriteList(), and mems::GenericIntervalList< MatchType >::WriteList().

typedef void * mems::MatchID_t
 

Definition at line 142 of file MatchList.h.

Referenced by mems::RepeatMatchList::ReadList(), ReadList(), mems::RepeatMatchList::WriteList(), and WriteList().

typedef GenericMatchList< Match* > mems::MatchList
 

Definition at line 116 of file MatchList.h.

Referenced by AaronsLCB(), mems::Aligner::align(), mems::Aligner::AlignLCB(), AlignLCBInParallel(), mems::ProgressiveAligner::alignPP(), mems::ProgressiveAligner::alignProfileToProfile(), ComputeLCBs(), mems::Aligner::consistencyCheck(), mems::ProgressiveAligner::CreatePairwiseBPDistance(), DistanceMatrix(), EliminateOverlaps(), filterMatches(), mems::MemHash::FindMatches(), mems::MemHash::FindMatchesFromPosition(), GetLCBCoverage(), mems::MemHash::GetMatchList(), my_validateLCB(), mems::LCBLeftEndComp::operator()(), mems::ProgressiveAligner::pairwiseAnchorSearch(), ReadList(), mems::ProgressiveAligner::recurseOnPairs(), mems::Aligner::Recursion(), mems::Aligner::RecursiveAnchorSearch(), mems::RepeatMatchList::RepeatMatchList(), SearchLCBGaps(), mems::Aligner::SearchWithinLCB(), transposeMatches(), validateLCB(), and WriteList().

typedef vector< size_t > mems::neighbor_t
 

Definition at line 77 of file Backbone.cpp.

Referenced by collapseCollinear(), and getNeighbor().

typedef unsigned long long mems::offset_t
 

Definition at line 290 of file FileSML.cpp.

typedef boost::multi_array< vector< pair< size_t, size_t > >, 3 > mems::pairwise_genome_hss_t
 

Definition at line 273 of file Backbone.cpp.

Referenced by detectAndApplyBackbone(), makeAllPairwiseGenomeHSS(), mergePairwiseHomologyPredictions(), and translateToPairwiseGenomeHSS().

typedef boost::multi_array< std::vector< TrackingLCB< TrackingMatch* > >, 2 > mems::PairwiseLCBMatrix
 

Definition at line 74 of file GreedyBreakpointElimination.h.

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), and mems::EvenFasterSumOfPairsBreakpointScorer::EvenFasterSumOfPairsBreakpointScorer().

typedef unsigned mems::position_t
 

Definition at line 1261 of file FileSML.cpp.

typedef struct mems::record_s mems::record_t
 

typedef int16 mems::sarID_t
 

Definition at line 1272 of file FileSML.cpp.

Referenced by mems::SortedMerList::GetID(), mems::MatchFinder::SearchRange(), mems::SortedMerList::SetID(), and mems::FileSML::SetID().

typedef int mems::score_t
 

Definition at line 22 of file SubstitutionMatrix.h.

Referenced by computeConsensusScore(), computeGapScores(), computeMatchScores(), computeSPScore(), findHssExcursions(), findHssRandomWalk(), findHssRandomWalk_v2(), findHssRandomWalkCga(), findHssRandomWalkScoreVector(), findIslandsRandomWalk(), findIslandsRandomWalkCga(), findMscFromExcursions(), findRightEndpoint(), GetPairwiseAnchorScore(), mems::PairwiseScoringScheme::PairwiseScoringScheme(), readSubstitutionMatrix(), and mems::PairwiseScoringScheme::setMatrix().

typedef std::pair< mems::Match*, mems::Match* > mems::search_cache_t
 

a cache type to remember which intervals have already been searched

Definition at line 130 of file Aligner.h.

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), cachecomp(), mems::SearchCacheComparator::operator()(), mems::ProgressiveAligner::recurseOnPairs(), and mems::Aligner::SearchWithinLCB().

typedef struct mems::seqbuf_s mems::seqbuf_t
 

typedef struct mems::sml_s mems::sml_t
 

typedef struct mems::SMLHeader_s mems::SMLHeader_t
 

typedef uint32 mems::smlSeqI_t
 

Definition at line 45 of file SortedMerList.h.

Referenced by mems::FileSML::base().

typedef struct mems::sort_buf_s mems::sort_buf_t
 

typedef LcbTrackingMatch< mems::AbstractMatch* > mems::TrackingMatch
 

Definition at line 33 of file GreedyBreakpointElimination.h.

Referenced by mems::ProgressiveAligner::constructLcbTrackingMatches(), mems::ProgressiveAligner::pairwiseScoreTrackingMatches(), and mems::EvenFasterSumOfPairsBreakpointScorer::validate().

typedef unsigned mems::uint32
 

Definition at line 1235 of file FileSML.cpp.

Referenced by mems::MemHash::AddHashEntry(), mems::MatchFinder::AmbiguityTolerance(), mems::SortedMerList::CalculateMaxMerSize(), mems::DNAFileSML::CalculateMaxMerSize(), mems::MemHash::Clear(), mems::DNAFileSML::DNAFileSML(), mems::DNAMemorySML::DNAMemorySML(), mems::MatchFinder::EnumerateMatches(), mems::MatchFinder::ExtendMatch(), mems::SortedMerList::FillDnaSML(), mems::SortedMerList::FillSML(), mems::MemHash::FindMatches(), mems::MemHash::FindMatchesFromPosition(), mems::MatchFinder::FindMatchSeeds(), mems::FileSML::FormatVersion(), mems::DNAFileSML::FormatVersion(), mems::RepeatMatch::FromSeq(), mems::MatchFinder::GetBreakpoint(), mems::SortedMerList::GetBSequence(), mems::MemHash::GetEnumerationTolerance(), mems::MemHash::GetMatchList(), mems::SortedMerList::GetMer(), mems::SortedMerList::GetMerMaskSize(), mems::MemHash::GetRepeatTolerance(), mems::RepeatHash::GetSar(), mems::MatchFinder::GetSar(), mems::RepeatHash::HashMatch(), mems::MemHash::LoadFile(), mems::FileSML::LoadFile(), mems::MemHash::MemCollisionCount(), mems::MemHash::MemCount(), mems::MemHash::MemHash(), mems::FileSML::MemoryMinimum(), mems::MemorySML::MemorySML(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Move(), mems::MatchFinder::Multiplicity(), mems::MatchLeftEndComparator::operator()(), operator<<(), mems::SortedMerList::operator=(), mems::MemHash::operator=(), mems::MemHash::PrintDistribution(), mems::MemHash::quadratic_li(), mems::RepeatMatchList::ReadList(), ReadList(), mems::SortedMerList::RevCompMer(), mems::MatchFinder::SearchRange(), mems::RepeatMatch::SeqId(), mems::MatchFinder::SetAmbiguityTolerance(), mems::MemHash::SetDirection(), mems::MemHash::SetEnumerationTolerance(), mems::SortedMerList::SetMerMaskSize(), mems::MemHash::SetRepeatTolerance(), mems::SortedMerList::SetSequence(), mems::MemHash::SetTableSize(), mems::MatchHashEntry::start_lessthan_ptr(), mems::MemHash::TableSize(), mems::SortedMerList::translate(), mems::SortedMerList::translate32(), 
mems::SortedMerList::UniqueMerCount(), and mems::MemHash::WriteFile().

typedef unsigned long long mems::uint64
 

Definition at line 1236 of file FileSML.cpp.

Referenced by mems::Aligner::align(), ComputeLCBs(), mems::SeedOccurrenceList::construct(), countGaps(), mems::SortedMerList::Create(), mems::MemorySML::Create(), mems::GenericMatchList< MatchPtrType >::CreateMemorySMLs(), DistanceMatrix(), mems::MatchFinder::ExtendMatch(), mems::SortedMerList::FillDnaSML(), mems::SortedMerList::FillSML(), filterMatches(), mems::SortedMerList::FindMer(), mems::MatchFinder::FindSubsets(), mems::MatchFinder::GetBreakpoint(), mems::SortedMerList::GetBSequence(), mems::SortedMerList::GetDnaMer(), mems::SortedMerList::GetDnaSeedMer(), GetLCBCoverage(), mems::SortedMerList::GetMer(), mems::DNAMemorySML::GetMer(), mems::DNAFileSML::GetMer(), mems::SortedMerList::GetMerMask(), mems::DNAFileSML::GetNeededMemory(), mems::SortedMerList::GetSeedMask(), mems::SortedMerList::GetSeedMer(), mems::DNAMemorySML::GetSeedMer(), mems::DNAFileSML::GetSeedMer(), mems::MaskedMemHash::HashMatch(), mems::FileSML::LoadFile(), mems::RepeatMatchList::LoadSMLs(), mems::GenericMatchList< MatchPtrType >::LoadSMLs(), mems::FileSML::MemoryMinimum(), mems::ProgressiveAligner::pairwiseAnchorSearch(), mems::Aligner::Recursion(), mems::SortedMerList::RevCompMer(), SearchLCBGaps(), mems::MatchFinder::SearchRange(), mems::SortedMerList::Seed(), mems::MaskedMemHash::SetMask(), and SimpleGetLCBCoverage().

typedef unsigned char mems::uint8
 

Definition at line 1234 of file FileSML.cpp.

Referenced by mems::SortedMerList::BasicDNATable(), computeGC(), computeMatchScores(), mems::SortedMerList::CreateBasicDNATable(), mems::SortedMerList::CreateProteinTable(), mems::DNAFileSML::DNAFileSML(), mems::DNAMemorySML::DNAMemorySML(), mems::SortedMerList::FillSML(), mems::SortedMerList::Find(), mems::SortedMerList::FindAll(), mems::MemorySML::MemorySML(), mems::SortedMerList::ProteinTable(), mems::SortedMerList::translate(), and mems::SortedMerList::translate32().

typedef mems::UngappedLocalAlignment< mems::HybridAbstractMatch<> > mems::ULA
 

Definition at line 36 of file Backbone.h.

Referenced by createBackboneList(), mergePairwiseHomologyPredictions(), and readBackboneColsFile().

typedef struct mems::working_set_s mems::working_set_t
 


Enumeration Type Documentation

anonymous enum
 

Enumeration values:
A_READ 
A_WRITE 

Definition at line 353 of file FileSML.cpp.

anonymous enum
 

Enumeration values:
DEV_FREE 
DEV_BUSY 

Definition at line 582 of file FileSML.cpp.

anonymous enum
 

Enumeration values:
OP_PENDING 
OP_FINISHED 
OP_NONE 

Definition at line 618 of file FileSML.cpp.

anonymous enum
 

Enumeration values:
CalculateHistogram 
CopyPointers 
QsortPointers 
CopyData 

Definition at line 1014 of file FileSML.cpp.

enum mems::dm_errors
 

Enumeration values:
SUCCESS 
TOO_FEW_BINS 
TOO_MANY_BINS 
INPUT_NOT_OPENED 
INVALID_WS_SIZE 
SEQUENCE_TOO_SHORT 
OUTPUT_NOT_OPENED 
INVALID_NUMRECS 
NO_FREE_BUFFERS 
BIN_NOT_OPENED 

Definition at line 1374 of file FileSML.cpp.


Function Documentation

void mems::AaronsLCB MatchList &  mlist,
std::set< uint > &  breakpoints
 

Function to determine the breakpoints in a set of matches.

Sorts the matches in mlist and returns the indices of breakpoints. This function attempts (sometimes unsuccessfully) to determine subset LCBs. If a set of matches containing subset LCBs has been passed to it, the resulting breakpoint set may be incorrect. You have been warned.

Parameters:
mlist A list of matches to search for LCBs.
breakpoints The indices of matches in the sorted match list that are at LCB boundaries

Definition at line 362 of file Aligner.cpp.

References checkCollinearity(), mems::PlacementMatch::iter, mems::LabeledMem::label, MatchList, mems::PlacementMatch::mem, mems::LabeledMem::mem, scanFit(), scanLabels(), and uint.

Referenced by mems::Aligner::consistencyCheck(), and mems::Aligner::RecursiveAnchorSearch().
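
As a rough illustration of what "indices of matches at LCB boundaries" means, the sketch below handles a simplified two-genome case. It is not the AaronsLCB algorithm: the signed-label encoding, the function name breakpoints_sketch, and the choice to report the index of the last match of each block are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical two-genome simplification, not the libMems code. labels[i] is
// the signed, 1-based rank in genome 2 of the i-th match when matches are
// ordered by genome 1; a negative sign marks a reverse-complement match.
// Index i is reported when match i is the last match of a collinear block.
std::set<std::size_t> breakpoints_sketch(const std::vector<int>& labels) {
    std::set<std::size_t> breakpoints;
    for (std::size_t i = 0; i < labels.size(); ++i) {
        assert(labels[i] != 0);  // ranks are 1-based so the sign is meaningful
        const bool last = (i + 1 == labels.size());
        // Neighbours stay collinear only if the signed rank advances by one.
        if (last || labels[i + 1] != labels[i] + 1) breakpoints.insert(i);
    }
    return breakpoints;
}
```

With labels 1, 2, -4, -3, 5 the blocks are {1, 2}, {-4, -3} and {5}, so the reported indices are 1, 3 and 4.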

int aClose aFILE *  file  ) 
 

template<typename ListType, typename Iter>
void AddGapMatches ListType &  the_list,
const Iter &  first,
const Iter &  last,
uint  seqI,
int64  left_end,
int64  right_end,
AbstractMatch::orientation  seq_orient,
uint  seq_count
 

Definition at line 348 of file Interval.h.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), insert(), Match, mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), and uint.

Referenced by addUnalignedRegions(), and mems::GenericInterval< GappedBaseImpl >::addUnalignedRegions().

void addGuy uint  seqI,
AbstractMatch::orientation  orient,
std::vector< AbstractMatch * > &  new_ivs,
vector< Interval * > &  new_list
 

Definition at line 2584 of file ProgressiveAligner.cpp.

References mems::GenericInterval< GappedBaseImpl >::Copy(), Interval, and mems::GenericInterval< GappedBaseImpl >::SetMatches().

Referenced by mergeUnalignedIntervals().

void AddTo64 unsigned int  amt,
unsigned int *  hi,
unsigned int *  lo
 

template<class AbstractMatchType>
void AddToMatchIdentityMatrix const AbstractMatchType &  amt,
const std::vector< genome::gnSequence * > &  seq_table,
NumericMatrix< double > &  identity
 

Definition at line 155 of file DistanceMatrix.h.

References GetAlignment(), and uint.

Referenced by BackboneIdentityMatrix(), and IdentityMatrix().

void mems::addUnalignedIntervals IntervalList &  iv_list,
std::set< uint >  seq_set = std::set< uint >(),
std::vector< gnSeqI >  seq_lengths = std::vector<gnSeqI>()
 

Find regions in each sequence that do not belong to any LCB and add each of them to its own Interval (LCB) in the IntervalList.

Definition at line 237 of file Islands.cpp.

References computeLCBAdjacencies_v2(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Free(), getGapBounds(), Interval, IntervalList, Match, mems::GenericIntervalList< MatchType >::seq_table, mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), and uint.

Referenced by mems::Aligner::align(), and findIslandsBetweenLCBs().
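
The core step, finding the parts of one sequence that no LCB covers, can be sketched as follows. This is a single-sequence illustration under stated assumptions, not the library routine: coordinates are taken to be 1-based and inclusive, and the Range alias and uncovered_ranges_sketch name are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<std::int64_t, std::int64_t>;  // 1-based, inclusive

// Hypothetical single-sequence version, not the libMems code: list the ranges
// of [1, seq_length] that are not covered by any of the given LCB ranges.
std::vector<Range> uncovered_ranges_sketch(std::vector<Range> lcbs,
                                           std::int64_t seq_length) {
    std::sort(lcbs.begin(), lcbs.end());
    std::vector<Range> gaps;
    std::int64_t next_free = 1;  // first position not yet known to be covered
    for (const Range& r : lcbs) {
        assert(1 <= r.first && r.first <= r.second && r.second <= seq_length);
        if (r.first > next_free) gaps.emplace_back(next_free, r.first - 1);
        next_free = std::max(next_free, r.second + 1);
    }
    if (next_free <= seq_length) gaps.emplace_back(next_free, seq_length);
    return gaps;
}
```

Each returned range would then become its own single-sequence interval; that wrapping step depends on the Interval and Match classes and is not shown here.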

template<class MatchVectorType>
void addUnalignedIntervals_v2 MatchVectorType &  iv_list,
std::set< uint >  seq_set,
std::vector< gnSeqI >  seq_lengths
 

Definition at line 434 of file ProgressiveAligner.h.

References computeLCBAdjacencies_v3(), mems::GenericInterval< GappedBaseImpl >::Copy(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), getGapBounds(), Interval, Match, mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), mems::GenericInterval< GappedBaseImpl >::SetMatches(), and uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

template<class IntervalListType>
void addUnalignedRegions IntervalListType &  iv_list  ) 
 

Definition at line 842 of file Islands.h.

References AddGapMatches(), mems::GenericInterval< GappedBaseImpl >::Free(), and Interval.

Referenced by collapseCollinear(), and detectAndApplyBackbone().

int aFileBusy aFILE *  file  ) 
 

void aFlush aFILE *  file  ) 
 

template<typename MatchVector>
void alignedNtCountFilter MatchVector &  matches,
uint  length
 

Definition at line 1516 of file ProgressiveAligner.cpp.

References uint.

void AlignLCBInParallel bool  collinear_genomes,
mems::GappedAligner gal,
MatchList &  mlist,
Interval &  iv,
AlnProgressTracker &  apt
 

Definition at line 1293 of file Aligner.cpp.

References mems::GappedAligner::Align(), mems::GappedAlignment::Copy(), mems::AlnProgressTracker::cur_leftend, Interval, MatchList, mems::AlnProgressTracker::prev_progress, printProgress(), mems::GenericMatchList< MatchPtrType >::seq_table, mems::GenericInterval< GappedBaseImpl >::SetMatches(), mems::AlnProgressTracker::total_len, and uint.

Referenced by mems::Aligner::align().

buffer_t* AllocateFree void   )  [static]
 

aFILE* aOpen const char *  path,
int  mode
 

int aOperationComplete aFILE *  file,
int  operation
 

template<class T, class Manipulator>
void applyAncestralBreakpoints const vector< SuperInterval > &  siv_list,
vector< T > &  ord,
uint  seqI,
Manipulator &  m
 

Definition at line 250 of file ProgressiveAligner.cpp.

References applyBreakpoints(), and uint.

Referenced by mems::ProgressiveAligner::recursiveApplyAncestralBreakpoints(), and mems::ProgressiveAligner::translateGappedCoordinates().

template<class T, class Maniplator>
void applyBreakpoints std::vector< gnSeqI > &  bp_list,
std::vector< T > &  iv_list,
Maniplator &  manip
 

Definition at line 561 of file ProgressiveAligner.h.

References std::swap().

Referenced by applyAncestralBreakpoints(), makeAllPairwiseGenomeHSS(), mergePairwiseHomologyPredictions(), and mems::ProgressiveAligner::propagateDescendantBreakpoints().

int aRead void *  buffer,
offset_t  size,
offset_t  count,
aFILE *  file,
offset_t  pos
 

unsigned long long aStatFileSize const char *  path  ) 
 

unsigned long aStatSize const char *  path  ) 
 

void aUpdateOperations aFILE *  file  ) 
 

void aWaitComplete aFILE *  file,
int  operation
 

void aWaitNotBusy aFILE *  file  ) 
 

int aWrite void *  buffer,
offset_t  size,
offset_t  count,
aFILE *  file,
offset_t  pos
 

template<class AbstractMatchVectorType>
void BackboneIdentityMatrix const AbstractMatchVectorType &  matches,
const std::vector< genome::gnSequence * > &  seq_table,
NumericMatrix< double > &  identity
 

Definition at line 73 of file DistanceMatrix.h.

References AddToMatchIdentityMatrix(), GetAlignment(), Matrix< T >::init(), and uint.

void BinningPhase void   ) 
 

int mems::bmer_compare const void *  a_v,
const void *  m_v
[inline]
 

Definition at line 322 of file SortedMerList.h.

bool mems::bmer_id_lessthan const bmer &  a_v,
const bmer &  m_v
 

bool mems::bmer_lessthan const bmer &  a_v,
const bmer &  m_v
[inline]
 

Definition at line 317 of file SortedMerList.h.

Referenced by mems::MemorySML::Create().

int cachecomp const void *  e1,
const void *  e2
 

Definition at line 1988 of file ProgressiveAligner.cpp.

References cache_comparator, and search_cache_t.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

offset_t CalculateDataReadSize buffer_t *  b  ) 
 

int CalculateSortWriteSize int  sortI  ) 
 

boolean checkCollinearity Match *  m1,
Match *  m2
 

Definition at line 267 of file Aligner.cpp.

References Match, and uint.

Referenced by AaronsLCB().

bool checkConsistent const AbstractMatch *  a,
const AbstractMatch *  b
[inline]
 

Definition at line 274 of file ProgressiveAligner.h.

References mems::AbstractMatch::LeftEnd(), mems::AbstractMatch::SeqCount(), mems::AbstractMatch::Start(), and uint.

Referenced by EliminateOverlaps_v2().

void mems::checkForAllGapColumns IntervalList &  iv_list  ) 
 

sanity checks for alignment columns that contain only gaps

Definition at line 250 of file Backbone.cpp.

References GetAlignment(), IntervalList, and mems::GenericIntervalList< MatchType >::seq_table.

Referenced by detectAndApplyBackbone().
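
A minimal sketch of this kind of sanity check, assuming the alignment is available as one equal-length string per sequence with '-' marking a gap. The name all_gap_columns_sketch and the choice to return offending column indices are assumptions; the library function operates on an IntervalList and returns void.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in, not the libMems code: return the indices of
// alignment columns in which every sequence has a gap character.
std::vector<std::size_t>
all_gap_columns_sketch(const std::vector<std::string>& aln) {
    std::vector<std::size_t> bad;
    if (aln.empty()) return bad;
    const std::size_t cols = aln[0].size();
    for (const std::string& row : aln) assert(row.size() == cols);
    for (std::size_t c = 0; c < cols; ++c) {
        bool only_gaps = true;
        for (const std::string& row : aln) {
            if (row[c] != '-') { only_gaps = false; break; }
        }
        if (only_gaps) bad.push_back(c);
    }
    return bad;
}
```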

void mems::chooseNextAlignmentPair PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t node1,
node_id_t node2,
node_id_t ancestor
 

Select the next pair of nodes to align. The chosen pair will either be unaligned extant sequences or unaligned ancestral sequences whose descendants have all been aligned.

The chosen pair has the shortest path on the tree. When no sequences remain to be aligned, returns node1 == node2.

Definition at line 3053 of file ProgressiveAligner.cpp.

References TreeNode::children, mems::AlignmentTreeNode::children_aligned, TreeNode::distance, node_id_t, mems::AlignmentTreeNode::ordering, TreeNode::parents, mems::AlignmentTreeNode::parents_aligned, mems::AlignmentTreeNode::sequence, PhyloTree< AlignmentTreeNode >::size(), and uint.

Referenced by mems::ProgressiveAligner::getAlignment().

void mems::collapseCollinear IntervalList &  iv_list  ) 
 

collapse Intervals that are trivially collinear with each other

Definition at line 88 of file Backbone.cpp.

References addUnalignedRegions(), getNeighbor(), IntervalList, neighbor_t, mems::GenericIntervalList< MatchType >::seq_filename, mems::GenericIntervalList< MatchType >::seq_table, and uint.

Referenced by detectAndApplyBackbone(), and unalignIslands().

int comp_keys record_t  a,
record_t  b
 

int CompareKeys const record_t *  r1,
const record_t *  r2
 

int CompareKeys_qsort_wrapper const void *  r1,
const void *  r2
 

void ComplementHss const size_t  alignment_length,
hss_list_t &  hss_list,
hss_list_t &  island_col_list,
uint  seqI = 0,
uint  seqJ = 0
[inline]
 

Definition at line 663 of file Islands.h.

References hss_list_t, IslandCols, mems::HssCols::left_col, mems::HssCols::right_col, mems::HssCols::seqI, mems::HssCols::seqJ, and uint.

Referenced by HssColsToIslandCols().

int ComputeAsciiBinNumber const unsigned char  key[10]  )  [static]
 

int ComputeBinNumber const unsigned char  key[10]  )  [static]
 

void mems::computeConsensusScore const std::vector< std::string > &  alignment,
const PairwiseScoringScheme &  pss,
std::vector< score_t > &  scores,
std::string &  consensus,
score_t &  score
[inline]
 

Definition at line 33 of file Scoring.h.

References computeGapScores(), computeMatchScores(), INVALID_SCORE, and score_t.

void mems::computeGapScores const std::string &  seq1,
const std::string &  seq2,
const PairwiseScoringScheme &  scoring,
std::vector< score_t > &  scores
[inline]
 

Definition at line 142 of file Scoring.h.

References mems::PairwiseScoringScheme::gap_extend, mems::PairwiseScoringScheme::gap_open, INVALID_SCORE, and score_t.

Referenced by computeConsensusScore(), computeSPScore(), findHssRandomWalk(), findHssRandomWalk_v2(), and GetPairwiseAnchorScore().

double computeGC std::vector< genome::gnSequence * > &  seq_table  ) 
 

Compute the GC content of a set of sequences.

double computeGC std::vector< gnSequence * > &  seq_table  ) 
 

Definition at line 301 of file Backbone.cpp.

References uint8.
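
GC content is the fraction of bases that are G or C. A minimal sketch of the computation follows. It is not the implementation in Backbone.cpp: gcContent is a hypothetical helper that works on std::string rather than gnSequence, and it assumes that only A, C, G and T (in either case) count towards the total, so gaps and ambiguity codes are ignored.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: fraction of G/C among the A/C/G/T characters of all
// sequences. Returns 0.0 when no countable base is present.
double gcContent(const std::vector<std::string>& sequences)
{
    std::size_t gc = 0;
    std::size_t total = 0;
    for (const std::string& seq : sequences) {
        for (char c : seq) {
            switch (std::toupper(static_cast<unsigned char>(c))) {
            case 'G':
            case 'C':
                ++gc;
                ++total;
                break;
            case 'A':
            case 'T':
                ++total;
                break;
            default:
                break;  // gaps and ambiguity codes are skipped (assumption)
            }
        }
    }
    assert(gc <= total);
    return total == 0 ? 0.0 : static_cast<double>(gc) / static_cast<double>(total);
}
```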

double computeID GappedAlignment &  gal,
size_t  seqI,
size_t  seqJ
 

Definition at line 1925 of file ProgressiveAligner.cpp.

References mems::AbstractGappedAlignment< SparseAbstractMatch<> >::AlignmentLength(), GetAlignment(), and mems::SparseAbstractMatch<>::SeqCount().

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

void mems::computeLCBAdjacencies_v2 IntervalList &  iv_list,
std::vector< int64 > &  weights,
std::vector< LCB > &  adjacencies
 

Redesign to be more intuitive.

left_adjacency is always left, regardless of LCB orientation.

Definition at line 645 of file Aligner.cpp.

References IntervalList, mems::LCB::lcb_id, mems::LCB::left_adjacency, mems::LCB::left_end, NO_ADJACENCY, mems::LCB::right_adjacency, mems::LCB::right_end, mems::LCB::to_be_deleted, uint, and mems::LCB::weight.

void mems::computeLCBAdjacencies_v2 std::vector< MatchList > &  lcb_list,
std::vector< int64 > &  weights,
std::vector< LCB > &  adjacencies
 

Definition at line 627 of file Aligner.cpp.

References Interval, IntervalList, and uint.

Referenced by addUnalignedIntervals(), filterMatches(), and mems::Aligner::RecursiveAnchorSearch().

void computeLCBAdjacencies_v3 mems::IntervalList iv_list,
std::vector< double > &  weights,
std::vector< mems::LCB > &  adjacencies
[inline]
 

Redesign to be more intuitive.

left_adjacency is always left, regardless of LCB orientation.

Definition at line 317 of file GreedyBreakpointElimination.h.

References computeLCBAdjacencies_v3(), and IntervalList.

template<class MatchVector>
void computeLCBAdjacencies_v3 const std::vector< MatchVector > &  lcb_list,
std::vector< double > &  weights,
std::vector< mems::LCB > &  adjacencies
 

Definition at line 252 of file GreedyBreakpointElimination.h.

References FindBoundaries(), mems::LCB::lcb_id, mems::LCB::left_adjacency, mems::LCB::left_end, mems::LCB::right_adjacency, mems::LCB::right_end, uint, and mems::LCB::weight.

Referenced by addUnalignedIntervals_v2(), mems::ProgressiveAligner::alignProfileToProfile(), computeLCBAdjacencies_v3(), mems::ProgressiveAligner::CreatePairwiseBPDistance(), filterMatches_v2(), getPairwiseLCBs(), and projectIntervalList().

void mems::ComputeLCBs MatchList &  meml,
std::set< uint > &  breakpoints,
std::vector< MatchList > &  lcb_list,
std::vector< int64 > &  weights
 

Definition at line 1049 of file Aligner.cpp.

References GetLCBCoverage(), MatchList, uint, and uint64.

Referenced by mems::Aligner::RecursiveAnchorSearch().

template<class MatchVector>
void ComputeLCBs_v2 const MatchVector &  meml,
const std::vector< gnSeqI > &  breakpoints,
std::vector< MatchVector > &  lcb_list
 

Definition at line 230 of file GreedyBreakpointElimination.h.

References uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), mems::ProgressiveAligner::CreatePairwiseBPDistance(), getPairwiseLCBs(), projectIntervalList(), and mems::EvenFasterSumOfPairsBreakpointScorer::validate().

void mems::computeMatchScores const std::string &  seq1,
const std::string &  seq2,
const PairwiseScoringScheme &  scoring,
std::vector< score_t > &  scores
[inline]
 

Definition at line 120 of file Scoring.h.

References INVALID_SCORE, mems::PairwiseScoringScheme::matrix, score_t, and uint8.

Referenced by computeConsensusScore(), computeSPScore(), findHssRandomWalk(), findHssRandomWalk_v2(), and GetPairwiseAnchorScore().

int ComputeNNNNNBinNumber const unsigned char  key[10]  )  [static]
 

void computeSPScore const std::vector< string > &  alignment,
const PairwiseScoringScheme &  pss,
std::vector< score_t > &  scores,
score_t &  score
[inline]
 

Definition at line 298 of file Scoring.h.

References computeGapScores(), computeMatchScores(), INVALID_SCORE, and score_t.

void computeSPScore const std::vector< std::string > &  alignment,
const PairwiseScoringScheme &  pss,
std::vector< score_t > &  scores,
score_t &  score
 

template<class Tree>
bool containsNode Tree &  t,
node_id_t  subtree_nodeI,
node_id_t  query_nodeI
 

Depth-first search to check whether a subtree contains a given node.

Definition at line 22 of file TreeUtilities.h.

References node_id_t.

Referenced by rerootTree().
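
The depth-first search that containsNode() is documented to perform can be sketched on a minimal tree representation. SimpleNode and subtreeContains are hypothetical names used for illustration; the real function is a template over a Tree type whose interface is not shown here. The sketch assumes a rooted tree stored as a vector of nodes, each listing the indices of its children.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type: only the child links are needed for the search.
struct SimpleNode {
    std::vector<std::size_t> children;
};

// Iterative depth-first search from subtree_root, following child links.
// Returns true if query is subtree_root itself or one of its descendants.
bool subtreeContains(const std::vector<SimpleNode>& tree,
                     std::size_t subtree_root,
                     std::size_t query)
{
    assert(subtree_root < tree.size());
    std::vector<std::size_t> pending{subtree_root};  // explicit stack
    while (!pending.empty()) {
        const std::size_t current = pending.back();
        pending.pop_back();
        if (current == query)
            return true;
        for (std::size_t child : tree[current].children)
            pending.push_back(child);
    }
    return false;
}
```

An explicit stack is used instead of recursion so that deep trees cannot exhaust the call stack.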

uint64 mems::countGaps string &  seq  ) 
 

Definition at line 197 of file gnAlignedSequences.cpp.

References uint, and uint64.

Referenced by mems::gnAlignedSequences::outputClustalW().

size_t countUnrefined PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t  ancestor
 

Definition at line 1719 of file ProgressiveAligner.cpp.

References node_id_t, and PhyloTree< AlignmentTreeNode >::size().

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

CREATE_EXCEPTION SMLMergeError   ) 
 

Thrown when there is an error merging two sorted mer lists.

CREATE_EXCEPTION SMLCreateError   ) 
 

Thrown when there is an error creating a sorted mer list.

CREATE_EXCEPTION InvalidArgument   ) 
 

CREATE_EXCEPTION InvalidData   ) 
 

InvalidData exceptions are thrown when the input to an algorithm is invalid.

CREATE_EXCEPTION AlignerError   ) 
 

Thrown if some error occurs during alignment.

node_id_t mems::createAlignmentTreeRoot PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t  node1,
node_id_t  node2
 

Definition at line 3148 of file ProgressiveAligner.cpp.

References TreeNode::children, mems::AlignmentTreeNode::children_aligned, findAndErase(), TreeNode::parents, mems::AlignmentTreeNode::parents_aligned, PhyloTree< AlignmentTreeNode >::push_back(), rerootTree(), and mems::AlignmentTreeNode::sequence.

void createBackboneList const IntervalList &  iv_list,
backbone_list_t &  ula_list
 

Definition at line 822 of file Backbone.cpp.

References backbone_list_t, mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), IntervalList, mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), and ULA.

Referenced by detectAndApplyBackbone().

uint8* CreateBasicDNATable  )  [static]
 

Definition at line 1240 of file FileSML.cpp.

void CreateGapSearchList std::vector< LCB > &  adjacencies,
const std::vector< genome::gnSequence * > &  seq_table,
std::vector< std::vector< int64 > > &  iv_regions,
boolean  entire_genome
 

void CreateGapSearchList vector< LCB > &  adjacencies,
const vector< gnSequence * > &  seq_table,
vector< vector< int64 > > &  iv_regions,
boolean  entire_genome
 

iv_regions: lists of intervening regions between LCBs in each sequence. Start positions are organized as iv_regions[ seqI ][ lcbI * 2 ]; end positions are organized as iv_regions[ seqI ][ lcbI * 2 + 1 ].


Definition at line 720 of file Aligner.cpp.

References uint.

Referenced by mems::Aligner::RecursiveAnchorSearch().
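
The interleaved start/end layout of iv_regions can be read back with a small accessor. The sketch below is illustrative only: Region and interveningRegion are hypothetical names, and std::int64_t stands in for the int64 typedef used in the signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical value type for one intervening region.
struct Region {
    std::int64_t start;
    std::int64_t end;
};

// Reads the region stored for LCB index lcbI in sequence seqI, following the
// documented layout: start at [seqI][lcbI * 2], end at [seqI][lcbI * 2 + 1].
Region interveningRegion(const std::vector<std::vector<std::int64_t>>& iv_regions,
                         std::size_t seqI,
                         std::size_t lcbI)
{
    assert(seqI < iv_regions.size());
    const std::vector<std::int64_t>& row = iv_regions[seqI];
    assert(lcbI * 2 + 1 < row.size());
    return Region{row[lcbI * 2], row[lcbI * 2 + 1]};
}
```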

template<typename MatchVector>
void createMap const MatchVector &  mv_from,
const MatchVector &  mv_to,
vector< size_t > &  map
 

Definition at line 46 of file Backbone.cpp.

Referenced by makeAllPairwiseGenomeHSS(), mergePairwiseHomologyPredictions(), and unalignIslands().

void detectAndApplyBackbone AbstractMatch *  m,
std::vector< genome::gnSequence * > &  seq_table,
CompactGappedAlignment<> *&  result,
backbone_list_t &  bb_list,
const Params &  hmm_params,
boolean  left_homologous = false,
boolean  right_homologous = false
 

Applies pairwise transitive homology statistics to detect backbone in a single collinear alignment. Unaligns any regions found to be non-homologous and returns the coordinates of the homologous segments in bb_list.

Parameters:
m The input match in which homology detection will be applied.
seq_table A sequence table with one gnSequence pointer per match component.
result (output) A newly allocated CompactGappedAlignment that contains the resulting alignment of homologous sequence. It is the caller's responsibility to free the memory using AbstractMatch::Free().
bb_list (output) A list of homologous segments among each component of the output match.
left_homologous Set to true if the detection code should assume that sequence beyond the left-most alignment column is homologous sequence.
right_homologous Set to true if the detection code should assume that sequence beyond the right-most alignment column is homologous sequence.

void mems::detectAndApplyBackbone IntervalList &  iv_list,
backbone_list_t &  bb_list,
const Params &  hmm_params
 

Applies pairwise transitive homology statistics to detect backbone in a genome alignment. Unaligns any regions found to be non-homologous and returns the coordinates of the homologous segments in bb_list.

Definition at line 940 of file Backbone.cpp.

References addUnalignedRegions(), backbone_list_t, checkForAllGapColumns(), collapseCollinear(), mems::CompactGappedAlignment< BaseType >::Copy(), createBackboneList(), IntervalList, makeAllPairwiseGenomeHSS(), mergePairwiseHomologyPredictions(), pairwise_genome_hss_t, mems::GenericIntervalList< MatchType >::seq_table, uint, and unalignIslands().

void detectAndApplyBackbone AbstractMatch *  m,
vector< gnSequence * > &  seq_table,
CompactGappedAlignment<> *&  result,
backbone_list_t &  bb_list,
const Params &  hmm_params,
boolean  left_homologous,
boolean  right_homologous
 

Definition at line 884 of file Backbone.cpp.

References backbone_list_t, mems::CompactGappedAlignment< BaseType >::Copy(), createBackboneList(), findHssHomologyHMM(), hss_array_t, IntervalList, mergePairwiseHomologyPredictions(), pairwise_genome_hss_t, mems::GenericIntervalList< MatchType >::seq_table, translateToPairwiseGenomeHSS(), uint, and unalignIslands().

void DisplayStatus void   ) 
 

void DisplayStatusHeader void   ) 
 

void DistanceMatrix IntervalList &  iv_list,
NumericMatrix< double > &  distmat
 

Definition at line 3474 of file ProgressiveAligner.cpp.

References IdentityMatrix(), and TransformDistanceIdentity().

void mems::DistanceMatrix uint  seq_count,
const std::vector< std::pair< uint64, uint64 > > &  detail_list,
NumericMatrix< double > &  distance
[inline]
 

Definition at line 285 of file DistanceMatrix.h.

References Matrix< T >::init(), uint, and uint64.

void mems::DistanceMatrix const MatchList &  mlist,
NumericMatrix< double > &  identity
[inline]
 

Definition at line 270 of file DistanceMatrix.h.

References IdentityMatrix(), MatchList, mems::GenericMatchList< MatchPtrType >::seq_table, and TransformDistanceIdentity().

Referenced by mems::Aligner::align(), and mems::ProgressiveAligner::alignPP().

int dmSML const char *  input_file,
const char *  output_file,
const char *const *  scratch_paths,
uint64  seed
 

int dmsort void   ) 
 

void DoBinning void   )  [static]
 

void DoReading void   )  [static]
 

void mems::EliminateOverlaps MatchList &  ml  ) 
 

Deletes overlapping regions in a set of matches.

This code isn't perfect: it can delete too many base pairs in some cases.

Definition at line 62 of file Aligner.cpp.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), mems::UngappedLocalAlignment< AbstractMatchImpl >::CropEnd(), mems::UngappedLocalAlignment< AbstractMatchImpl >::CropStart(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Free(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Length(), Match, MatchList, and uint.

Referenced by mems::Aligner::align(), mems::Aligner::Recursion(), SearchLCBGaps(), and mems::Aligner::SearchWithinLCB().

template<class MatchVector>
void EliminateOverlaps_v2 MatchVector &  ml,
bool  eliminate_both = false
 

Definition at line 398 of file ProgressiveAligner.h.

References EliminateOverlaps_v2(), and uint.

template<class MatchVector>
void EliminateOverlaps_v2 MatchVector &  ml,
const std::vector< uint > &  seq_ids,
bool  eliminate_both = false
 

Delete overlapping regions in favor of the larger match.

This code isn't perfect: it can delete too many base pairs in some cases.

Parameters:
ml The vector of matches
seq_ids The indexes of sequences in which overlaps should be eliminated
eliminate_both Delete both of the overlapping matches, instead of leaving one remaining

Definition at line 302 of file ProgressiveAligner.h.

References checkConsistent(), processNewMatch(), and uint.

Referenced by mems::ProgressiveAligner::alignPP(), mems::ProgressiveAligner::alignProfileToProfile(), mems::ProgressiveAligner::CreatePairwiseBPDistance(), EliminateOverlaps_v2(), mems::ProgressiveAligner::getAncestralMatches(), mems::ProgressiveAligner::getRepresentativeAncestralMatches(), and mems::ProgressiveAligner::pairwiseAnchorSearch().
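
The rule "delete overlapping regions in favor of the larger match" can be illustrated for two matches on a single sequence. The sketch below is a simplification and not the code in ProgressiveAligner.h: Segment and resolveOverlap are hypothetical names, coordinates are half-open intervals on one sequence only, and the eliminate_both option is not modelled.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical one-sequence stand-in for a match: [start, start + length).
struct Segment {
    long start;
    long length;
    long end() const { return start + length; }
};

// Crops the shorter of two segments so that they no longer overlap.
// The longer segment is left untouched; ties favor the leftmost segment.
void resolveOverlap(Segment& a, Segment& b)
{
    assert(a.length >= 0 && b.length >= 0);
    Segment& left = (a.start <= b.start) ? a : b;
    Segment& right = (a.start <= b.start) ? b : a;

    const long overlap = std::min(left.end(), right.end()) - right.start;
    if (overlap <= 0)
        return;  // nothing to do

    if (left.length >= right.length) {
        // Crop the start of the right segment (it may shrink to length 0).
        right.start += overlap;
        right.length -= overlap;
    } else {
        // Crop the end of the left segment.
        left.length = right.start - left.start;
    }
}
```

A segment that is fully contained in a larger one shrinks to length zero and would be dropped by the caller.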

void EnsureAllOperationsComplete void   ) 
 

void extendRootBranches PhyloTree< AlignmentTreeNode > &  alignment_tree  ) 
 

Definition at line 3033 of file ProgressiveAligner.cpp.

References node_id_t, PhyloTree< AlignmentTreeNode >::root, PhyloTree< AlignmentTreeNode >::size(), and std::swap().

void mems::filterMatches std::vector< LCB > &  adjacencies,
std::vector< MatchList > &  lcb_list,
std::vector< int64 > &  weights
 

Takes a set of filtered LCB adjacencies and an unfiltered set of matches as input and returns a filtered set of matches that reflects the LCBs found.

Definition at line 1814 of file Aligner.cpp.

References computeLCBAdjacencies_v2(), GetLCBCoverage(), MatchList, uint, and uint64.

Referenced by mems::Aligner::consistencyCheck(), and mems::Aligner::RecursiveAnchorSearch().

template<class MatchVector>
void filterMatches_v2 std::vector< mems::LCB > &  adjacencies,
std::vector< MatchVector > &  lcb_list,
std::vector< double > &  weights,
MatchVector &  deleted_matches
 

Takes a set of filtered LCB adjacencies and an unfiltered set of matches as input and returns a filtered set of matches that reflects the LCBs found.

Definition at line 329 of file GreedyBreakpointElimination.h.

References computeLCBAdjacencies_v3(), and uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), and mems::ProgressiveAligner::CreatePairwiseBPDistance().

template<class T, class S>
void findAndErase T &  container,
S &  item
 

Definition at line 9 of file TreeUtilities.h.

Referenced by createAlignmentTreeRoot(), and rerootTree().

template<class MatchVector>
void FindBoundaries const MatchVector &  matches,
std::vector< gnSeqI > &  left_ends,
std::vector< gnSeqI > &  lengths,
std::vector< bool > &  orientations
 

Definition at line 685 of file Interval.h.

References uint.

Referenced by mems::GenericInterval< GappedBaseImpl >::CalculateOffset(), and computeLCBAdjacencies_v3().

void findHssExcursions std::vector< score_t >  scores,
score_t  significance_threshold,
hss_list_t &  hss_list,
uint  seqI,
uint  seqJ,
boolean  left_hss,
boolean  right_hss
[inline]
 

Definition at line 159 of file Islands.h.

References hss_list_t, mems::HssCols::left_col, mems::HssCols::right_col, score_t, mems::HssCols::seqI, mems::HssCols::seqJ, and uint.

Referenced by findHssRandomWalk_v2().

template<typename MatchVector>
void findHssHomologyHMM const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
hss_array_t &  hss_array,
const Params &  hmm_params,
boolean  left_homologous,
boolean  right_homologous
 

Definition at line 618 of file Islands.h.

References findHssHomologyHMM(), GetAlignment(), hss_array_t, hss_list_t, and uint.

void findHssHomologyHMM std::vector< std::string > &  aln_table,
hss_list_t &  hss_list,
uint  seqI,
uint  seqJ,
const Params &  hmm_params,
boolean  left_homologous,
boolean  right_homologous
[inline]
 

Definition at line 381 of file Islands.h.

References charmap, colmap, getCharmap(), hss_list_t, mems::HssCols::left_col, mems::HssCols::right_col, mems::HssCols::seqI, mems::HssCols::seqJ, and uint.

Referenced by detectAndApplyBackbone(), findHssHomologyHMM(), and makeAllPairwiseGenomeHSS().

template<typename MatchVector>
void mems::findHssRandomWalk const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
const PairwiseScoringScheme &  scoring,
score_t  significance_threshold,
hss_array_t &  hss_array,
boolean  left_homologous = false,
boolean  right_homologous = false
 

Definition at line 584 of file Islands.h.

References computeGapScores(), computeMatchScores(), findHssRandomWalkScoreVector(), GetAlignment(), hss_array_t, hss_list_t, score_t, and uint.

Referenced by findHssRandomWalkCga(), findIslandsRandomWalk(), and findIslandsRandomWalkCga().

template<typename MatchVector>
void findHssRandomWalk_v2 const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
const PairwiseScoringScheme &  scoring,
score_t  significance_threshold,
hss_array_t &  hss_array,
boolean  left_homologous,
boolean  right_homologous
 

Definition at line 547 of file Islands.h.

References computeGapScores(), computeMatchScores(), findHssExcursions(), findMscFromExcursions(), GetAlignment(), hss_array_t, hss_list_t, score_t, and uint.

template<typename MatchVector>
void mems::findHssRandomWalkCga const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
const PairwiseScoringScheme &  scoring,
score_t  significance_threshold,
std::vector< CompactGappedAlignment<> * > &  hss_list
 

Definition at line 738 of file Islands.h.

References findHssRandomWalk(), hss_array_t, HssArrayToCga(), HssColsToIslandCols(), and score_t.

void mems::findHssRandomWalkScoreVector std::vector< score_t >  scores,
score_t  significance_threshold,
hss_list_t &  hss_list,
uint  seqI = 0,
uint  seqJ = 0,
boolean  left_homologous = false,
boolean  right_homologous = false
[inline]
 

Definition at line 454 of file Islands.h.

References hss_list_t, mems::HssCols::left_col, mems::HssCols::right_col, score_t, mems::HssCols::seqI, mems::HssCols::seqJ, and uint.

Referenced by findHssRandomWalk().

void mems::findIslandsBetweenLCBs IntervalList &  iv_list,
uint  island_size,
std::ostream &  island_out
 

Definition at line 302 of file Islands.cpp.

References addUnalignedIntervals(), IntervalList, mems::GenericIntervalList< MatchType >::seq_table, and uint.

template<typename MatchVector>
void mems::findIslandsRandomWalk const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
const PairwiseScoringScheme &  scoring,
score_t  significance_threshold,
std::vector< Island > &  island_list
 

Definition at line 761 of file Islands.h.

References findHssRandomWalk(), GetAlignment(), hss_array_t, hss_list_t, mems::Island::leftI, mems::Island::leftJ, mems::Island::rightI, mems::Island::rightJ, score_t, mems::Island::seqI, mems::Island::seqJ, and uint.

template<typename MatchVector>
void mems::findIslandsRandomWalkCga const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
const PairwiseScoringScheme &  scoring,
score_t  significance_threshold,
std::vector< CompactGappedAlignment<> * > &  island_list
 

Definition at line 749 of file Islands.h.

References findHssRandomWalk(), hss_array_t, HssArrayToCga(), and score_t.

void findMidpoint PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t n1,
node_id_t n2
 

Finds the midpoint of a phylogenetic tree and returns the IDs of the surrounding nodes in n1 and n2.

Definition at line 2930 of file ProgressiveAligner.cpp.

References bitset_t, and PhyloTree< AlignmentTreeNode >::size().

void findMscFromExcursions std::vector< score_t >  scores,
score_t  significance_threshold,
hss_list_t &  hss_list,
hss_list_t &  msc_list,
uint  seqI,
uint  seqJ,
boolean  left_hss,
boolean  right_hss
[inline]
 

Definition at line 208 of file Islands.h.

References hss_list_t, mems::HssCols::left_col, mems::HssCols::right_col, score_t, mems::HssCols::seqI, mems::HssCols::seqJ, and uint.

Referenced by findHssRandomWalk_v2().

void findRightEndpoint size_t  seqI,
size_t  seqJ,
score_t  significance_threshold,
std::vector< score_t > &  scores,
hss_list_t &  hss_list
[inline]
 

Definition at line 105 of file Islands.h.

References hss_list_t, mems::HssCols::left_col, mems::HssCols::right_col, score_t, mems::HssCols::seqI, and mems::HssCols::seqJ.

void FinishBinning  ) 
 

const char* Fmt const char *  fmt,
  ...
 

lint get_aln_score void   ) 
 

const std::vector< std::string > & mems::GetAlignment const GappedAlignment &  ga,
const std::vector< genome::gnSequence * > &  seq_table
[inline]
 

Definition at line 234 of file GappedAlignment.h.

References mems::GappedAlignment::align_matrix.

void mems::GetAlignment const AbstractMatch &  ga,
const std::vector< genome::gnSequence * > &  seq_table,
std::vector< std::string > &  alignment
[inline]
 

Definition at line 84 of file AbstractGappedAlignment.h.

References mems::AbstractMatch::GetAlignment(), mems::AbstractMatch::LeftEnd(), mems::AbstractMatch::Length(), and mems::AbstractMatch::Orientation().

Referenced by AddToMatchIdentityMatrix(), mems::ProgressiveAligner::alignProfileToProfile(), BackboneIdentityMatrix(), checkForAllGapColumns(), computeID(), mems::ProgressiveAligner::ConstructSuperIntervalFromMSA(), mems::ProgressiveAligner::extractAlignment(), findHssHomologyHMM(), findHssRandomWalk(), findHssRandomWalk_v2(), findIslandsRandomWalk(), mems::GenericInterval< GappedBaseImpl >::GetAlignedSequences(), GetPairwiseAnchorScore(), MatchIdentityMatrix(), mems::MuscleInterface::ProfileAlign(), mems::MuscleInterface::ProfileAlignFast(), mems::GenericIntervalList< MatchType >::ReadStandardAlignment(), mems::GenericIntervalList< MatchType >::ReadStandardAlignmentCompact(), mems::MuscleInterface::Refine(), mems::ProgressiveAligner::refineAlignment(), mems::MuscleInterface::RefineFast(), removeLargeGapsPP(), splitGappedAlignment(), mems::GenericIntervalList< MatchType >::WriteAlignedSequences(), mems::GenericIntervalList< MatchType >::WriteList(), and mems::GenericIntervalList< MatchType >::WriteStandardAlignment().

template<typename MatchVector>
void getBpList MatchVector &  mvect,
uint  seq,
vector< gnSeqI > &  bp_list
 

Definition at line 32 of file Backbone.cpp.

References uint.

Referenced by makeAllPairwiseGenomeHSS(), and mergePairwiseHomologyPredictions().

char* getCharmap  )  [inline]
 

Definition at line 349 of file Islands.h.

References charmap.

Referenced by findHssHomologyHMM().

double getDefaultBpDistEstimateMinScore std::vector< gnSequence * > &  sequences  ) 
 

Definition at line 121 of file ProgressiveAligner.cpp.

References getDefaultBreakpointPenalty().

Referenced by mems::ProgressiveAligner::alignPP().

unsigned getDefaultBreakpointMax const std::vector< genome::gnSequence * > &  seq_table  ) 
 

Definition at line 3322 of file ProgressiveAligner.cpp.

Referenced by mems::ProgressiveAligner::CreatePairwiseBPDistance().

double getDefaultBreakpointPenalty std::vector< genome::gnSequence * > &  sequences  ) 
 

double getDefaultBreakpointPenalty std::vector< gnSequence * > &  sequences  ) 
 

Definition at line 109 of file ProgressiveAligner.cpp.

References uint.

Referenced by mems::ProgressiveAligner::alignPP(), and getDefaultBpDistEstimateMinScore().

PairwiseScoringScheme& getDefaultScoringScheme  )  [static]
 

Definition at line 67 of file SubstitutionMatrix.h.

References default_gap_extend, default_gap_open, and hoxd_matrix.

void mems::getGapBounds std::vector< gnSeqI > &  seq_lengths,
std::vector< LCB > &  adjacencies,
uint  seqJ,
int  leftI,
int  rightI,
int64 &  left_start,
int64 &  right_start
 

Definition at line 224 of file Islands.cpp.

References uint.

Referenced by addUnalignedIntervals(), and addUnalignedIntervals_v2().

boolean getInterveningCoordinates const AbstractMatch *  iv,
uint  oseqI,
Match *  r_begin,
Match *  r_end,
uint  seqI,
int64 &  gap_lend,
int64 &  gap_rend
 

Definition at line 564 of file ProgressiveAligner.cpp.

References mems::AbstractMatch::LeftEnd(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Length(), Match, mems::AbstractMatch::RightEnd(), and uint.

boolean mems::getInterveningCoordinates std::vector< genome::gnSequence * > &  seq_table,
Match *  r_begin,
Match *  r_end,
uint  seqI,
int64 &  gap_lend,
int64 &  gap_rend
[inline]
 

Definition at line 48 of file GappedAligner.h.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::Length(), Match, and uint.

Referenced by mems::MuscleInterface::Align(), mems::ClustalInterface::Align(), mems::ProgressiveAligner::pairwiseAnchorSearch(), and mems::Aligner::Recursion().

void mems::GetLCBCoverage MatchList &  lcb,
uint64 &  coverage
 

Definition at line 599 of file Aligner.cpp.

References MatchList, mems::GenericMatchList< MatchPtrType >::seq_table, uint, and uint64.

Referenced by ComputeLCBs(), and filterMatches().

neighbor_t& getNeighbor pair< neighbor_t, neighbor_t > &  entry,
int  direction
 

Definition at line 79 of file Backbone.cpp.

References neighbor_t, and RIGHT_NEIGHBOR.

Referenced by collapseCollinear().

template<class MatchVector>
double mems::GetPairwiseAnchorScore MatchVector &  lcb,
std::vector< genome::gnSequence * > &  seq_table,
const mems::PairwiseScoringScheme subst_scoring,
mems::SeedOccurrenceList sol_1,
mems::SeedOccurrenceList sol_2,
bool  penalize_gaps = false
 

Computes an anchoring score for the matches contained inside an LCB.

Definition at line 404 of file GreedyBreakpointElimination.h.

References computeGapScores(), computeMatchScores(), mems::SeedOccurrenceList::frequency_type, GetAlignment(), mems::SeedOccurrenceList::getFrequency(), INV_SCORE, penalize_repeats, and score_t.

Referenced by mems::ProgressiveAligner::CreatePairwiseBPDistance(), and mems::ProgressiveAligner::pairwiseScoreTrackingMatches().

void mems::getPairwiseLCBs uint  nI,
uint  nJ,
uint  dI,
uint  dJ,
std::vector< TrackingMatch * > &  tracking_matches,
std::vector< TrackingLCB< TrackingMatch * > > &  t_lcbs,
boost::multi_array< double, 3 > &  tm_score_array,
boost::multi_array< size_t, 3 > &  tm_lcb_id_array
 

Computes all pairwise LCBs from a set of tracking matches.

Definition at line 53 of file GreedyBreakpointElimination.cpp.

References computeLCBAdjacencies_v3(), ComputeLCBs_v2(), mems::PairwiseMatchAdapter::Copy(), IdentifyBreakpoints(), mems::PairwiseMatchAdapter::Invert(), mems::LcbTrackingMatch< MatchType >::match_id, mems::PairwiseMatchAdapter::Orientation(), mems::PairwiseMatchAdapter::tm, and uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

int64 mems::greedyBreakpointElimination gnSeqI  minimum_weight,
std::vector< LCB > &  adjacencies,
std::vector< int64 > &  weights,
std::ostream *  status_out = NULL
 

Version 2 of this algorithm: each time two LCBs coalesce, repeatedly search their intervening region until either a single LCB exists or all LCBs meet the current minimum_weight.

Returns:
The weight of the minimum-weight LCB that remains.


Definition at line 1615 of file Aligner.cpp.

References uint.
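
The remove-and-coalesce loop behind greedy breakpoint elimination can be sketched on a toy one-dimensional model. This is an illustration under strong assumptions and not the algorithm in Aligner.cpp: Block and greedyRemoveLightBlocks are hypothetical names, collinearity of neighbouring blocks is reduced to an equal chain label, and the repeated search of the intervening region is not modelled. What the sketch does share with the description above is the stopping rule (a single block remains, or every block meets minimum_weight) and the return value (the weight of the lightest remaining block).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an LCB. Neighbouring blocks with the same chain
// label are treated as collinear and are merged when they become adjacent.
struct Block {
    int chain;
    std::int64_t weight;
};

// Repeatedly removes the lightest block while it is below minimum_weight,
// coalescing its former neighbours when they share a chain label.
// Returns the weight of the lightest block that remains (0 if none remain).
std::int64_t greedyRemoveLightBlocks(std::vector<Block>& blocks,
                                     std::int64_t minimum_weight)
{
    assert(minimum_weight >= 0);
    auto lighter = [](const Block& a, const Block& b) { return a.weight < b.weight; };

    while (blocks.size() > 1) {
        auto lightest = std::min_element(blocks.begin(), blocks.end(), lighter);
        if (lightest->weight >= minimum_weight)
            break;  // every block meets the threshold

        // erase() returns an iterator to the block that followed the removed one.
        auto next = blocks.erase(lightest);
        if (next != blocks.begin() && next != blocks.end()) {
            auto prev = next - 1;
            if (prev->chain == next->chain) {
                prev->weight += next->weight;  // coalesce the two neighbours
                blocks.erase(next);
            }
        }
    }
    if (blocks.empty())
        return 0;
    return std::min_element(blocks.begin(), blocks.end(), lighter)->weight;
}
```

Because coalescing adds weights together, removing one light block can lift its neighbours above the threshold, which is why the lightest block is re-evaluated on every pass.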

template<class BreakpointScorerType>
int64 mems::greedyBreakpointElimination_v4 std::vector< mems::LCB > &  adjacencies,
std::vector< double > &  scores,
BreakpointScorerType &  bp_scorer,
std::ostream *  status_out,
size_t  g1_tag = 0,
size_t  g2_tag = 0
 

Definition at line 651 of file GreedyBreakpointElimination.h.

References RemoveLCBandCoalesce(), and uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), mems::ProgressiveAligner::CreatePairwiseBPDistance(), and mems::Aligner::RecursiveAnchorSearch().

template<class SearchScorer>
double mems::greedySearch SearchScorer &  spbs  ) 
 

Finds the best anchoring and returns the anchoring score.

Definition at line 762 of file GreedyBreakpointElimination.h.

References debug_aligner, printProgress(), and uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

void HandleBinWriteCompletions void   )  [static]
 

void HandleReadingCompletions void   )  [static]
 

void HandleSeqbufWriteCompletions void   )  [static]
 

template<typename MatchVector>
void HssArrayToCga const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
hss_array_t &  hss_array,
std::vector< CompactGappedAlignment<> * > &  cga_list
 

Definition at line 698 of file Islands.h.

References mems::CompactGappedAlignment< BaseType >::Copy(), mems::CompactGappedAlignment< BaseType >::copyRange(), mems::CompactGappedAlignment< BaseType >::Free(), hss_array_t, hss_list_t, and uint.

Referenced by findHssRandomWalkCga(), findIslandsRandomWalkCga(), and makeAllPairwiseGenomeHSS().

template<typename MatchVector>
void HssColsToIslandCols const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
hss_array_t &  hss_array,
hss_array_t &  island_col_array
 

Definition at line 644 of file Islands.h.

References ComplementHss(), hss_array_t, hss_list_t, and uint.

Referenced by findHssRandomWalkCga().

template<typename MatchVector>
void hssColsToIslandCols const MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
std::vector< HssCols > &  hss_list,
std::vector< IslandCols > &  island_col_list
 

template<class MatchVector>
void IdentifyBreakpoints MatchVector &  mlist,
std::vector< gnSeqI > &  breakpoints
 

Definition at line 162 of file GreedyBreakpointElimination.h.

References uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), mems::ProgressiveAligner::CreatePairwiseBPDistance(), getPairwiseLCBs(), projectIntervalList(), and mems::EvenFasterSumOfPairsBreakpointScorer::validate().

void mems::IdentityMatrix const IntervalList &  iv_list,
NumericMatrix< double > &  identity
[inline]
 

Definition at line 40 of file DistanceMatrix.h.

References IdentityMatrix(), IntervalList, and mems::GenericIntervalList< MatchType >::seq_table.

template<class AbstractMatchVectorType>
void mems::IdentityMatrix const AbstractMatchVectorType &  matches,
const std::vector< genome::gnSequence * > &  seq_table,
NumericMatrix< double > &  identity
 

Definition at line 49 of file DistanceMatrix.h.

References AddToMatchIdentityMatrix(), Matrix< T >::init(), and uint.

Referenced by DistanceMatrix(), and IdentityMatrix().

bool idmer_greaterthan idmer &  a_v,
idmer &  m_v
[inline]
 

Definition at line 43 of file RepeatHash.h.

References mems::idmer::mer.

bool idmer_id_lessthan idmer &  a_v,
idmer &  m_v
[inline]
 

Definition at line 137 of file MatchFinder.h.

References mems::idmer::id.

Referenced by mems::PairwiseMatchFinder::EnumerateMatches(), mems::MatchFinder::EnumerateMatches(), and mems::MaskedMemHash::HashMatch().

bool idmer_lessthan idmer &  a_v,
idmer &  m_v
[inline]
 

Definition at line 131 of file MatchFinder.h.

References mems::idmer::mer.

Referenced by mems::MatchFinder::SearchRange().

bool idmer_position_lessthan idmer &  a_v,
idmer &  m_v
[inline]
 

Definition at line 48 of file RepeatHash.h.

References mems::idmer::position.

Referenced by mems::RepeatHash::HashMatch().

int InitdmSML long  working_mb,
long  buffer_size,
const char *  input_filename,
const char *  output_filename,
const char *const *  scratch_paths,
uint64  seed
 

buffer_list_t* InitList buffer_list_t *  list  ) 
 

void InitRadixSort sort_buf_t *  sortbuf,
buffer_t *  scratch_buffer
 

SMLHeader_t InitSML aFILE *  file,
uint64  file_size,
uint64  seed
 

void InitTime  ) 
 

void mems::initTrackingMatchLCBTracking const std::vector< mems::TrackingMatch > &  tracking_matches,
size_t  n1_count,
size_t  n2_count,
boost::multi_array< size_t, 3 > &  tm_lcb_id_array
 

creates an appropriately sized matrix for mapping individual TrackingMatches to their containing LCBs

Definition at line 123 of file GreedyBreakpointElimination.cpp.

References LCB_UNASSIGNED.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

template<typename ListType, typename Ty>
void insert ListType &  the_list,
const typename ListType::iterator &  iter,
Ty &  val
 

Definition at line 39 of file Interval.h.

template<typename ListType, typename RanIt, typename Ty>
void insert ListType &  the_list,
std::reverse_iterator< RanIt > &  riter,
Ty &  val
 

Definition at line 33 of file Interval.h.

Referenced by AddGapMatches().

int IsDenseEnough GappedAlignment *  gal_iter  ) 
 

Definition at line 977 of file ProgressiveAligner.cpp.

References mems::AbstractGappedAlignment< SparseAbstractMatch<> >::AlignmentLength(), mems::SparseAbstractMatch<>::LeftEnd(), mems::AbstractGappedAlignment< SparseAbstractMatch<> >::Length(), max_density, max_window_size, min_density, min_window_size, mems::SparseAbstractMatch<>::Multiplicity(), mems::SparseAbstractMatch<>::SeqCount(), and uint.

Referenced by mems::ProgressiveAligner::refineAlignment().

template<typename MatchListType>
void LoadAndCreateRawSequences MatchListType &  mlist,
std::ostream *  log_stream
 

Loads the sequences designated by the elements of the seq_filename vector and creates temporary RAW sequence files.

The resulting gnSequences are gnRAWSequences. The genome::gnRAWSequence objects are created on the heap and are not deallocated when this class is destroyed. They should be manually destroyed when no longer in use.

Definition at line 213 of file MatchList.h.

References CreateTempFileName(), genome::gnRAWSequence::length(), registerFileToDelete(), and uint.

template<typename MatchListType>
void LoadMFASequences MatchListType &  mlist,
const std::string &  mfa_filename,
std::ostream *  log_stream
 

Loads sequences to align from a Multi-FastA file. The genome::gnSequence and SortedMerList objects are created on the heap and are not deallocated when this class is destroyed.

They should be manually destroyed when no longer in use.

Parameters:
mfa_filename The name of the Multi-FastA file to read in. Each sequence entry will be treated as a separate sequence to be aligned.
log_stream An output stream to log messages to. If NULL no logging is done

Definition at line 364 of file MatchList.h.

References uint.

template<typename MatchListType>
void LoadSequences MatchListType &  mlist,
std::ostream *  log_stream
 

Attempts to load the sequences designated by the elements of the seq_filename vector.

The genome::gnSequence objects are created on the heap and are not deallocated when this class is destroyed. They should be manually destroyed when no longer in use.

Definition at line 167 of file MatchList.h.

References uint.

template<typename T>
T* m_allocateAndCopy const T &  t  ) 
 

Definition at line 218 of file AbstractMatch.h.

References mems::AbstractMatchStartComparator< MatchType >::m_seq, and uint.

Referenced by mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), mems::PairwiseMatchAdapter::Copy(), mems::MatchProjectionAdapter::Copy(), mems::MatchHashEntry::Copy(), mems::GenericInterval< GappedBaseImpl >::Copy(), mems::GappedAlignment::Copy(), and mems::CompactGappedAlignment< BaseType >::Copy().

template<typename T>
void m_free T *  t  ) 
 

Definition at line 237 of file AbstractMatch.h.

References NO_MATCH.

Referenced by mems::UngappedLocalAlignment< AbstractMatchImpl >::Free(), mems::PairwiseMatchAdapter::Free(), mems::MatchProjectionAdapter::Free(), mems::MatchHashEntry::Free(), mems::GenericInterval< GappedBaseImpl >::Free(), mems::GappedAlignment::Free(), and mems::CompactGappedAlignment< BaseType >::Free().

template<typename MatchListType>
void makeAlignmentTree PhyloTree< AlignmentTreeNode > &  alignment_tree,
MatchListType &  mlist,
vector< uint > &  node_sequence_map
 

Definition at line 3431 of file ProgressiveAligner.cpp.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Free(), Interval, Match, node_id_t, PhyloTree< AlignmentTreeNode >::push_back(), mems::SuperInterval::SetLeftEnd(), mems::SuperInterval::SetLength(), mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), PhyloTree< AlignmentTreeNode >::size(), and uint.

Referenced by mems::ProgressiveAligner::alignPP().

void makeAllPairwiseGenomeHSS IntervalList &  iv_list,
vector< CompactGappedAlignment<> * > &  iv_ptrs,
vector< CompactGappedAlignment<> * > &  iv_orig_ptrs,
pairwise_genome_hss_t &  hss_cols,
const Params &  hmm_params
 

Definition at line 317 of file Backbone.cpp.

References applyBreakpoints(), mems::CompactGappedAlignment< BaseType >::Copy(), createMap(), findHssHomologyHMM(), getBpList(), hss_array_t, HssArrayToCga(), IntervalList, pairwise_genome_hss_t, projectIntervalList(), mems::GenericIntervalList< MatchType >::seq_table, and uint.

Referenced by detectAndApplyBackbone().

int MakeWorkingSet working_set_t *  ws,
offset_t  goalsize,
offset_t  minrecs,
offset_t  maxrecs
 

void mems::markAligned PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t  subject_node,
node_id_t  neighbor
 

Definition at line 2722 of file ProgressiveAligner.cpp.

References PhyloTree< AlignmentTreeNode >::size(), and uint.

Referenced by mems::ProgressiveAligner::alignNodes().

void markAsRefined PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t  ancestor
 

Definition at line 1737 of file ProgressiveAligner.cpp.

References node_id_t, and PhyloTree< AlignmentTreeNode >::size().

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

void maskNNNNN const genome::gnSequence &  in_seq,
genome::gnSequence &  out_seq,
std::vector< int64 > &  seq_coords,
int  mask_n_length
 

void maskNNNNN const gnSequence &  in_seq,
gnSequence &  out_seq,
vector< int64 > &  seq_coords,
int  mask_n_length
 

Definition at line 1536 of file FileSML.cpp.

template<class AbstractMatchType>
void mems::MatchIdentityMatrix const AbstractMatchType &  amt,
const std::vector< genome::gnSequence * > &  seq_table,
NumericMatrix< double > &  identity
 

Definition at line 108 of file DistanceMatrix.h.

References GetAlignment(), Matrix< T >::init(), and uint.

void mergePairwiseHomologyPredictions vector< CompactGappedAlignment<> * > &  iv_orig_ptrs,
pairwise_genome_hss_t &  hss_cols,
vector< vector< ULA * > > &  ula_list
 

Definition at line 455 of file Backbone.cpp.

References applyBreakpoints(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), createMap(), getBpList(), pairwise_genome_hss_t, uint, and ULA.

Referenced by detectAndApplyBackbone().

void mems::mergeUnalignedIntervals uint  seqI,
vector< Interval * > &  iv_list,
vector< Interval * > &  new_list
 

Definition at line 2602 of file ProgressiveAligner.cpp.

References addGuy(), mems::GenericInterval< GappedBaseImpl >::CropEnd(), mems::GenericInterval< GappedBaseImpl >::CropStart(), Interval, and mems::GenericInterval< GappedBaseImpl >::StealMatches().

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

void msaFromSeqTable MSA &  msa,
const vector< string > &  seq_table,
unsigned  id_base = 0
 

Definition at line 781 of file MuscleInterface.cpp.

References uint.

Referenced by mems::MuscleInterface::ProfileAlignFast(), and mems::MuscleInterface::RefineFast().

template<typename MatchVector>
void multFilter MatchVector &  matches,
uint  mult = 2
 

Definition at line 1501 of file ProgressiveAligner.cpp.

References uint.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

boolean my_validateLCB MatchList &  lcb  ) 
 

Test code to ensure that an individual LCB is truly collinear.

Returns:
true if the LCB is good

Definition at line 72 of file ProgressiveAligner.cpp.

References MatchList, and uint.

template<typename AbstractMatchImpl>
std::ostream & mems::operator<< std::ostream &  os,
const UngappedLocalAlignment< AbstractMatchImpl > &  ula
 

Definition at line 201 of file UngappedLocalAlignment.h.

References uint.

std::ostream & mems::operator<< std::ostream &  os,
const RepeatMatch &  mhe
 

Definition at line 40 of file RepeatMatch.cpp.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::Length(), and uint32.

template<class GappedBaseImpl>
std::ostream& operator<< std::ostream &  os,
const GenericInterval< GappedBaseImpl > &  cr
 

Writes this GenericInterval to the specified output stream (e.g. cout).

Definition at line 894 of file Interval.h.

References Match, and uint.

std::ostream & mems::operator<< std::ostream &  os,
const GappedAlignment &  ga
 

Writes this GappedAlignment to the specified output stream (e.g. cout).

Definition at line 42 of file GappedAlignment.cpp.

References mems::GappedAlignment::align_matrix, mems::AbstractGappedAlignment< SparseAbstractMatch<> >::AlignmentLength(), mems::SparseAbstractMatch<>::SeqCount(), mems::SparseAbstractMatch<>::Start(), and uint.

std::istream & mems::operator>> std::istream &  is,
GappedAlignment &  ga
 

Reads a GappedAlignment from the specified input stream (e.g. cin).

Definition at line 55 of file GappedAlignment.cpp.

References mems::GappedAlignment::align_matrix, mems::AbstractGappedAlignment< SparseAbstractMatch<> >::SetAlignmentLength(), mems::SparseAbstractMatch<>::SetStart(), and uint.

void outputBackbone const std::vector< GappedAlignment > &  backbone_regions,
std::ostream &  backbone_out
 

writes out a list of backbone regions

void outputBackbone const vector< GappedAlignment > &  backbone_regions,
ostream &  backbone_out
 

Definition at line 205 of file Islands.cpp.

References uint.

char ** mems::parseCommand const string &  cmd  ) 
 

Definition at line 49 of file MuscleInterface.cpp.

Referenced by mems::MuscleInterface::MuscleInterface(), mems::MuscleInterface::SetMuscleArguments(), and mems::MuscleInterface::SetMusclePath().

bool mems::pipeExec char **  cmd_argv,
const string &  command,
const string &  input,
string &  output,
string &  error
 

Definition at line 89 of file MuscleInterface.cpp.

Referenced by mems::MuscleInterface::CallMuscle(), and mems::MuscleInterface::ProfileAlign().

buffer_t* PopHead buffer_list_t *  list  ) 
 

buffer_t* PopTail buffer_list_t *  list  ) 
 

void prepareAlignmentTree PhyloTree< AlignmentTreeNode > &  alignment_tree  ) 
 

template<class BoostMatType>
void print2d_matrix BoostMatType &  mat,
std::ostream &  os
 

Definition at line 95 of file ProgressiveAligner.cpp.

Referenced by mems::ProgressiveAligner::alignPP(), and mems::ProgressiveAligner::alignProfileToProfile().

void print_usage const char *  pname  ) 
 

void printBbSeq std::ostream &  os,
const bb_seqentry_t &  bbseq
[inline]
 

Definition at line 95 of file Backbone.h.

References bb_seqentry_t.

void mems::printMatch mems::AbstractMatch m,
std::ostream &  os
[inline]
 

Definition at line 134 of file GreedyBreakpointElimination.h.

References mems::AbstractMatch::RightEnd(), mems::AbstractMatch::SeqCount(), and mems::AbstractMatch::Start().

Referenced by mems::EvenFasterSumOfPairsBreakpointScorer::validate().

void mems::printProgress uint  prev_prog,
uint  cur_prog,
std::ostream &  os
 

Definition at line 39 of file GreedyBreakpointElimination.cpp.

References uint.

Referenced by AlignLCBInParallel(), mems::ProgressiveAligner::doGappedAlignment(), greedySearch(), and mems::ProgressiveAligner::refineAlignment().

template<class MatchVector>
void processNewMatch uint  seqI,
MatchVector &  new_matches,
typename MatchVector::value_type &  new_match
 

Definition at line 262 of file ProgressiveAligner.h.

References uint.

Referenced by EliminateOverlaps_v2().

void projectIntervalList mems::IntervalList iv_list,
std::vector< uint > &  projection,
std::vector< std::vector< mems::MatchProjectionAdapter * > > &  LCB_list,
std::vector< mems::LCB > &  projected_adjs
[inline]
 

Definition at line 500 of file ProgressiveAligner.h.

References computeLCBAdjacencies_v3(), ComputeLCBs_v2(), mems::MatchProjectionAdapter::Copy(), IdentifyBreakpoints(), and IntervalList.

Referenced by makeAllPairwiseGenomeHSS().

void propagateInvert PhyloTree< AlignmentTreeNode > &  alignment_tree,
node_id_t  ancestor,
size_t  ans_siv
 

propagates an inversion of an ancestral SuperInterval to SuperIntervals in descendant nodes

Definition at line 1340 of file ProgressiveAligner.cpp.

References Interval, node_id_t, mems::GenericInterval< GappedBaseImpl >::SetMatches(), and mems::GenericInterval< GappedBaseImpl >::StealMatches().

Referenced by mems::ProgressiveAligner::ConstructSuperIntervalFromMSA().

void PushHead buffer_list_t *  list,
buffer_t *  item
 

void PushTail buffer_list_t *  list,
buffer_t *  item
 

void QBrute record_t  a[],
int  lo,
int  hi
 

void QSort record_t  a[],
int  lo0,
int  hi0
 

void RadixSort sort_buf_t *  sortbuffer  ) 
 

void readBackboneColsFile std::istream &  bbcol_input,
std::vector< std::pair< size_t, ULA > > &  bb_list
[inline]
 

Definition at line 126 of file Backbone.h.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), and ULA.

void readBackboneSeqFile std::istream &  bbseq_input,
std::vector< bb_seqentry_t > &  backbone
[inline]
 

Definition at line 106 of file Backbone.h.

References bb_seqentry_t.

void ReadBuffer buffer_t *  buffer,
offset_t  num_recs,
iodevice_t *  dev
 

void mems::ReadList MatchList &  mlist,
std::istream &  match_stream
[inline]
 

Reads a GenericMatchList from an input stream. Sequence and SML file names are read into the seq_filename and sml_filename vectors, but the actual files are not opened.

The calling function should load them after using this method.

Parameters:
match_stream The input stream to read from

Definition at line 490 of file MatchList.h.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::Copy(), Match, MatchID_t, MatchList, mems::GenericMatchList< MatchPtrType >::seq_filename, mems::GenericMatchList< MatchPtrType >::seq_table, mems::UngappedLocalAlignment< AbstractMatchImpl >::SetLength(), uint, and uint32.

void mems::readSubstitutionMatrix std::istream &  is,
score_t  matrix[4][4]
[inline]
 

Definition at line 76 of file SubstitutionMatrix.h.

References score_t.

unsigned int ReadTimer dmtimer_t *  t  ) 
 

void RecSort record_t  a[],
int  nelems
 

template<class FromType, class ToType, class MatchListType>
void mems::RemapSubsetMatchAddresses std::map< FromType, ToType > &  old_to_new_map,
MatchListType &  match_list
 

Use this to update linkage pointers after copying an entire set of Matches.

Definition at line 459 of file MatchList.h.

int removeFile const char *  filename,
int  verbose
 

cross-platform file deletion

buffer_t* RemoveItem buffer_list_t *  list,
buffer_t *  item
 

void removeLargeGapsPP GappedAlignment &  gal,
list< GappedAlignment * > &  gal_list,
vector< bool > &  gap_iv,
const vector< size_t > &  group1,
const vector< size_t > &  group2
 

Definition at line 1013 of file ProgressiveAligner.cpp.

References mems::AbstractGappedAlignment< SparseAbstractMatch<> >::AlignmentLength(), mems::GappedAlignment::Copy(), GetAlignment(), lcb_hangover, max_gap_length, mems::SparseAbstractMatch<>::SeqCount(), and mems::GappedAlignment::Split().

Referenced by mems::ProgressiveAligner::refineAlignment().

template<class LcbVector>
uint mems::RemoveLCBandCoalesce size_t  lcbI,
uint  seq_count,
LcbVector &  adjacencies,
std::vector< double > &  scores,
std::vector< std::pair< uint, uint > > &  id_remaps,
std::vector< uint > &  impact_list
 

removes an LCB from an LCB list and coalesces surrounding LCBs.

Returns the number of LCBs removed. After LCBs are removed, the adjacency list should be processed with filterLCBs().

Parameters:
id_remaps This is populated with a list of LCB ids that were deleted or coalesced and now have a new LCB id. For each coalesced LCB, an entry of the form <old id, new id> is added; deleted LCBs have entries of the form <deleted, -1>. Entries appear in the order the operations were performed, and the function undoLcbRemoval() can undo these operations in reverse order.

Definition at line 147 of file GreedyBreakpointElimination.cpp.

References LCB_UNASSIGNED, and uint.

Referenced by greedyBreakpointElimination_v4(), mems::SimpleBreakpointScorer::operator()(), mems::GreedyRemovalScorer::remove(), mems::SimpleBreakpointScorer::remove(), and mems::EvenFasterSumOfPairsBreakpointScorer::remove().

void ReorganizeWorkingSet working_set_t *  ws,
offset_t  minrecs,
offset_t  maxrecs
 

template<class Tree>
void rerootTree Tree &  t,
node_id_t  new_root
 

place a root on the branch with endpoints root_left and root_right

Definition at line 45 of file TreeUtilities.h.

References containsNode(), findAndErase(), node_id_t, and uint.

Referenced by createAlignmentTreeRoot().

void RestructureReadSMLBins void   ) 
 

void RestructureSMLBinsForWrite void   ) 
 

void scanFit list< LabeledMem > &  pair_list,
list< LabeledMem >::iterator &  list_iter,
Match *  new_match,
uint  sort_seq
 

Definition at line 283 of file Aligner.cpp.

References Match, and uint.

Referenced by AaronsLCB().

void scanLabels set< uint > &  no_match_labels,
uint start_label,
boolean  forward
 

Definition at line 245 of file Aligner.cpp.

References uint.

Referenced by AaronsLCB().

void mems::scanLeft int &  left_recurseI,
std::vector< LCB > &  adjacencies,
int  min_weight,
int  seqI
 

Definition at line 705 of file Aligner.cpp.

void mems::scanRight int &  right_recurseI,
std::vector< LCB > &  adjacencies,
int  min_weight,
int  seqI
 

Definition at line 709 of file Aligner.cpp.

void mems::SearchLCBGaps MatchList &  new_matches,
const std::vector< std::vector< int64 > > &  iv_regions,
MaskedMemHash &  nway_mh
 

Definition at line 784 of file Aligner.cpp.

References mems::MemHash::Clear(), CreateTempFileName(), mems::FileSML::dmCreate(), EliminateOverlaps(), mems::MemHash::FindMatches(), getDefaultSeedWeight(), getSeed(), mems::GenericMatchList< MatchPtrType >::LengthFilter(), MatchList, MIN_DNA_SEED_WEIGHT, mems::GenericMatchList< MatchPtrType >::MultiplicityFilter(), mems::GenericMatchList< MatchPtrType >::seq_table, mems::GenericMatchList< MatchPtrType >::sml_table, transposeMatches(), uint, uint64, and WriteList().

Referenced by mems::ProgressiveAligner::alignProfileToProfile(), and mems::Aligner::RecursiveAnchorSearch().

void Shift64 int  amt,
int *  hi,
int *  lo
 

shifts a 64-bit value (in two 32 bit parts) either right or left.

amt negative -> left, positive -> right
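
The sign convention above can be illustrated with a small sketch. This is not the library's Shift64: it uses unsigned 32-bit halves instead of int, assumes logical (zero-filling) shifts, and assumes |amt| < 64. The name shift64_sketch is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of a 64-bit shift over two 32-bit halves.
// amt < 0 shifts left by -amt bits; amt > 0 shifts right by amt bits.
// Assumes zero-filling shifts and |amt| < 64.
inline void shift64_sketch(int amt, std::uint32_t* hi, std::uint32_t* lo) {
    assert(hi != nullptr && lo != nullptr);
    assert(amt > -64 && amt < 64);
    // Join the halves into one 64-bit value.
    std::uint64_t v = (static_cast<std::uint64_t>(*hi) << 32) | *lo;
    if (amt < 0) {
        v <<= -amt;  // negative amount: shift left
    } else {
        v >>= amt;   // positive (or zero) amount: shift right
    }
    // Split the result back into its halves.
    *hi = static_cast<std::uint32_t>(v >> 32);
    *lo = static_cast<std::uint32_t>(v & 0xFFFFFFFFu);
}
```

Shifting left by one moves the top bit of the low half into the bottom bit of the high half; shifting right by one moves it back.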

void mems::simpleFindBackbone IntervalList &  iv_list,
uint  backbone_size,
uint  max_gap_size,
std::vector< GappedAlignment > &  backbone_regions
 

Identifies stretches of alignment existing in all sequences that don't contain a gap larger than a particular size.

Such regions are considered the backbone of the alignment.

Definition at line 96 of file Islands.cpp.

References mems::gnAlignedSequences::alignedSeqsSize(), mems::GenericInterval< GappedBaseImpl >::GetAlignedSequences(), Interval, IntervalList, mems::gnAlignedSequences::sequences, and uint.
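
One possible reading of the backbone definition above, sketched on plain strings rather than on an IntervalList: a column is excluded when it falls inside a gap run longer than max_gap_size in any row, and the remaining maximal column ranges of at least backbone_size columns are reported. The function name find_backbone, the '-' gap character, and the exact threshold comparisons are assumptions, not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Returns (first column, length) pairs for candidate backbone ranges.
// Assumption: rows are equal-length aligned strings with '-' as the gap.
inline std::vector<std::pair<std::size_t, std::size_t> > find_backbone(
        const std::vector<std::string>& rows,
        std::size_t backbone_size,
        std::size_t max_gap_size) {
    std::vector<std::pair<std::size_t, std::size_t> > out;
    if (rows.empty()) return out;
    const std::size_t cols = rows[0].size();
    std::vector<bool> bad(cols, false);
    // Mark every column covered by an over-long gap run in any row.
    for (std::size_t r = 0; r < rows.size(); ++r) {
        assert(rows[r].size() == cols);
        std::size_t i = 0;
        while (i < cols) {
            if (rows[r][i] != '-') { ++i; continue; }
            std::size_t j = i;
            while (j < cols && rows[r][j] == '-') ++j;
            if (j - i > max_gap_size)
                for (std::size_t k = i; k < j; ++k) bad[k] = true;
            i = j;
        }
    }
    // Collect maximal runs of unmarked columns that are long enough.
    std::size_t start = 0;
    for (std::size_t i = 0; i <= cols; ++i) {
        if (i == cols || bad[i]) {
            if (i > start && i - start >= backbone_size)
                out.push_back(std::make_pair(start, i - start));
            start = i + 1;
        }
    }
    return out;
}
```

Short gaps (up to max_gap_size columns) stay inside a reported range, which matches the "doesn't contain a gap larger than a particular size" wording.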

void mems::simpleFindIslands IntervalList &  iv_list,
uint  island_size,
std::vector< Island > &  island_list
 

Definition at line 37 of file Islands.cpp.

References mems::gnAlignedSequences::alignedSeqsSize(), mems::GenericInterval< GappedBaseImpl >::GetAlignedSequences(), Interval, IntervalList, mems::Island::leftI, mems::Island::leftJ, mems::Island::rightI, mems::Island::rightJ, mems::Island::seqI, mems::Island::seqJ, mems::gnAlignedSequences::sequences, and uint.

void mems::simpleFindIslands IntervalList &  iv_list,
uint  island_size,
std::ostream &  island_out
 

Identifies gaps in the alignment between pairs of sequences that are longer than a given number of base pairs.

Prints islands to an output stream.

Definition at line 25 of file Islands.cpp.

References IntervalList, mems::Island::leftI, mems::Island::leftJ, mems::Island::rightI, mems::Island::rightJ, mems::Island::seqI, mems::Island::seqJ, and uint.
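
The island definition above can be sketched for a single pair of aligned rows. This is an illustration on plain strings, not the library's code: the GapRun record, the find_gap_runs name, the '-' gap character, and the strict "longer than" comparison are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a gap run in one row opposite bases in the other.
struct GapRun {
    std::size_t first_col;  // first alignment column of the run
    std::size_t length;     // number of columns in the run
    int gapped_row;         // 0 if the gap is in row a, 1 if in row b
};

// Reports gap runs strictly longer than island_size columns.
inline std::vector<GapRun> find_gap_runs(const std::string& a,
                                         const std::string& b,
                                         std::size_t island_size) {
    assert(a.size() == b.size());
    std::vector<GapRun> runs;
    std::size_t start = 0;
    int current = -1;  // -1 means "not inside a gap run"
    for (std::size_t i = 0; i <= a.size(); ++i) {
        int state = -1;
        if (i < a.size()) {
            if (a[i] == '-' && b[i] != '-') state = 0;
            else if (b[i] == '-' && a[i] != '-') state = 1;
        }
        if (state != current) {
            // A run just ended; keep it if it is long enough.
            if (current != -1 && i - start > island_size) {
                GapRun run = {start, i - start, current};
                runs.push_back(run);
            }
            start = i;
            current = state;
        }
    }
    return runs;
}
```

Applying the same scan to every pair of rows in an alignment gives the pairwise view described above.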

template<class MatchVector>
uint64 SimpleGetLCBCoverage MatchVector &  lcb  ) 
 

Definition at line 410 of file ProgressiveAligner.h.

References uint, and uint64.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

template<typename MatchVector>
void SingleCopyDistanceMatrix MatchVector &  iv_list,
std::vector< genome::gnSequence * > &  seq_table,
NumericMatrix< double > &  distance
 

Definition at line 195 of file DistanceMatrix.h.

References Matrix< T >::init(), TransformDistanceIdentity(), and uint.

int SortBuffer buffer_t *  buf  ) 
 

void SortHandleCompletions void   ) 
 

void SortingEnsureAllOperationsComplete  ) 
 

void SortingPhase void   ) 
 

void SortReading void   ) 
 

void SortSorting void   ) 
 

void SortUpdateIOState  ) 
 

void SortWriting void   ) 
 

void splitGappedAlignment const GappedAlignment &  ga,
GappedAlignment &  ga1,
GappedAlignment &  ga2,
std::vector< size_t > &  seqs1,
std::vector< size_t > &  seqs2
 

Definition at line 1002 of file ProgressiveAligner.cpp.

References GetAlignment(), mems::SparseAbstractMatch<>::SeqCount(), and mems::SparseAbstractMatch<>::SetLeftEnd().

Referenced by mems::ProgressiveAligner::refineAlignment().

dmtimer_t* StartTimer  ) 
 

void StopTimer dmtimer_t *  t  ) 
 

void mems::stripGapColumns std::vector< std::string > &  aln  ) 
 

Definition at line 870 of file MuscleInterface.cpp.

Referenced by mems::MuscleInterface::ProfileAlign(), and mems::MuscleInterface::ProfileAlignFast().

void stripGaps std::string &  str  ) 
 

Definition at line 891 of file MuscleInterface.cpp.

void mems::TransformDistanceIdentity NumericMatrix< double > &  identity  )  [inline]
 

Definition at line 276 of file DistanceMatrix.h.

References Matrix< T >::cols(), and Matrix< T >::rows().

Referenced by DistanceMatrix(), and SingleCopyDistanceMatrix().

void Translate32 uint32 *  dest,
const char *  src,
const unsigned  len
[static]
 

void translateToPairwiseGenomeHSS const hss_array_t &  hss_array,
pairwise_genome_hss_t &  hss_cols
 

Definition at line 275 of file Backbone.cpp.

References hss_array_t, hss_list_t, pairwise_genome_hss_t, and uint.

Referenced by detectAndApplyBackbone().

void transposeMatches MatchList &  mlist,
uint  seqI,
const std::vector< int64 > &  seq_regions
 

void transposeMatches MatchList &  mlist,
uint  seqI,
const vector< int64 > &  seq_regions
 

Transposes the coordinates of matches in mlist to correspond to the original set of source sequence regions described by seq_regions, splitting matches if necessary.

Definition at line 973 of file Aligner.cpp.

References mems::UngappedLocalAlignment< AbstractMatchImpl >::CropEnd(), mems::UngappedLocalAlignment< AbstractMatchImpl >::CropStart(), mems::UngappedLocalAlignment< AbstractMatchImpl >::Length(), Match, MatchList, and uint.

Referenced by SearchLCBGaps().

void unalignIslands IntervalList &  iv_list,
vector< CompactGappedAlignment<> * > &  iv_orig_ptrs,
vector< vector< ULA * > > &  ula_list
 

Definition at line 661 of file Backbone.cpp.

References collapseCollinear(), mems::CompactGappedAlignment< BaseType >::CondenseGapColumns(), mems::CompactGappedAlignment< BaseType >::Copy(), createMap(), Interval, IntervalList, mems::GenericIntervalList< MatchType >::seq_table, and uint.

Referenced by detectAndApplyBackbone().

template<class LcbVector>
void undoLcbRemoval uint  seq_count,
LcbVector &  adjs,
std::vector< std::pair< uint, uint > > &  id_remaps
 

Definition at line 273 of file GreedyBreakpointElimination.cpp.

References uint.

Referenced by mems::SimpleBreakpointScorer::operator()(), and mems::EvenFasterSumOfPairsBreakpointScorer::remove().

void UpdateDeviceIOExecuteState working_set_t *  ws,
iodevice_t *  dev
 

void UpdateIOState void   ) 
 

void UpdateWSIOFinishedState working_set_t *  ws  ) 
 

boolean mems::validateLCB MatchList &  lcb  ) 
 

Test code to ensure that an individual LCB is truly collinear.

Definition at line 36 of file Aligner.cpp.

References MatchList, and uint.

void validateRangeIntersections vector< MatchList > &  lcb_list  ) 
 

const char* VFmt const char *  fmt,
va_list  args
 

void mems::writeBackboneColumns std::ostream &  bb_out,
backbone_list_t &  bb_list
 

Writes a backbone column file.

This file type gets used by the Mauve GUI.

Definition at line 991 of file Backbone.cpp.

References backbone_list_t.

void mems::writeBackboneSeqCoordinates backbone_list_t &  bb_list,
IntervalList &  iv_list,
std::ostream &  bb_out
 

Writes a backbone sequence coordinate file.

This file type is easier to analyze with statistical packages.

Definition at line 1013 of file Backbone.cpp.

References backbone_list_t, mems::CompactGappedAlignment< BaseType >::GetColumn(), IntervalList, and uint.

void WriteBuffer buffer_t *  buffer,
offset_t  num_recs,
iodevice_t *  dev
 

void mems::WriteList const MatchList &  mlist,
std::ostream &  match_stream
[inline]
 

Writes a GenericMatchList to the designated output stream.

Parameters:
match_stream The output stream to write to

Definition at line 581 of file MatchList.h.

References Match, MatchID_t, MatchList, mems::GenericMatchList< MatchPtrType >::seq_filename, and mems::GenericMatchList< MatchPtrType >::seq_table.

Referenced by SearchLCBGaps().

void WritePermutationCoordinates IntervalList &  perm_iv_list,
std::string  out_filename
 

Definition at line 1918 of file Aligner.cpp.

References IntervalList, and mems::GenericIntervalList< MatchType >::seq_table.

Referenced by mems::Aligner::RecursiveAnchorSearch().


Variable Documentation

SearchCacheComparator mems::cache_comparator [static]
 

Definition at line 306 of file Aligner.h.

Referenced by cachecomp(), mems::ProgressiveAligner::recurseOnPairs(), and mems::Aligner::SearchWithinLCB().

char mems::charmap[128] [static]
 

Definition at line 347 of file Islands.h.

Referenced by findHssHomologyHMM(), and getCharmap().

char mems::colmap[5][5] [static]
 

Initial value:

{
        {'1','3','4','5','7'},
        {'3','2','6','4','7'},
        {'4','6','2','3','7'},
        {'5','4','3','1','7'},
        {'7','7','7','7','\0'},
}

Definition at line 370 of file Islands.h.

Referenced by findHssHomologyHMM().

int mems::dbg_count = 0 [static]
 

Definition at line 55 of file ProgressiveAligner.cpp.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

bool mems::debug_aligner = false
 

controls whether copious debugging tests and output get written to the screen

Definition at line 14 of file SuperInterval.cpp.

Referenced by mems::ProgressiveAligner::align(), mems::ProgressiveAligner::alignPP(), mems::ProgressiveAligner::alignProfileToProfile(), mems::ProgressiveAligner::createAncestralOrdering(), mems::SuperInterval::CropLeft(), mems::SuperInterval::CropRight(), mems::ProgressiveAligner::doGappedAlignment(), greedySearch(), mems::ProgressiveAligner::recursiveApplyAncestralBreakpoints(), and mems::ProgressiveAligner::translateGappedCoordinates().

bool mems::debug_cga = false [static]
 

Definition at line 124 of file CompactGappedAlignment.h.

Referenced by mems::CompactGappedAlignment< BaseType >::validate(), and mems::CompactGappedAlignment< BaseType >::validate_bitcount().

bool mems::debug_interval = false [static]
 

Definition at line 243 of file Interval.h.

Referenced by mems::GenericInterval< GappedBaseImpl >::ValidateMatches().

bool mems::debug_me = false
 

Definition at line 54 of file ProgressiveAligner.cpp.

Referenced by mems::ProgressiveAligner::alignProfileToProfile().

bool mems::debug_muscle = false
 

Definition at line 44 of file MuscleInterface.cpp.

Referenced by mems::MuscleInterface::ProfileAlign().

bool mems::debug_shite = false
 

Definition at line 31 of file Aligner.cpp.

bool mems::debugging_cltm = false
 

Definition at line 1536 of file ProgressiveAligner.cpp.

Referenced by mems::ProgressiveAligner::constructLcbTrackingMatches().

const uint32 mems::DEFAULT_ENUMERATION_TOLERANCE = 1 [static]
 

Definition at line 32 of file MemHash.h.

const score_t mems::default_gap_extend = -30 [static]
 

Definition at line 35 of file SubstitutionMatrix.h.

Referenced by getDefaultScoringScheme(), and mems::PairwiseScoringScheme::PairwiseScoringScheme().

const score_t mems::default_gap_open = -400 [static]
 

Definition at line 34 of file SubstitutionMatrix.h.

Referenced by getDefaultScoringScheme(), and mems::PairwiseScoringScheme::PairwiseScoringScheme().

const mems::score_t mems::DEFAULT_ISLAND_SCORE_THRESHOLD = 2727
 

Definition at line 34 of file Backbone.h.

const uint32 mems::DEFAULT_MEM_TABLE_SIZE = 40000 [static]
 

Definition at line 30 of file MemHash.h.

const gnSeqI mems::default_min_r_gap_size = 200
 

Definition at line 179 of file Aligner.cpp.

Referenced by mems::Aligner::Aligner().

const uint32 mems::DEFAULT_REPEAT_TOLERANCE = 0 [static]
 

Definition at line 31 of file MemHash.h.

uint8* mems::DNA_TABLE [static]
 

Definition at line 1260 of file FileSML.cpp.

int mems::failure_count = 0 [static]
 

Definition at line 516 of file MuscleInterface.cpp.

Referenced by mems::MuscleInterface::CallMuscle(), and mems::MuscleInterface::ProfileAlign().

const score_t mems::hoxd_matrix[4][4] [static]
 

Initial value:

 
{
        {91,    -114,   -31,    -123},
        {-114,  100,    -125,   -31},
        {-31,   -125,   100,    -114},
        {-123,  -31,    -114,   91},
}

Definition at line 23 of file SubstitutionMatrix.h.

Referenced by getDefaultScoringScheme(), and mems::PairwiseScoringScheme::PairwiseScoringScheme().
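
A minimal sketch of how a substitution matrix and affine gap penalties like the defaults listed on this page could score one pair of aligned rows. The numeric values are copied from hoxd_matrix, default_gap_open, and default_gap_extend. Everything else is an assumption: the A, C, G, T row and column order, the int stand-in for score_t, the convention that the first column of a gap run costs the open penalty and each further column costs the extend penalty, and the names base_index and score_pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

typedef int score_sketch_t;  // stand-in for mems::score_t (real type not shown)

// Values copied from the reference entries on this page.
static const score_sketch_t kHoxd[4][4] = {
    {  91, -114,  -31, -123},
    {-114,  100, -125,  -31},
    { -31, -125,  100, -114},
    {-123,  -31, -114,   91},
};
static const score_sketch_t kGapOpen = -400;
static const score_sketch_t kGapExtend = -30;

// Assumed index order A, C, G, T; returns -1 for any other character.
inline int base_index(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Scores two equal-length aligned rows, with '-' marking a gap.
inline score_sketch_t score_pair(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    score_sketch_t total = 0;
    int gap_state = 0;  // 0 = not in a gap, 1 = gap in a, 2 = gap in b
    for (std::size_t i = 0; i < a.size(); ++i) {
        const bool ga = (a[i] == '-');
        const bool gb = (b[i] == '-');
        if (ga && gb) continue;  // column empty in both rows: ignored
        if (ga || gb) {
            const int state = ga ? 1 : 2;
            total += (state == gap_state) ? kGapExtend : kGapOpen;
            gap_state = state;
            continue;
        }
        gap_state = 0;
        const int x = base_index(a[i]);
        const int y = base_index(b[i]);
        if (x >= 0 && y >= 0) total += kHoxd[x][y];  // other symbols add 0 here
    }
    return total;
}
```

With these values a two-column gap costs -430, more than four matching columns can recover, so the defaults strongly discourage short spurious gaps.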

const unsigned mems::INDEX_INTERVAL = 512 [static]
 

Definition at line 270 of file CompactGappedAlignment.h.

Referenced by mems::CompactGappedAlignment< BaseType >::create_bitcount(), mems::CompactGappedAlignment< BaseType >::GetColumn(), mems::CompactGappedAlignment< BaseType >::SeqPosToColumn(), and mems::CompactGappedAlignment< BaseType >::validate_bitcount().

const score_t mems::INV_SCORE = (std::numeric_limits<score_t>::max)()
 

Definition at line 398 of file GreedyBreakpointElimination.h.

Referenced by GetPairwiseAnchorScore().

const score_t mems::INVALID_SCORE = (std::numeric_limits<score_t>::max)() [static]
 

Definition at line 21 of file Scoring.h.

Referenced by computeConsensusScore(), computeGapScores(), computeMatchScores(), and computeSPScore().

size_t mems::lcb_hangover = 300
 

Definition at line 63 of file ProgressiveAligner.cpp.

Referenced by removeLargeGapsPP().

const uint mems::LCB_UNASSIGNED = (std::numeric_limits<uint>::max)()
 

indicates an LCB identifier hasn't been assigned or is unknown

Definition at line 72 of file GreedyBreakpointElimination.h.

Referenced by initTrackingMatchLCBTracking(), mems::EvenFasterSumOfPairsBreakpointScorer::remove(), RemoveLCBandCoalesce(), and mems::EvenFasterSumOfPairsBreakpointScorer::validate().
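
The sentinel pattern behind LCB_UNASSIGNED can be sketched as follows. The parentheses around std::numeric_limits<...>::max keep a function-like max() macro (such as the one some platform headers define) from expanding at that point. The flat per-match table and the names uint_sketch, make_lcb_ids, assign_lcb, and count_unassigned are hypothetical and much simpler than the three-dimensional array used by initTrackingMatchLCBTracking().

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

typedef unsigned int uint_sketch;  // stand-in for the library's uint typedef

// Largest representable value, reserved to mean "no LCB assigned".
const uint_sketch kLcbUnassigned = (std::numeric_limits<uint_sketch>::max)();

// Builds a per-match table of LCB ids, all initially unassigned.
inline std::vector<uint_sketch> make_lcb_ids(std::size_t match_count) {
    return std::vector<uint_sketch>(match_count, kLcbUnassigned);
}

// Records that a match belongs to an LCB; the sentinel is not a valid id.
inline void assign_lcb(std::vector<uint_sketch>& ids, std::size_t match,
                       uint_sketch lcb) {
    assert(match < ids.size());
    assert(lcb != kLcbUnassigned);
    ids[match] = lcb;
}

// Counts matches that have not been assigned to any LCB yet.
inline std::size_t count_unassigned(const std::vector<uint_sketch>& ids) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < ids.size(); ++i)
        if (ids[i] == kLcbUnassigned) ++n;
    return n;
}
```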

const int mems::LEFT_NEIGHBOR = -1
 

Definition at line 75 of file Backbone.cpp.

int mems::mask_length = 31 [static]
 

Definition at line 1265 of file FileSML.cpp.

int mems::mask_weight = 31 [static]
 

Definition at line 1266 of file FileSML.cpp.

double mems::max_density = .9
 

Definition at line 61 of file ProgressiveAligner.cpp.

Referenced by IsDenseEnough().

size_t mems::max_gap_length = 3000
 

Definition at line 62 of file ProgressiveAligner.cpp.

Referenced by removeLargeGapsPP().

double mems::max_window_size = 20000
 

Definition at line 59 of file ProgressiveAligner.cpp.

Referenced by IsDenseEnough(), mems::ProgressiveAligner::ProgressiveAligner(), mems::ProgressiveAligner::refineAlignment(), and mems::ProgressiveAligner::SetMaxGappedAlignmentLength().

const int mems::MEGA_ALIGN_COLUMNS = 60
 

Definition at line 41 of file gnAlignedSequences.h.

Referenced by mems::gnAlignedSequences::outputMega().

const uint mems::MIN_ANCHOR_LENGTH = 9 [static]
 

Definition at line 270 of file Aligner.h.

double mems::min_density = .5
 

Definition at line 60 of file ProgressiveAligner.cpp.

Referenced by IsDenseEnough().

double mems::min_window_size = 200
 

Definition at line 58 of file ProgressiveAligner.cpp.

Referenced by IsDenseEnough().

const uint mems::NO_ADJACENCY = (std::numeric_limits<uint>::max)()
 

Definition at line 640 of file Aligner.cpp.

Referenced by computeLCBAdjacencies_v2(), and mems::Aligner::RecursiveAnchorSearch().

const gnSeqI mems::NO_MATCH = 0 [static]
 

Definition at line 27 of file AbstractMatch.h.

Referenced by mems::AbstractMatch::End(), and m_free().

int mems::NumBins
 

Definition at line 1039 of file FileSML.cpp.

bool mems::penalize_repeats = false
 

Definition at line 37 of file GreedyBreakpointElimination.cpp.

Referenced by GetPairwiseAnchorScore().

const double mems::POOL_GROWTH_RATE = 1.6
 

When more space is needed to store a datatype, the memory pool will grow by this factor.

Definition at line 27 of file SlotAllocator.h.

Referenced by mems::SlotAllocator< T >::Allocate().
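
As a rough illustration of what growing by a constant factor means, the sketch below computes successive pool capacities with the documented value 1.6. It is not the code of `SlotAllocator< T >::Allocate()`: `kPoolGrowthRate`, `nextCapacity`, and `growthSteps` are hypothetical names, and the rule of always adding at least one slot is an assumption made so that very small pools still grow.

```cpp
#include <cstddef>

// Value documented for mems::POOL_GROWTH_RATE.
const double kPoolGrowthRate = 1.6;

// Hypothetical helper: capacity after one geometric growth step.
// Falls back to current + 1 when truncation would leave the size unchanged.
std::size_t nextCapacity(std::size_t current) {
    std::size_t grown = static_cast<std::size_t>(current * kPoolGrowthRate);
    return grown > current ? grown : current + 1;
}

// Hypothetical helper: number of growth steps needed to go from
// `start` slots to at least `needed` slots.
std::size_t growthSteps(std::size_t start, std::size_t needed) {
    std::size_t steps = 0;
    for (std::size_t cap = start; cap < needed; cap = nextCapacity(cap)) {
        ++steps;
    }
    return steps;
}
```

Because each step multiplies the capacity, the number of reallocations grows only logarithmically with the final pool size, at the cost of some unused slots after the last step.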

boolean mems::print_sp = false
 

Definition at line 169 of file MatchFinder.cpp.

Referenced by mems::MatchFinder::SearchRange().

const unsigned int mems::PROGRESS_GRANULARITY = 100
 

Definition at line 37 of file MatchFinder.h.

Referenced by mems::MatchFinder::SearchRange().

bool mems::progress_msgs = false
 

Definition at line 52 of file ProgressiveAligner.cpp.

Referenced by mems::ProgressiveAligner::getAlignedChildren(), and mems::ProgressiveAligner::recursiveApplyAncestralBreakpoints().

const int mems::RIGHT_NEIGHBOR = 1
 

Definition at line 76 of file Backbone.cpp.

Referenced by getNeighbor().

mask_t mems::seed_mask = 0x7FFFFFFF [static]
 

Definition at line 1264 of file FileSML.cpp.

uint mems::seq_compare_start [static]
 

Definition at line 28 of file Match.h.


Generated on Fri Mar 14 06:01:38 2008 for libMems by doxygen 1.3.6