

class vamp.utils.ContigComposition

Association of two genomic intervals, used to represent the composition of one interval by another

vamp.utils.find_deletions(contig_composition_list, verbose=False)

Find contigs with deletions and return a tuple containing the list of indices of the contigs to be replaced along with the replacement, e.g.:

([2, 3],
 {'seq': 'chr', 'start': 3, 'end': 5, 'contig': 'contig1',
  'contig_start': 5, 'contig_end': 10, 'strand': '+', 'contig_size': 20})

replaces the entries at indices 2 and 3 of contig_composition_list with the new contig composition specified.

vamp.utils.get_sequence_length_from_maf(maf_file, reference_species, sequence_id)

Return length of the reference_species.sequence_id
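As a rough illustration of where such a length can come from: in the MAF format, each "s" line carries the source name, start, aligned size, strand, total source size, and the aligned text. The sketch below reads MAF text from a string and assumes the source is named reference_species.sequence_id. The function name is hypothetical and this is not the vamp.utils implementation.

```python
def sequence_length_from_maf_text_sketch(maf_text, reference_species, sequence_id):
    """Return the source size recorded for species.sequence_id, or None if absent."""
    wanted = f"{reference_species}.{sequence_id}"
    for line in maf_text.splitlines():
        fields = line.split()
        # MAF sequence line: s src start size strand srcSize text
        if len(fields) == 7 and fields[0] == "s" and fields[1] == wanted:
            return int(fields[5])
    return None


example_maf = """##maf version=1
a score=100
s hg19.chr1 10 5 + 1000 ACGTA
s mm9.chr4 20 5 + 2000 ACGTT
"""
print(sequence_length_from_maf_text_sketch(example_maf, "hg19", "chr1"))
```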

vamp.utils.get_sequence_net_alignment(maf_filename, reference_species, sequence_id, species, verbose=False)

Return the alignment created by stitching MAF blocks along the entire sequence (including gaps). Also returns a list of intervals, relative to the alignment, that indicate the MAF block, block start, and block end of the source of each piece of the alignment.


Read contig composition summary and return attributes

vamp.utils.replace_alignment_with_block(alignment, block, reference_species, sequence_id, verbose=False)

Return updated alignment, interval

vamp.utils.subtract_intervals(interval1, interval2)

Subtract two intervals, return list of resulting intervals
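The idea of interval subtraction can be sketched as follows. This is a minimal illustration that assumes intervals are (start, end) tuples with half-open coordinates; the interval type and coordinate convention actually used by vamp.utils.subtract_intervals are not stated here, and the function name below is hypothetical.

```python
def subtract_interval_sketch(a, b):
    """Return the parts of half-open interval a = (start, end) not covered by b."""
    a_start, a_end = a
    b_start, b_end = b
    # No overlap: a is returned unchanged.
    if b_end <= a_start or b_start >= a_end:
        return [(a_start, a_end)]
    pieces = []
    # Part of a to the left of b, if any.
    if b_start > a_start:
        pieces.append((a_start, b_start))
    # Part of a to the right of b, if any.
    if b_end < a_end:
        pieces.append((b_end, a_end))
    return pieces


print(subtract_interval_sketch((0, 10), (3, 5)))
```

Depending on how the two intervals overlap, the result holds zero, one, or two intervals, which is why a list is returned.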

vamp.utils.summarize_contig_composition(interval_list, src_tag, start_tag, end_tag, strand_tag, source_size_tag)

Summarize the contig composition in a list of tuples (seq, start, end, contig, contig_start, contig_end, strand, contig_size)

vamp.utils.update_contig_composition_summary(contig_composition_summary, replacements)

Update a list of ContigComposition objects with replacements. Replacements are a list of tuples, each containing a list of indices of contigs to be replaced along with the replacement. The replacements must be non-overlapping and sorted.
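A minimal sketch of index-based replacement, using plain strings in place of ContigComposition objects. It assumes each index list is contiguous and applies the replacements from last to first so that earlier indices remain valid. The function name is hypothetical and the library's own behaviour may differ.

```python
def apply_index_replacements_sketch(items, replacements):
    """Replace runs of list entries, given sorted, non-overlapping (indices, new_item) pairs."""
    result = list(items)
    # Work backwards so that earlier indices are not shifted by later edits.
    for indices, new_item in reversed(replacements):
        first, last = indices[0], indices[-1]
        result[first:last + 1] = [new_item]
    return result


contigs = ["c0", "c1", "c2", "c3", "c4"]
print(apply_index_replacements_sketch(contigs, [([2, 3], "merged")]))
```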

vamp.utils.update_sequence_with_replacements(seq, replacements, replacement_seq_dict)

Update a Seq object with replacements. Replacements must be non-overlapping and sorted.
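To show why the replacements need to be sorted and non-overlapping, here is a sketch that works on a plain string and on hypothetical (start, end, new_text) tuples with 0-based, half-open coordinates. The real function takes a Seq object plus a replacement_seq_dict whose structure is not documented here, so treat this only as an illustration of the idea.

```python
def apply_replacements_sketch(seq, replacements):
    """Rebuild seq left to right, swapping each [start, end) span for new_text."""
    pieces = []
    cursor = 0
    for start, end, new_text in replacements:
        if start < cursor:
            raise ValueError("replacements must be sorted and non-overlapping")
        pieces.append(seq[cursor:start])  # untouched sequence before the span
        pieces.append(new_text)           # replacement for the span
        cursor = end
    pieces.append(seq[cursor:])           # untouched tail
    return "".join(pieces)


print(apply_replacements_sketch("ACGTACGT", [(1, 3, "TT"), (5, 6, "")]))
```

A single left-to-right pass is only possible because each span starts at or after the end of the previous one.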


Convert coordinates from GFF or BED file using multi-fasta alignments


Extract fasta sequences from regions defined in GFF/BED file and output fasta to stdout



TODO summarize_alignments multi_align_fasta reference_sequence [-h,--help] [-v,--verbose] [--version]


TODO This describes how to use this script. This docstring will be printed by the script if there is an error or if the user requests help (-h or --help).


TODO: Show some examples of how to use this script.


TODO: List exit codes


TODO: lparsons <>



seq_utils.summarize_alignments.parse_event(event, reference_sequence, alternate_sequence)

Parse a simple event with reference_position, reference_base, and new_base, determine its type, and add padding if necessary (for VCF compatibility).
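The padding mentioned above reflects the VCF convention that insertions and deletions include the base preceding the event in both REF and ALT. The sketch below makes several assumptions that are not documented here: positions are 1-based, a gap is written as "-", and for an insertion the position is that of the reference base immediately before the inserted base. The function name and return shape are hypothetical.

```python
def pad_event_sketch(position, reference_base, new_base, reference_sequence, gap="-"):
    """Classify an event and return (type, vcf_pos, vcf_ref, vcf_alt)."""
    if reference_base != gap and new_base != gap:
        # Substitutions need no padding.
        return ("substitution", position, reference_base, new_base)
    if reference_base == gap:
        # Insertion: anchor on the reference base at `position` (assumption).
        anchor = reference_sequence[position - 1]
        return ("insertion", position, anchor, anchor + new_base)
    # Deletion: anchor on the base before the deleted one.
    if position < 2:
        raise ValueError("this sketch does not handle a deletion at position 1")
    anchor = reference_sequence[position - 2]
    return ("deletion", position - 1, anchor + reference_base, anchor)


print(pad_event_sketch(3, "G", "-", "ACGT"))
```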

seq_utils.summarize_alignments.summary_of_alignment(alignment, reference_sequence_id)

Summarize the changes in the given alignment (pairwise only).

Input:
alignment = Bio.AlignIO object
reference_index = index of the reference sequence in alignment (default is 1)

Output: dictionary with a key for each non-reference sequence in the alignment.
Each key maps to a dictionary with keys (match_count, mismatch_count, mismatches, contiguous_change_count).
mismatches is a list of mismatches by base: ‘RefBase(RefPos)NewBase’
contiguous_change_count is the number of contiguous change “events”
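The shape of the summary for one non-reference sequence can be sketched with two plain, equal-length strings instead of a Bio.AlignIO alignment. This sketch assumes 1-based column positions and treats each run of consecutive mismatching columns as one contiguous change event; how the library counts positions and handles gaps is not stated here. The function name is hypothetical.

```python
def summarize_pair_sketch(reference, other):
    """Count matches, mismatches, and runs of consecutive mismatches."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    match_count = 0
    mismatches = []
    contiguous_change_count = 0
    in_change = False
    for pos, (ref_base, new_base) in enumerate(zip(reference, other), start=1):
        if ref_base == new_base:
            match_count += 1
            in_change = False
        else:
            mismatches.append(f"{ref_base}({pos}){new_base}")
            if not in_change:
                contiguous_change_count += 1  # a new run of changes starts here
            in_change = True
    return {
        "match_count": match_count,
        "mismatch_count": len(mismatches),
        "mismatches": mismatches,
        "contiguous_change_count": contiguous_change_count,
    }


print(summarize_pair_sketch("ACGTACGT", "ACCCACGA"))
```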


Utility classes and methods for working with sequence data

class seq_utils.utils.GenomicRegion(region_string)

A genomic region specified by chr:start-end, using 1-based coordinates
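Parsing a region string of this form can be sketched as below. The helper name is hypothetical, and the conversion at the end assumes the 1-based end coordinate is inclusive, which is not stated here. GenomicRegion itself may store or expose the fields differently.

```python
import re


def parse_region_sketch(region_string):
    """Split 'chr:start-end' into (chrom, start, end) with integer coordinates."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", region_string)
    if match is None:
        raise ValueError(f"not a chr:start-end region: {region_string!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


chrom, start, end = parse_region_sketch("chr1:100-200")
# Assuming an inclusive end, the equivalent 0-based half-open interval is:
zero_based = (start - 1, end)
print(chrom, start, end, zero_based)
```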

seq_utils.utils.convert_interval_gapped_to_nongapped(seq, start, end)

Take a position with gaps and return the position without gaps. Uses 0-based positions.

seq_utils.utils.convert_interval_nongapped_to_gapped(seq, start, end, include_end_gaps=False)

Take a position without gaps and return the position with gaps. Uses 0-based positions.
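The two conversions can be illustrated with a plain string. This sketch assumes "-" is the gap character and that intervals are half-open as well as 0-based; it ignores the include_end_gaps option, whose exact behaviour is not described here. Both function names are hypothetical.

```python
def gapped_to_nongapped_sketch(seq, start, end, gap="-"):
    """Map a half-open interval in gapped coordinates to ungapped coordinates."""
    # The ungapped coordinate is the number of non-gap characters before it.
    nongap_start = start - seq[:start].count(gap)
    nongap_end = end - seq[:end].count(gap)
    return nongap_start, nongap_end


def nongapped_to_gapped_sketch(seq, start, end, gap="-"):
    """Map a non-empty half-open ungapped interval back to gapped coordinates."""
    # Gapped column index of every non-gap character, in order.
    columns = [i for i, char in enumerate(seq) if char != gap]
    return columns[start], columns[end - 1] + 1


aligned = "AC--GT-A"
print(gapped_to_nongapped_sketch(aligned, 2, 6))
print(nongapped_to_gapped_sketch(aligned, 2, 4))
```

In this example the gapped interval covers "--GT" and the ungapped one covers "GT", so mapping back gives the tightest gapped interval around the same bases rather than the original one.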
