NetMSA
NetMSA.NetMSA
NetMSA.Particle
NetMSA.Position
NetMSA.aligned
NetMSA.createPeerMatrix
NetMSA.createswarm
NetMSA.flydown
NetMSA.full
NetMSA.getposition
NetMSA.matrixalignment
NetMSA.mostfrequent
NetMSA.objective
NetMSA.rowalignment
NetMSA.stopcriteria
NetMSA.weight
NetMSA.NetMSA
— Module
This module provides an implementation of the NetMSA algorithm in Julia, which can be used for multiple sequence alignment.
NetMSA.createPeerMatrix
— Method
createPeerMatrix(inputStrings::Vector{String})::Matrix{Union{Missing,Char}}
Create and return a Peer matrix of characters in which each input sequence in inputStrings is represented as a column. Sequences shorter than the longest one leave missing values at the bottom of their column, represented in the matrix by the missing keyword.
Examples
julia> NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
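The layout can be illustrated with a few lines of Julia. This is a minimal sketch rather than the package's source: peermatrix_sketch is a hypothetical name, and padding every column with missing up to the length of the longest sequence is inferred from the example above.

```julia
# Hypothetical re-implementation for illustration; not NetMSA's own code.
function peermatrix_sketch(seqs::Vector{String})
    nrows = maximum(length, seqs)   # the longest sequence sets the row count
    # Start from an all-missing matrix, one column per input sequence.
    M = Matrix{Union{Missing,Char}}(missing, nrows, length(seqs))
    for (col, s) in enumerate(seqs)
        for (row, ch) in enumerate(s)   # enumerate counts characters, not bytes
            M[row, col] = ch
        end
    end
    return M
end

M = peermatrix_sketch(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
```

Starting from an all-missing matrix keeps the padding implicit: only the cells that actually hold a character are written.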
NetMSA.matrixalignment
— Method
matrixalignment(M)
Align the matrix using the NetMSA algorithm.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.matrixalignment(M)
9×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' '-' 'b' 'b'
'c' 'c' 'c' 'c'
'b' 'b' '-' 'b'
'c' 'c' '-' 'c'
'd' 'f' 'h' 'j'
'e' 'g' 'i' 'k'
'm' '-' 'm' 'm'
'-' '-' 'n' '-'
NetMSA.Particle
— Type
A particle that is used for creating swarms.
Fields
- value::Char : Value of the particle, e.g. 'b' or 'c'
- updated::Int64 : Number of turns since the particle was last updated
- pos::Position : The original position of the particle
- best::Position : The best local position of the particle
- bestvalue::Float64 : Best local score
NetMSA.Position
— Type
Store the position of a given particle. The position $x_{s_{i}}(r)$ of the particle $p_{s_{i}}$ is defined by the row $r$ that contains the symbol $s_i$, together with the indexes of the columns in row $r$ that contain $s_i$.
NetMSA.aligned
— Method
aligned(row)::Bool
Return whether a row is aligned or not.
A row is aligned if it only contains different occurrences of the same symbol.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.aligned(M[1, :])
true
julia> NetMSA.aligned(M[2, :])
false
NetMSA.createswarm
— Method
createswarm(rowindex::Int64, M)
Create a swarm containing unique Particles in the current row.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.createswarm(2, M)
2-element Array{NetMSA.Particle,1}:
NetMSA.Particle('c', 0, NetMSA.Position(2, [2]), NetMSA.Position(2, [2]), 0.0)
NetMSA.Particle('b', 0, NetMSA.Position(2, [1, 3, 4]), NetMSA.Position(2, [1, 3, 4]), 0.0)
NetMSA.flydown
— Method
flydown(p, M; stride=1)
Fly down the given particle by stride.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> p = NetMSA.Particle('b', NetMSA.getposition('b', 2, M))
NetMSA.Particle('b', 0, Main.NetMSA.Position(2, [1, 3, 4]), Main.NetMSA.Position(2, [1, 3, 4]), 0.0)
julia> NetMSA.flydown(p, M)
9×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'-' 'c' '-' '-'
'b' 'b' 'b' 'b'
'c' 'c' 'c' 'c'
'b' 'f' 'h' 'b'
'c' 'g' 'i' 'c'
'd' missing 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.flydown(p, M; stride=3)
11×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'-' 'c' '-' '-'
'-' 'b' '-' '-'
'-' 'c' '-' '-'
'b' 'f' 'b' 'b'
'c' 'g' 'c' 'c'
'b' missing 'h' 'b'
'c' missing 'i' 'c'
'd' missing 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
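The two calls above suggest that flying down inserts stride gaps ('-') at the particle's row in every column the particle occupies, shifts the rest of those columns down, and pads the remaining columns with missing. The following is a minimal sketch of that reading, not the package's implementation: flydown_sketch and its (M, rowindex, cols) signature are hypothetical, and it takes the row and column indexes directly instead of a Particle.

```julia
# Hypothetical sketch of the "fly down" move; not NetMSA's own code.
function flydown_sketch(M, rowindex, cols; stride=1)
    nrows, ncols = size(M)
    # The result is `stride` rows taller; untouched cells stay `missing`.
    out = Matrix{Union{Missing,Char}}(missing, nrows + stride, ncols)
    for c in 1:ncols
        if c in cols
            out[1:rowindex-1, c] = M[1:rowindex-1, c]         # rows above the particle
            out[rowindex:rowindex+stride-1, c] .= '-'         # inserted gaps
            out[rowindex+stride:end, c] = M[rowindex:end, c]  # particle and rows below
        else
            out[1:nrows, c] = M[:, c]                         # column is left in place
        end
    end
    return out
end

M = Union{Missing,Char}[
    'a' 'a'     'a'     'a'
    'b' 'c'     'b'     'b'
    'c' 'b'     'c'     'c'
    'b' 'c'     'h'     'b'
    'c' 'f'     'i'     'c'
    'd' 'g'     'm'     'j'
    'e' missing 'n'     'k'
    'm' missing missing 'm'
]

down1 = flydown_sketch(M, 2, [1, 3, 4])            # 9×4, as in the first call above
down3 = flydown_sketch(M, 2, [1, 3, 4]; stride=3)  # 11×4, as in the second call
```

Like the calls shown above, the sketch returns a new matrix and leaves M unchanged, which is why the stride=3 result starts again from the original 8×4 matrix.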
NetMSA.full
— Method
full(row)::Bool
Return whether a row is full or not.
An aligned row r is called full if no gaps ('-') are added in the row r, that is, if the number of occurrences of the symbol in the row equals the number of columns in the matrix.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.full(M[1, :])
true
julia> NetMSA.full(M[2, :])
false
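The relation between aligned and full rows can be sketched as follows. This is an illustrative reading of the two definitions, not the package's code: the names isgap, aligned_sketch and full_sketch are hypothetical, and treating missing cells the same way as gaps is an assumption.

```julia
# Assumption: a missing cell is treated like a gap character.
isgap(x) = ismissing(x) || x == '-'

# A row is aligned when all of its non-gap cells hold the same symbol.
function aligned_sketch(row)
    symbols = unique(x for x in row if !isgap(x))
    return length(symbols) == 1
end

# A row is full when it is aligned and contains no gaps at all.
full_sketch(row) = aligned_sketch(row) && !any(isgap, row)

row1   = Union{Missing,Char}['a', 'a', 'a', 'a']
row2   = Union{Missing,Char}['b', 'c', 'b', 'b']
gapped = Union{Missing,Char}['b', '-', 'b', 'b']
```

Under these assumptions row1 is both aligned and full, row2 is neither, and gapped is aligned but not full.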
NetMSA.getposition
— Method
getposition(value, rowindex, matrix)
Return the Position (rowindex, [colindex1, colindex2, ...]) of the Particle represented by value, at the rowindex in the matrix.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.getposition('b', 2, M)
NetMSA.Position(2, [1, 3, 4])
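The lookup amounts to collecting the column indexes of a symbol within one row. A minimal sketch, assuming a stand-in type: PositionSketch, its field names and getposition_sketch are hypothetical and only mirror the printed form Position(2, [1, 3, 4]).

```julia
# Hypothetical stand-in for NetMSA.Position, used only in this sketch.
struct PositionSketch
    row::Int
    indexes::Vector{Int}
end

# Collect the columns of row `rowindex` whose cell is exactly `value`.
# Using `===` means a missing cell simply compares as "not equal".
getposition_sketch(value, rowindex, M) =
    PositionSketch(rowindex, findall(x -> x === value, M[rowindex, :]))

M = Union{Missing,Char}[
    'a' 'a'     'a'     'a'
    'b' 'c'     'b'     'b'
    'c' 'b'     'c'     'c'
    'b' 'c'     'h'     'b'
    'c' 'f'     'i'     'c'
    'd' 'g'     'm'     'j'
    'e' missing 'n'     'k'
    'm' missing missing 'm'
]

pos = getposition_sketch('b', 2, M)
```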
NetMSA.mostfrequent
— Method
mostfrequent(row)
Return a tuple (frequency, element) for the most frequent element occurring in the row.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.mostfrequent(M[2, :])
(3, 'b')
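The counting step can be sketched with a dictionary of tallies. This is an illustration, not the package's code: mostfrequent_sketch is a hypothetical name, and skipping missing cells and gap characters, as well as breaking ties in favour of the symbol that reaches the top count first, are assumptions.

```julia
# Hypothetical sketch; returns (frequency, element) like the example above.
function mostfrequent_sketch(row)
    counts = Dict{Char,Int}()
    best, bestcount = missing, 0
    for x in row
        # Assumption: missing cells and gaps are not counted as symbols.
        (ismissing(x) || x == '-') && continue
        counts[x] = get(counts, x, 0) + 1
        if counts[x] > bestcount        # strict ">" keeps the first symbol on ties
            best, bestcount = x, counts[x]
        end
    end
    return (bestcount, best)
end

result = mostfrequent_sketch(Union{Missing,Char}['b', 'c', 'b', 'b'])
```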
NetMSA.objective
— Method
objective(M, rowindex; endindex=0)
Return the objective score of the row. The score is computed from the following quantities: $A(r)$ is the number of aligned rows in M from $r$ to the last row, $C(r)$ is the maximum number of matched characters in the current row, $Gaps(r)$ is the number of gaps added to the matrix M from row $r$ to the last row, and $w(r)$ is the weight of the row $r$.
endindex is used to reduce the search area for gaps; if it is not provided, it defaults to size(M)[1].
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.objective(M, 2)
2.625
NetMSA.rowalignment
— Method
rowalignment(rowindex, M)
Return the Particle whose best position aligns the given row in the matrix, that is, the position that maximizes the objective score.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> p = NetMSA.Particle('b', NetMSA.getposition('b', 2, M))
NetMSA.Particle('b', 0, Main.NetMSA.Position(2, [1, 3, 4]), Main.NetMSA.Position(2, [1, 3, 4]), 0.0)
julia> NetMSA.rowalignment(2, M)
NetMSA.Particle('c', 0, Main.NetMSA.Position(2, [2]), Main.NetMSA.Position(3, [1]), 9.0)
NetMSA.stopcriteria
— Method
stopcriteria(p::Particle, newindex, M; threshold::Int=5, debug=false)
Check whether a stopping criterion has been met. Two stopping criteria are checked in this function:
- Criterion 2: The particle has not updated its best score in the last threshold turns.
- Criterion 3: The particle moves to a new row that already contains the same symbol as the particle.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> p = NetMSA.Particle('b', NetMSA.getposition('b', 2, M))
NetMSA.Particle('b', 0, Main.NetMSA.Position(2, [1, 3, 4]), Main.NetMSA.Position(2, [1, 3, 4]), 0.0)
julia> NetMSA.stopcriteria(p, 3, M; debug=true)
"Terminating because of criteria 3"
true
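The two checks can be sketched on plain values. This is a rough illustration under stated assumptions, not the package's logic: stopcriteria_sketch and its arguments are hypothetical, the comparison updated >= threshold is a guess at how "the last threshold turns" is tested, and the caller is assumed to pass the cells of the new row that lie outside the particle's own columns.

```julia
# Hypothetical sketch of the two stopping checks described above.
function stopcriteria_sketch(value::Char, updated::Int, newrow; threshold::Int=5)
    # Criterion 2 (assumed comparison): no best-score update for `threshold` turns.
    updated >= threshold && return true
    # Criterion 3: the destination row already holds the particle's symbol.
    return any(x -> x === value, newrow)
end

row3 = Union{Missing,Char}['c', 'b', 'c', 'c']   # row 3 of M in the example
row5 = Union{Missing,Char}['c', 'f', 'i', 'c']   # row 5 of M in the example

hit   = stopcriteria_sketch('b', 0, row3)   # 'b' is already present in row 3
miss  = stopcriteria_sketch('b', 0, row5)   # no 'b' in row 5, counter below threshold
stale = stopcriteria_sketch('b', 5, row5)   # counter has reached the threshold
```

The first call mirrors the documented example, where moving the 'b' particle to row 3 terminates because of criterion 3.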
NetMSA.weight
— Method
weight(row; w1=0.25, w2=0.5, w3=1.0)
Return the weight of the row, computed from the weights w1, w2 and w3 and the following quantities: $n_s$ is the number of occurrences of the symbol $s$ in the aligned row $r$, and $c$ is the total number of columns in the row. The value of $x$ is zero if every symbol in the row $r$ occurs at most once; otherwise $x$ is the maximum number of occurrences (matches) of any symbol in $r$.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.weight(M[1, :])
1.0
julia> NetMSA.weight(M[2, :])
0.1875
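A piecewise form consistent with the quantities defined above and with both examples can be written as follows. It is a reconstruction and should be read as an assumption, not as the package's definition; in particular the middle case (aligned but not full) is not exercised by either example.

```latex
w(r) =
\begin{cases}
w_1 \cdot \dfrac{x}{c}   & \text{if } r \text{ is not aligned,} \\[6pt]
w_2 \cdot \dfrac{n_s}{c} & \text{if } r \text{ is aligned but not full,} \\[6pt]
w_3                      & \text{if } r \text{ is full.}
\end{cases}
```

With the default weights, the full first row gives $w_3 = 1.0$, and the second row, which is not aligned and has $x = 3$ and $c = 4$, gives $0.25 \cdot 3/4 = 0.1875$, matching the two outputs shown.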