NetMSA
- NetMSA.NetMSA
- NetMSA.Particle
- NetMSA.Position
- NetMSA.aligned
- NetMSA.createPeerMatrix
- NetMSA.createswarm
- NetMSA.flydown
- NetMSA.full
- NetMSA.getposition
- NetMSA.matrixalignment
- NetMSA.mostfrequent
- NetMSA.objective
- NetMSA.rowalignment
- NetMSA.stopcriteria
- NetMSA.weight
NetMSA.NetMSA — Module
This module provides an implementation of the NetMSA algorithm in Julia, which can be used for multiple sequence alignment.
NetMSA.createPeerMatrix — Method
createPeerMatrix(inputStrings::Vector{String})::Matrix{Union{Missing,Char}}
Create and return a peer matrix of characters in which each input sequence in inputStrings is represented as a column. Sequences shorter than the longest one are padded at the bottom, and the padding cells hold the value missing.
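The construction described above can be sketched in a few lines of plain Julia. This is a minimal illustration, not the package's implementation; the name `peermatrix_sketch` is hypothetical.

```julia
# Hypothetical helper: build a peer matrix with one column per sequence.
# Cells below the end of a shorter sequence stay `missing`.
function peermatrix_sketch(seqs::Vector{String})
    nrows = maximum(length, seqs)
    M = Matrix{Union{Missing,Char}}(missing, nrows, length(seqs))
    for (col, s) in enumerate(seqs), (row, ch) in enumerate(s)
        M[row, col] = ch   # character `row` of sequence `col`
    end
    return M
end

M = peermatrix_sketch(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
```

Here `M` has one row per character position of the longest sequence and one column per input string.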
Examples
julia> NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
NetMSA.matrixalignment — Method
matrixalignment(M)
Align the matrix using the NetMSA algorithm.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.matrixalignment(M)
9×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' '-' 'b' 'b'
'c' 'c' 'c' 'c'
'b' 'b' '-' 'b'
'c' 'c' '-' 'c'
'd' 'f' 'h' 'j'
'e' 'g' 'i' 'k'
'm' '-' 'm' 'm'
'-' '-' 'n' '-'
NetMSA.Particle — Type
A particle that is used for creating swarms.
Fields
- value::Char : Value of the particle, e.g. 'b' or 'c'
- updated::Int64 : Number of turns since the particle was last updated
- pos::Position : The original position of the particle
- best::Position : The best local position of the particle
- bestvalue::Float64 : Best local score
NetMSA.Position — Type
Store the position of a given particle. The position $x_{s_{i}}(r)$ of the particle $p_{s_{i}}$ is defined by the row $r$ that contains the symbol $s_i$, together with the indexes of the columns in which $s_i$ occurs in that row.
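As a rough illustration of this definition, a position can be modelled as a row index plus a list of column indexes. The struct and function names below are hypothetical stand-ins, and the field names are assumptions rather than the package's actual layout.

```julia
# Hypothetical stand-in for NetMSA.Position; field names are assumptions.
struct PositionSketch
    row::Int
    indexes::Vector{Int}
end

# Collect the columns of row `row` in `M` that hold `value`.
function getposition_sketch(value::Char, row::Int, M::AbstractMatrix)
    return PositionSketch(row, findall(isequal(value), M[row, :]))
end

M = ['a' 'a' 'a' 'a'; 'b' 'c' 'b' 'b']
p = getposition_sketch('b', 2, M)
```

`isequal` is used instead of `==` so that `missing` cells compare as plain `false`.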
NetMSA.aligned — Method
aligned(row)::Bool
Return whether a row is aligned or not.
A row is aligned if it only contains different occurrences of the same symbol.
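The predicate can be sketched as follows. This is a guess at the behaviour, not the package's code: it assumes that gap characters ('-') and missing cells are ignored when comparing symbols, and the name `aligned_sketch` is hypothetical.

```julia
# Hypothetical sketch: a row counts as aligned when every real symbol in it
# is the same. Assumption: gaps ('-') and `missing` cells are skipped.
function aligned_sketch(row)
    symbols = [x for x in row if !ismissing(x) && x != '-']
    return !isempty(symbols) && all(==(first(symbols)), symbols)
end

aligned_sketch(['a', 'a', 'a', 'a'])   # every symbol is 'a'
aligned_sketch(['b', 'c', 'b', 'b'])   # 'b' and 'c' are mixed
```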
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.aligned(M[1, :])
true
julia> NetMSA.aligned(M[2, :])
false
NetMSA.createswarm — Method
createswarm(rowindex::Int64, M)
Create a swarm containing one Particle for each unique symbol in the given row.
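A swarm of this kind can be approximated by grouping the columns of a row by symbol. The sketch below uses plain `(symbol, columns)` tuples instead of Particle objects, skips gaps and missing cells (an assumption), and lists symbols in order of first appearance, which need not match the order the package returns. The name `swarm_sketch` is hypothetical.

```julia
# Hypothetical sketch: one (symbol, column indexes) entry per unique symbol.
# Assumption: gaps ('-') and `missing` cells do not form particles.
function swarm_sketch(M::AbstractMatrix, rowindex::Int)
    row = M[rowindex, :]
    symbols = unique(x for x in row if !ismissing(x) && x != '-')
    return [(s, findall(isequal(s), row)) for s in symbols]
end

M = ['a' 'a' 'a' 'a'; 'b' 'c' 'b' 'b']
swarm = swarm_sketch(M, 2)
```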
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.createswarm(2, M)
2-element Array{NetMSA.Particle,1}:
NetMSA.Particle('c', 0, NetMSA.Position(2, [2]), NetMSA.Position(2, [2]), 0.0)
NetMSA.Particle('b', 0, NetMSA.Position(2, [1, 3, 4]), NetMSA.Position(2, [1, 3, 4]), 0.0)
NetMSA.flydown — Method
flydown(p, M; stride=1)
Fly down the given particle by stride.
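Judging from the examples for this function, flying down inserts `stride` gap characters above the particle in each of its columns, shifts the rest of those columns down, and pads the untouched columns with missing at the bottom. A minimal sketch under that reading follows; it takes the row and column indexes directly instead of a Particle, and the name `flydown_sketch` is hypothetical.

```julia
# Hypothetical sketch of the fly-down move, inferred from the examples.
function flydown_sketch(M::AbstractMatrix, row::Int, cols::Vector{Int}; stride::Int=1)
    nrows, ncols = size(M)
    out = Matrix{Union{Missing,Char}}(missing, nrows + stride, ncols)
    for c in 1:ncols
        if c in cols
            out[1:row-1, c] = M[1:row-1, c]       # rows above the particle
            out[row:row+stride-1, c] .= '-'       # inserted gaps
            out[row+stride:end, c] = M[row:end, c] # particle and everything below
        else
            out[1:nrows, c] = M[:, c]             # unchanged; bottom stays missing
        end
    end
    return out
end

M = Union{Missing,Char}['a' 'a'; 'b' 'c'; 'c' missing]
R = flydown_sketch(M, 2, [1])
```

The result always has `stride` more rows than the input, which matches the 9×4 and 11×4 outputs in the examples.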
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> p = NetMSA.Particle('b', NetMSA.getposition('b', 2, M));
NetMSA.Particle('b', 0, Main.NetMSA.Position(2, [1, 3, 4]), Main.NetMSA.Position(2, [1, 3, 4]), 0.0)
julia> NetMSA.flydown(p, M)
9×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'-' 'c' '-' '-'
'b' 'b' 'b' 'b'
'c' 'c' 'c' 'c'
'b' 'f' 'h' 'b'
'c' 'g' 'i' 'c'
'd' missing 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.flydown(p, M; stride=3)
11×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'-' 'c' '-' '-'
'-' 'b' '-' '-'
'-' 'c' '-' '-'
'b' 'f' 'b' 'b'
'c' 'g' 'c' 'c'
'b' missing 'h' 'b'
'c' missing 'i' 'c'
'd' missing 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
NetMSA.full — Method
full(row)::Bool
Return whether a row is full or not.
An aligned row r is called full if no gaps ('-') have been added to it. That is, the number of occurrences of the symbol in the row is equal to the number of columns in the matrix.
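Under this definition, a full row is one in which every cell holds the same real symbol. A minimal sketch, with the hypothetical name `full_sketch` and the assumption that a missing cell also prevents a row from being full:

```julia
# Hypothetical sketch: a row is full when all cells hold the same symbol
# and none of them is a gap ('-') or `missing`.
function full_sketch(row)
    isempty(row) && return false
    any(ismissing, row) && return false
    s = first(row)
    return s != '-' && all(==(s), row)
end

full_sketch(['a', 'a', 'a', 'a'])   # same symbol in every column
full_sketch(['b', '-', 'b', 'b'])   # aligned, but one column is a gap
```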
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.full(M[1, :])
true
julia> NetMSA.full(M[2, :])
false
NetMSA.getposition — Method
getposition(value, rowindex, matrix)
Return the Position (rowindex, [colindex1, colindex2, ...]) of the Particle represented by value, at the rowindex in the matrix.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.getposition('b', 2, M)
NetMSA.Position(2, [1, 3, 4])
NetMSA.mostfrequent — Method
mostfrequent(row)
Return a tuple (frequency, element) containing the number of occurrences of the most frequent element in the row, followed by that element.
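A simple way to compute such a tuple is to count each symbol and keep the largest count. The sketch below is illustrative only: it skips gaps and missing cells (an assumption), resolves ties in favour of the symbol seen first, and uses the hypothetical name `mostfrequent_sketch`.

```julia
# Hypothetical sketch: return (count, symbol) for the most frequent symbol.
# Assumption: gaps ('-') and `missing` cells are not counted.
function mostfrequent_sketch(row)
    best, bestcount = nothing, 0
    for x in row
        (ismissing(x) || x == '-') && continue
        n = count(isequal(x), row)
        if n > bestcount            # strict, so the first symbol wins ties
            best, bestcount = x, n
        end
    end
    return (bestcount, best)
end

mostfrequent_sketch(['b', 'c', 'b', 'b'])
```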
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.mostfrequent(M[2, :])
(3, 'b')
NetMSA.objective — Method
objective(M, rowindex; endindex=0)
Return the objective score of the row, calculated as follows:
where $A(r)$ is the number of aligned rows in M from $r$ to the last row, $C(r)$ is the maximum number of matched characters in the current row, $Gaps(r)$ is the number of gaps added to the matrix M from row $r$ to the last row, and $w(r)$ is the weight of the row $r$.
endindex limits the search area for Gaps; if it is not provided, it defaults to size(M)[1].
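One form of the score that uses exactly the four quantities defined above is given here as an assumption, not as the package's verbatim formula. It reproduces the example value for row 2 of the sample matrix (2.625, with $A = 1$, $C = 3$, $Gaps = 0$ and a weight sum of 0.875), but since that example contains no gaps, the placement of the $Gaps(r)$ term in particular is unverified. The symbol $n$ for the index of the last row is also an assumption.

$$
f(r) = \frac{A(r) \times C(r)}{1 + Gaps(r)} \times \sum_{j=r}^{n} w(j)
$$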
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.objective(M, 2)
2.625
NetMSA.rowalignment — Method
rowalignment(rowindex, M)
Return the Particle whose best position aligns the given row in the matrix, that is, the one that maximizes the objective score.
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> p = NetMSA.Particle('b', NetMSA.getposition('b', 2, M));
NetMSA.Particle('b', 0, Main.NetMSA.Position(2, [1, 3, 4]), Main.NetMSA.Position(2, [1, 3, 4]), 0.0)
julia> NetMSA.rowalignment(2, M)
NetMSA.Particle('c', 0, Main.NetMSA.Position(2, [2]), Main.NetMSA.Position(3, [1]), 9.0)
NetMSA.stopcriteria — Method
stopcriteria(p::Particle, newindex, M; threshold::Int=5, debug=false)
Check whether a stopping criterion has been met. Two stopping criteria are checked in this function:
- Criterion 2: The particle has not updated its best score in the last threshold turns.
- Criterion 3: The particle moves to a new row that already contains the same symbol as that of the particle.
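The two checks listed above can be sketched as a small predicate. This is an illustration under assumptions: it takes the particle's symbol and its turn counter directly instead of a Particle, it treats "in the last threshold turns" as `turns >= threshold`, and the name `stopcriteria_sketch` is hypothetical.

```julia
# Hypothetical sketch of the two stopping checks.
# `turns` is the number of turns since the best score last improved.
function stopcriteria_sketch(symbol::Char, turns::Int, newrow; threshold::Int=5)
    turns >= threshold && return true      # criterion 2 (comparison is an assumption)
    return any(isequal(symbol), newrow)    # criterion 3: symbol already in the new row
end

# Row 3 of the sample matrix already contains 'b', so criterion 3 applies.
stopcriteria_sketch('b', 0, ['c', 'b', 'c', 'c'])
```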
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> p = NetMSA.Particle('b', NetMSA.getposition('b', 2, M));
NetMSA.Particle('b', 0, Main.NetMSA.Position(2, [1, 3, 4]), Main.NetMSA.Position(2, [1, 3, 4]), 0.0)
julia> NetMSA.stopcriteria(p, 3, M; debug=true)
"Terminating because of criteria 3"
true
NetMSA.weight — Method
weight(row; w1=0.25, w2=0.5, w3=1.0)
Return the weight of the row, calculated as:
where $n_s$ is the number of occurrences of the symbol $s$ in the aligned row $r$, and $c$ is the total number of columns in the row. The value of $x$ is equal to zero if every symbol in the row $r$ occurred at most once, otherwise $x$ is equal to the max number of occurrences (matches) of some symbol in $r$.
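A piecewise form that matches these definitions and the keyword defaults is sketched here as an assumption, not as the package's verbatim formula. It reproduces both example values: a full row gives $w_3 = 1.0$, and the unaligned row 2 gives $0.25 \times 3/4 = 0.1875$.

$$
w(r) =
\begin{cases}
w_1 \times \dfrac{x}{c} & \text{if } r \text{ is not aligned} \\
w_2 \times \dfrac{n_s}{c} & \text{if } r \text{ is aligned but not full} \\
w_3 & \text{if } r \text{ is full}
\end{cases}
$$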
Examples
julia> M = NetMSA.createPeerMatrix(["abcbcdem", "acbcfg", "abchimn", "abcbcjkm"])
8×4 Array{Union{Missing, Char},2}:
'a' 'a' 'a' 'a'
'b' 'c' 'b' 'b'
'c' 'b' 'c' 'c'
'b' 'c' 'h' 'b'
'c' 'f' 'i' 'c'
'd' 'g' 'm' 'j'
'e' missing 'n' 'k'
'm' missing missing 'm'
julia> NetMSA.weight(M[1, :])
1.0
julia> NetMSA.weight(M[2, :])
0.1875