FASTA algorithm in bioinformatics | Bioinformatics course

 

So what is FASTA? Remember, I told you FASTA is another very popular similarity search tool. Okay, so FASTA, written in all capitals like BLAST, is also a similarity search tool, and in this case also we have a query sequence.

Now the word. You know the word: every single matching word will be known as what? Let me write it down: k-tuples. Okay, that is the matching word, k-tuples, like the neighbor word in BLAST. In the case of BLAST, whenever the word is matching we call them neighbors. In the case of FASTA, if we are choosing a word which is matching, we call it a k-tuple. Okay.
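To make the idea of a k-tuple concrete, here is a minimal sketch in Python. The function name `ktuple_index` and the example sequence are made up for illustration and are not part of any FASTA program. It only shows how every word of length k in a sequence can be listed together with the positions where it starts.

```python
from collections import defaultdict

def ktuple_index(sequence, k):
    """Map every k-tuple (word of length k) to the positions where it starts."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        index[sequence[start:start + k]].append(start)
    return dict(index)

# Hypothetical nucleotide example with k = 4
print(ktuple_index("ACGTACGT", 4))
# {'ACGT': [0, 4], 'CGTA': [1], 'GTAC': [2], 'TACG': [3]}
```

A lookup table like this lets each word of the other sequence be checked in one step, which is why word-based searching is fast.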

Now, the FASTA algorithm was actually developed by Lipman and Pearson. Lipman and Pearson founded and formed this algorithm. And in this case, for a protein sequence, what is the word length that we take from the query? One to two amino acids long. For a nucleotide sequence we choose 5 to 6 nucleotides. So compared to BLAST it's smaller: in the case of BLAST the nucleotide word is longer.

Now in this case the words match with the database sequence and create diagonals. So FASTA uses the dot plot. Remember the dot plot? If you recall the dot plot, you will understand what I mean. If you don't recall the dot plot, and if you haven't seen my video on the dot plot, please watch that video, otherwise you cannot understand this. Okay.

So what happens exactly? If there is a match, then the dot plot is created based on the match. What do we do in the dot plot? We have the x and y axes, and there is a match, there is a match, there is a match, and if you join them you draw a straight line. Then there is no match, and then again a match here, so there is another straight line. So in the dot plot we have sequence A on the x axis and sequence B on the y axis. This is what we'll have in the dot plot here. Okay, this is how it's done.
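The diagonal idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `diagonal_hits` and the two sequences are hypothetical, and a match at query position i and database position j is said to lie on diagonal i - j, so matches that form one straight line in the dot plot share the same diagonal number.

```python
from collections import Counter

def diagonal_hits(query, subject, k):
    """Count k-tuple matches on each dot plot diagonal (diagonal = i - j)."""
    # Positions of every k-tuple in the query
    positions = {}
    for i in range(len(query) - k + 1):
        positions.setdefault(query[i:i + k], []).append(i)
    # Look up every k-tuple of the database sequence and record its diagonal
    hits = Counter()
    for j in range(len(subject) - k + 1):
        for i in positions.get(subject[j:j + k], []):
            hits[i - j] += 1
    return hits

hits = diagonal_hits("GATTACA", "TTGATTACA", 3)
print(hits.most_common(1))
# [(-2, 5)]
```

Here all five matching words fall on the same diagonal, which is the straight line you would see in the dot plot.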

There are four steps in the process of FASTA. Okay, so the first is that we search for identical regions between the query and the database sequence. Identical regions: what we do is a search for identical regions.

The second thing: whenever we find the identical regions, they are scored, and this is scored with the PAM matrix. Now, for the scoring system in bioinformatics there are two systems used, the PAM matrix and the BLOSUM matrix. So again you need to know what the PAM matrix and the BLOSUM matrix are and how the scoring system works. You can watch my video on that, the portion of the lecture on that, and then you can understand. BLAST uses BLOSUM62, but FASTA uses the PAM matrix. So the scoring is done and the best score is kept aside. Okay, the best score is kept aside.
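As a rough sketch of this rescoring step, the Python below walks along one diagonal and keeps the best-scoring ungapped stretch. Everything here is an assumption for illustration: `best_region_score` is a made-up name, and `toy_score` (+5 for a match, -4 for a mismatch) only stands in for a real PAM lookup, whose values are not reproduced here.

```python
def toy_score(a, b):
    """Stand-in for a substitution matrix lookup (NOT real PAM values)."""
    return 5 if a == b else -4

def best_region_score(query, subject, diagonal, score):
    """Best-scoring ungapped stretch on one diagonal (diagonal = i - j)."""
    i = max(diagonal, 0)    # start position in the query
    j = max(-diagonal, 0)   # start position in the database sequence
    best = running = 0
    while i < len(query) and j < len(subject):
        # Restart the running score whenever it drops below zero
        running = max(0, running + score(query[i], subject[j]))
        best = max(best, running)
        i += 1
        j += 1
    return best

print(best_region_score("ACGTT", "ACCTT", 0, toy_score))
# 16
```

The highest value found this way over the candidate diagonals is the one that would be kept aside as the best score.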

Then what else do we have? Segments are joined by gaps, and from that we get a gapped alignment score. Okay, so what we'll do: the segments will be joined by gaps, and we get a gapped alignment score, which is not the maximum score. Because if there is 100 percent similarity we get the maximum score, but we know that if there is no hundred percent similarity the score will be less than the maximum. So as the gaps increase in number, the score value also decreases. The gap score value, yeah: if you consider the number of gaps multiplied by the gap value, then that will increase, but actually, if there is a gap, then the matching score is decreasing.
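The arithmetic behind that can be shown with a small Python sketch. The segment scores and the gap penalty of 8 are hypothetical numbers chosen only for illustration, and `gapped_score` is a made-up helper that assumes a fixed penalty per gap.

```python
def gapped_score(segment_scores, gap_count, gap_penalty):
    """Sum of the joined segment scores minus a fixed penalty per gap."""
    return sum(segment_scores) - gap_count * gap_penalty

segment_scores = [35, 20]  # hypothetical scores of two joined segments
gap_penalty = 8            # hypothetical penalty per gap

for gap_count in range(3):
    total_penalty = gap_count * gap_penalty
    print(gap_count, total_penalty,
          gapped_score(segment_scores, gap_count, gap_penalty))
# 0 0 55
# 1 8 47
# 2 16 39
```

The middle column (number of gaps times the gap value) goes up, while the gapped alignment score in the last column goes down.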

Okay, and then what do we do? We apply an algorithm known as the Smith-Waterman algorithm of local alignment, which is based on dynamic programming and is used to find the optimal alignment. So ultimately FASTA uses a local alignment.
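For reference, a minimal Smith-Waterman sketch in Python follows. It is a simplified illustration, not the FASTA program's own code: it returns only the best local score, and the match, mismatch, and gap values are assumed toy parameters rather than a PAM or BLOSUM matrix.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b by dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap     # gap in b
            left = H[i][j - 1] + gap   # gap in a
            # The 0 lets an alignment restart anywhere, which makes it local
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best

print(smith_waterman("GGACGTCC", "TTACGTAA"))
# 8
```

The shared stretch ACGT gives four matches at 2 each, and the zero floor in the recurrence is what stops the mismatching ends from dragging the score down.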
