Parsing Alignment Object #71

jpearl01 · 2018-10-05T15:27:17Z

First, thanks for implementing this, it has been very handy for me. I was wondering whether there are methods for iterating through an alignment object position by position, specifically to look for differences between the query and target sequences. From the way the alignment object is structured, I can get at the individual query and target sequences, but it looks like the only way to get the actual alignment is to parse the CIGAR string and recreate the alignment from that. Is there an easy way to do that? My Google-fu is failing me here, but maybe you can point me in the right direction?

Thanks in advance!

danmaclean · 2018-10-05T16:27:40Z

Hi @jpearl01

Looks like we never implemented this. It is kind of complicated, but I can see why you'd want to do it.

I found this discussion on how it might be done: https://www.biostars.org/p/112382/

This thread refers to a tool that does it: https://www.biostars.org/p/110498/

And this is the repo for the tool: https://github.com/mlafave/sam2pairwise

Hope this is helpful. I don't think any of us have the time to implement this quickly (not even in the next couple of months), but it seems like something we should think about.

Thoughts @homonecloco ?

homonecloco · 2018-10-08T10:33:32Z

Hi @jpearl01,
As @danmaclean said, we haven't implemented functionality like this, but I've been working a bit with CIGAR strings in other projects, so I may be able to get something into the library, though I can't promise a timeline. What do you think would be more useful? The easiest would be to return an array with two strings; returning a SequenceHash from bioruby is also possible, but that would incur some overhead.
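To make the "array with two strings" idea concrete, here is a minimal sketch of how a CIGAR string could be expanded into a gapped query/reference pair. It is plain Ruby and does not use any bio-samtools API: the method name `cigar_to_pairwise` and its arguments are hypothetical, and it assumes the caller already has the read sequence (as stored in the SAM SEQ field) and the reference bases starting at the alignment's leftmost mapped position.

```ruby
# Hypothetical helper, not part of bio-samtools.
#   cigar: CIGAR string, e.g. "2S3M1I2M2D2M"
#   query: read sequence as stored in SEQ (soft-clipped bases included)
#   ref:   reference bases starting at the alignment's leftmost position
# Returns [aligned_query, aligned_ref], with gaps written as '-'.
def cigar_to_pairwise(cigar, query, ref)
  aligned_query = ''.dup
  aligned_ref   = ''.dup
  q = 0 # offset into the query
  r = 0 # offset into the reference

  cigar.scan(/(\d+)([MIDNSHP=X])/) do |len, op|
    n = len.to_i
    case op
    when 'M', '=', 'X' # match or mismatch: consumes both sequences
      aligned_query << query[q, n]
      aligned_ref   << ref[r, n]
      q += n
      r += n
    when 'I' # insertion in the read: gap in the reference
      aligned_query << query[q, n]
      aligned_ref   << '-' * n
      q += n
    when 'D', 'N' # deletion or skipped region: gap in the read
      aligned_query << '-' * n
      aligned_ref   << ref[r, n]
      r += n
    when 'S' # soft clip: bases are present in SEQ but not aligned
      q += n
    end
    # 'H' (hard clip) and 'P' (padding) consume neither sequence
  end

  [aligned_query, aligned_ref]
end

aligned_query, aligned_ref = cigar_to_pairwise('2S3M1I2M2D2M', 'NNACGTTTGA', 'ACGTTCCGA')
puts aligned_query # ACGTTT--GA
puts aligned_ref   # ACG-TTCCGA
```

Returning two plain strings keeps the overhead low, and wrapping them in a bioruby object afterwards would be up to the caller.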

jpearl01 · 2018-10-18T01:18:09Z

Whoops, sorry for the delay. For our particular project, multiple sequence alignments ended up working fine, so we did not pull the alignments out of the BAM file, but I'm still very interested in having that kind of functionality. Personally, I'd be fine with a function that just returns one or more plain arrays; at that point, pulling the result into a bioruby sequence object would be relatively trivial if we wanted to. I'm not sure whether that fits the philosophy of a bioruby-related package (i.e. would people want to stay within the ecosystem and expect a bioruby object?), but I would be totally fine with plain arrays, and we wouldn't need any further processing for our specific analysis.

sam2pairwise is actually very close to what I was thinking about... Thanks for the links and comments! Will keep an eye on this.
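For the per-position comparison asked about in the original question, a rough sketch could look like the following, assuming the alignment is already available as two gapped strings of equal length (with `-` for gaps). The method name `alignment_differences` and the shape of the returned hashes are hypothetical, not part of any existing library. The reported `column` is a 0-based column in the alignment, not a reference coordinate.

```ruby
# Hypothetical helper: walks two gapped, equal-length strings column by
# column and records every column where query and reference differ.
def alignment_differences(aligned_query, aligned_ref)
  unless aligned_query.length == aligned_ref.length
    raise ArgumentError, 'aligned strings must have equal length'
  end

  diffs = []
  aligned_query.each_char.with_index do |q_base, i|
    r_base = aligned_ref[i]
    next if q_base == r_base

    type = if q_base == '-'
             :deletion  # base present in the reference only
           elsif r_base == '-'
             :insertion # base present in the query only
           else
             :mismatch
           end
    diffs << { column: i, query: q_base, ref: r_base, type: type }
  end
  diffs
end

alignment_differences('ACGT-A', 'AC-TTG').each do |d|
  puts "#{d[:column]}\t#{d[:type]}\t#{d[:query]}\t#{d[:ref]}"
end
```

The comparison is case-sensitive, so soft-masked (lowercase) reference bases would need to be normalised first.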

homonecloco self-assigned this Nov 15, 2018
