Thursday, February 19, 2015

I was getting ready to write something* to replace Biopython's SwissProt module today, as it can't parse UniProt entries yet. Just as I was starting, I found BioServices, which appears to plug some of the holes in Biopython with regard to services like UniProt and KEGG. These services all have REST APIs, so retrieving individual entries from them isn't difficult. Writing a new parser for each service, though, is tedious, and I'm glad there's an off-the-shelf solution.

One caveat: the UniProt search function is billed as "a bit unstable", and I haven't been able to get it to complete a single search successfully. BioServices may be more useful if a list of accession IDs is already available, though needing that list up front defeats the purpose for me.

Additional caveat: the REST API already does most of the work of accessing UniProt, so if you're willing to get your hands dirty, you can pull the database contents easily with BeautifulSoup and the right URLs. A simple proteome query may look like this:
"http://www.uniprot.org/proteomes/?query=" + your_query_here + "&fil=reference%3Ayes&sort=score"
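For what it's worth, here is a minimal Python sketch of building that URL safely, so that spaces and other special characters in the query get encoded. The function names and the timeout value are my own placeholders, not part of any library, and the endpoint is just the one from the URL above, so it may well have changed since this was written. The fetch helper only returns the raw page text; what to do with it (BeautifulSoup or otherwise) depends on what the page actually looks like.

```python
from urllib.parse import quote_plus
from urllib.request import urlopen

# Endpoint and fixed parameters copied from the URL above (assumed still valid).
BASE_URL = "http://www.uniprot.org/proteomes/"
FIXED_PARAMS = "&fil=reference%3Ayes&sort=score"


def build_proteome_query_url(query):
    """Return the proteome search URL for a free-text query (hypothetical helper)."""
    # quote_plus encodes spaces as '+' and escapes other reserved characters.
    return BASE_URL + "?query=" + quote_plus(query) + FIXED_PARAMS


def fetch_proteome_results(query, timeout=30):
    """Download the raw result page as text (hypothetical helper, needs network)."""
    with urlopen(build_proteome_query_url(query), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(build_proteome_query_url("Escherichia coli"))
```

Keeping the URL construction separate from the download makes it easy to check the query string before hitting the server.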

*And in the end, I wrote it myself anyway.
