{"id":484,"date":"2017-03-03T10:32:47","date_gmt":"2017-03-03T18:32:47","guid":{"rendered":"http:\/\/genome.ucsc.edu\/blog\/?p=484"},"modified":"2021-10-30T22:15:30","modified_gmt":"2021-10-30T22:15:30","slug":"the-new-ncbi-refseq-tracks-and-you","status":"publish","type":"post","link":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/2017\/03\/03\/the-new-ncbi-refseq-tracks-and-you\/","title":{"rendered":"The new NCBI RefSeq tracks and You"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">The release of the new NCBI RefSeq track marks a major shift in how we include annotations from NCBI\u2019s Reference Sequence Database (RefSeq) in the UCSC Genome Browser. This new track is a composite track that contains the combined set of curated and predicted annotations from the RefSeq database for hg38\/GRCh38. It also contains tracks that break up the annotation set into a few subsets. These subsets include only the curated transcripts (NM, NR, or YP transcripts), only the predicted transcripts (XM or XR transcripts), all of the other annotations from RefSeq that don\u2019t fit into the curated or predicted subsets, and the alignments of the curated and predicted transcripts to the genome. All of the coordinates and alignments in these tracks are provided by the RefSeq group.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This new NCBI RefSeq composite also includes a \u201cUCSC RefSeq\u201d track that is based on our original method of producing the \u201cRefSeq Genes\u201d track. This \u201cUCSC RefSeq\u201d track is built by aligning RNAs obtained from the RefSeq Database to the genome. In the early days of the UCSC Genome Browser, only RNA sequences were provided by RefSeq, so we used BLAT to align them to the genome. This was a good solution in the past, but over time this method has led to some issues with transcripts matching to multiple places and our alignments of small exons or other regions differing slightly from those found in the RefSeq database. 
This type of minor alignment difference can be seen in the following <\/span><a href=\"http:\/\/genome.ucsc.edu\/cgi-bin\/hgTracks?hgS_doOtherUser=submit&amp;hgS_otherUserName=QAtester3&amp;hgS_otherUserSessionName=hg38.example\"><span style=\"font-weight: 400;\">session<\/span><\/a><span style=\"font-weight: 400;\">, where the RefSeq Curated (top) and UCSC RefSeq (bottom) tracks place the small fifth exon in transcript NM_001130970 at different locations because there are multiple matches to this exon sequence in that region.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The new set of RefSeq tracks differs from the \u201cUCSC RefSeq\u201d track in a few key ways. First, as mentioned previously, the new tracks are based entirely on positions and alignments provided by RefSeq. This means that if you obtain the hg38 coordinates for a RefSeq transcript from the UCSC Genome Browser, these coordinates should be the same as those from the entry found at NCBI\u2019s RefSeq Database. Second, the new tracks are currently available only for the hg38\/GRCh38 assembly. Lastly, these new NCBI RefSeq tracks include predicted transcripts, which were absent from our original RefSeq track.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This has been a long and exciting collaboration between the UCSC Genome Browser staff and NCBI\u2019s RefSeq group. We trust that this full complement of tracks from the Reference Sequence Database will be helpful to you, our Browser users. We hope to bring these tracks to more genome assemblies in the future.<\/span><\/p>\n<hr>\n<p>If you have any public questions after reading this blog post, please email <a href=\"mailto:genome@soe.ucsc.edu\" target=\"_blank\" rel=\"noopener\">genome@soe.ucsc.edu<\/a>. All messages sent to that address are archived on a <a href=\"https:\/\/groups.google.com\/a\/soe.ucsc.edu\/forum\/#!forum\/genome\">publicly accessible forum<\/a>. 
If your question includes sensitive data, you may send it instead to&nbsp;<a href=\"mailto:genome-www@soe.ucsc.edu\" target=\"_blank\" rel=\"noopener\">genome-www@soe.ucsc.edu<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The release of the new NCBI RefSeq track marks a major shift in how we include annotations from NCBI\u2019s Reference Sequence Database (RefSeq) in the UCSC Genome Browser. This new track is a composite track that contains the combined set of curated and predicted annotations from the RefSeq database for hg38\/GRCh38. It also contains tracks [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-484","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/posts\/484","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/comments?post=484"}],"version-history":[{"count":6,"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/posts\/484\/revisions"}],"predecessor-version":[{"id":944,"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/posts\/484\/revisions\/944"}],"wp:attachment":[{"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/media?parent=484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/categories?post=484"},{"taxonomy":"post_tag","embe
ddable":true,"href":"https:\/\/genome-blog.gi.ucsc.edu\/blog\/wp-json\/wp\/v2\/tags?post=484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}