Unable to use urllib.request to download file from website


I am trying to use Python's urllib.request library to download .pdb (Protein Data Bank) files containing the full predicted molecular structure of a given protein from the AlphaFold website. In this example, I am trying to download the protein with UniProt ID Q9BY15. The entry page https://alphafold.ebi.ac.uk/entry/Q9BY15 contains a download link to the PDB file of the protein, as shown below:

[screenshot: download link on the entry page]

The manually downloaded file has the following naming format:

[screenshot: name of the manually downloaded file]

Here is the code I am using, in its simplest form:

import os
import urllib
import urllib.request

url = 'https://alphafold.ebi.ac.uk/entry/'
prot = 'Q9BY15'
alphaname = 'AF-' + prot + '-F1-model_v2.pdb'
urllib.request.urlretrieve(url + prot, alphaname)

And here is the file I get when I run the code:

[screenshot: the file produced by the script]

As you can see, the file is far smaller than the real file (despite having exactly the same name), and it appears effectively empty when opened in protein identification programs. How should I rewrite this code to download the actual file?


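I'm not sure if this will solve your problem, but the /entry/ URL points to the protein's web page, not to the PDB file itself, so urlretrieve saves that page under the .pdb name.

Try replacing /entry/ in the link with /files/, and append the file name rather than the bare UniProt ID.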
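
As a rough sketch of that suggestion: the code below builds the /files/ URL from the UniProt ID and the file name pattern shown in the question. The helper names (alphafold_filename, alphafold_url, download_pdb) are made up for illustration. The "v2" suffix is taken from the question; it is an assumption that the site still serves that version, so it may need changing.

```python
import urllib.request

# Assumption: files are served under /files/ followed by the file name,
# i.e. the /entry/ -> /files/ replacement suggested in this answer.
BASE_URL = "https://alphafold.ebi.ac.uk/files/"


def alphafold_filename(uniprot_id, version="v2"):
    """Build the file name in the format shown in the question."""
    # "v2" comes from the question; the current database version may differ.
    return "AF-" + uniprot_id + "-F1-model_" + version + ".pdb"


def alphafold_url(uniprot_id, version="v2"):
    """Build the direct download URL for the PDB file (hypothetical helper)."""
    return BASE_URL + alphafold_filename(uniprot_id, version)


def download_pdb(uniprot_id, version="v2"):
    """Download the PDB file into the current directory and return its name."""
    filename = alphafold_filename(uniprot_id, version)
    # urlretrieve raises urllib.error.HTTPError if the URL does not exist.
    urllib.request.urlretrieve(alphafold_url(uniprot_id, version), filename)
    return filename
```

Calling download_pdb('Q9BY15') should then save AF-Q9BY15-F1-model_v2.pdb, provided the version suffix matches what the site currently hosts.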

Answered By – NerdyGamer

Answer Checked By – David Goodson (AngularFixing Volunteer)
