Help:Uploading molecules

From Proteopedia

This article gives step-by-step instructions for uploading an atomic coordinate file (a "molecule" or PDB file) for use in molecular scenes in Proteopedia.

This article assumes you are familiar with Proteopedia's Molecular Scene Authoring Tools (SAT). If you want to improve your familiarity with the SAT, the best place to start is with the Video Guides.

Don't upload files available from www.pdb.org

If the molecule you want to display is available from the Protein Data Bank, you don't need to upload it. In the load molecule dialog of the Molecular Scene Authoring Tools (SAT), simply give the PDB code, and Proteopedia will obtain the PDB file for you.

In fact, Proteopedia will automatically make a frozen copy of the current PDB entry for you. This will prevent your green links from being broken by future remediations of data files in the Protein Data Bank.

Don't upload a file unless you have permission!

Once a file is uploaded to Proteopedia, it becomes publicly available and anyone can view it and download it.

  • If the file is one you created entirely from your own work, it is OK to upload it.
  • If the file is the work of others, or derived from the work of others, be sure that you have permission to share it publicly before uploading it to Proteopedia. And of course you must properly cite the original work.
  • If the file is based upon, or derived from, files published in the Protein Data Bank (www.pdb.org) you do not need explicit permission from the authors, but of course you must properly cite the original work. This can be done either by providing links to the original PDB codes in your Proteopedia article, or by directly citing the relevant publications.

Please see also Proteopedia:Guidelines for Ethical Writing.

Uploading a file from www.pdb.org after modifying it

If you want to upload a modified version of a file available from the Protein Data Bank, please indicate the modification in the filename. For example, there are eight chains in the asymmetric unit of 3zsf, but it is believed that the functional form, the biological assembly, is a single chain. So you might want to delete chains B, C, D, E, F, G, and H from 3zsf.pdb (downloaded from the PDB), and upload a file containing only chain A. Please do not name your modified version 3zsf.pdb because that name conflicts with the published, eight-chain version of the file. Give your uploaded file a name that indicates the modification, such as 3zsf_chain_a.pdb.
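As a rough illustration of the chain A example above, the following Python sketch filters PDB-format text by chain identifier. It assumes the fixed-column PDB format, in which the chain identifier sits in column 22. The function name keep_chains and the file names in the usage comment are illustrative only and are not part of Proteopedia or Jmol. Non-coordinate records are passed through unchanged, so header records such as SEQRES may still describe the deleted chains.

```python
# Record types whose chain identifier is in column 22 (string index 21).
COORD_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def keep_chains(pdb_text, chains):
    """Return PDB text keeping coordinate records only for the given chains.

    Records that are not coordinate records (HEADER, REMARK, END, ...) are
    copied through unchanged.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in COORD_RECORDS:
            # Keep the record only if it belongs to a wanted chain.
            if len(line) > 21 and line[21] in chains:
                kept.append(line)
        else:
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical usage, following the 3zsf example in the text:
# text = open("3zsf.pdb").read()
# open("3zsf_chain_a.pdb", "w").write(keep_chains(text, {"A"}))
```

It is worth opening the result in the Jmol application before uploading, to confirm that only the intended chain remains.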

How to upload a PDB file

These instructions refer to PDB files, but the format of the atomic coordinate file does not matter. It can be cif, xyz, or any of the roughly 40 atomic coordinate file formats readable by Jmol.

  1. Watch the short video that demonstrates uploading a file. The video concerns an image file, but will still be quite helpful. The steps below concern a PDB file or "molecule".
  2. Compress the PDB file, especially if it is more than ~200 kilobytes in size. Use WinZip in Windows, Finder's "compress" in Mac OS X, or gzip. Files compressed by these methods are recognized and decompressed automatically by Jmol. If you're not sure about your compression method, simply try dragging the compressed PDB file and dropping it into the black window of the Jmol application. If the molecule appears, Jmol decompressed it automatically.

  3. Click Upload file in the toolbox at the bottom of the left side of every page in Proteopedia. Choose the file to be uploaded from your disk. Do NOT click the upload button yet.

  4. Choose a file name carefully and put it in the Destination filename slot. The filename should be unique enough to distinguish it from all the other hundreds of files uploaded into Proteopedia, and descriptive enough to identify it. Do not include spaces in the filename -- use underscores or dashes to separate words. The filename should end in .pdb (or another file type that identifies the data format), or if compressed, .pdb.zip or the filetype that identifies the compression method.

    If (and only if!) you are uploading an unpublished PDB file that is to be hidden until the date of publication, the filename should begin with "Workbench_" (see Proteopedia:Workbench).

    Good filename examples (unique, descriptive):

    Poor filename examples (too general, not descriptive):

  5. Describe the contents of the file in the Summary box. You could tell how it was obtained or generated, cite the source, and specify any restrictions on the use of the file imposed by the copyright holder. If the filename includes a PDB code, explain how the contents differ from the published PDB entry.

  6. Click the Upload file button. You will next see a page about the uploaded file. In Proteopedia, every uploaded file (regardless of the type of content) is stored in the Image: namespace.
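Steps 2 and 4 above can be prepared before opening the upload page. The following is a minimal Python sketch, not a Proteopedia tool: it compresses a file with gzip (one of the methods named in step 2) and checks a proposed destination filename against two of the rules in step 4, namely no spaces and a file-type extension. The function names gzip_file and filename_problems are hypothetical, and the checks do not test whether the name is unique or descriptive.

```python
import gzip
import shutil
from pathlib import Path


def filename_problems(name):
    """Return a list of problems with a proposed destination filename."""
    problems = []
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace; use underscores or dashes")
    if "." not in name.strip("."):
        problems.append("has no extension such as .pdb or .pdb.gz")
    return problems


def gzip_file(source):
    """Write a gzip-compressed copy next to the source and return its path."""
    source = Path(source)
    target = source.with_name(source.name + ".gz")
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

For example, filename_problems("3zsf_chain_a.pdb") returns an empty list, while a name containing a space returns one problem. As step 2 suggests, dragging the compressed file into the Jmol application remains the simplest check that Jmol can decompress it.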

How to create a molecular scene showing the uploaded PDB file

  1. Copy the full filename as shown in the title of the uploaded file page.
    If you don't have the full uploaded filename in front of you, here is how you can find it. Be sure to sign in to Proteopedia and then click on your name at the top of the window. This takes you to your User page. In the toolbox (lower left on every page in Proteopedia) click on User contributions. On that page, change namespace from All to Image. Click Go.
  2. Edit the page where you want to show the molecule.

    If you are unclear about editing pages, inserting Jmol with the 3D button, and using the SAT, please review the relevant video guides.
  3. Show the Scene Authoring Tools.

  4. In the load molecule dialog, paste the full filename of your uploaded molecule into the slot labeled From Proteopedia uploaded file. Do not include a final '.gz' extension in the name. Also check the underscores in the filename: they may have been converted to spaces, and the name you paste must use underscores. Click the load button. After a moment, you should see your uploaded molecule in the SAT's Jmol. Color and render as desired, and save the scene to a green link as shown in the Video Guides.

  5. Theoretical models: If your model is a macromolecule generated from theory (ab initio, homology modeling), copy the template below and paste it into the very top of your page.

    {{Theoretical_model}}

    This will display a cautionary banner at the top of your page, like the one at the top of Structure of E. coli DnaC helicase loader. If your model was determined by empirical experiment, such as X-ray crystallography or NMR, or if it is a small molecule (fewer than 100 atoms), it is not necessary to include this caution.
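The two filename adjustments described in step 4 (restore underscores, drop a final '.gz') can be expressed as a small Python helper. This is only a sketch of those two rules; the function name sat_filename is hypothetical, and the helper does not contact Proteopedia or verify that the file exists.

```python
def sat_filename(displayed_name):
    """Normalize a copied filename for the 'From Proteopedia uploaded file' slot."""
    # Spaces in the displayed name stand in for underscores in the real name.
    name = displayed_name.strip().replace(" ", "_")
    # The name given to the SAT should not include a final '.gz' extension.
    if name.lower().endswith(".gz"):
        name = name[:-3]
    return name
```

For instance, a copied name of "3zsf chain a.pdb.gz" becomes "3zsf_chain_a.pdb".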

Additional considerations for large files

For very large files, such as large complexes or morphs, you may wish to reduce the file size further before applying file compression. The goal is that a scene using your uploaded PDB file loads for a user in a reasonable amount of time. The best way to achieve this is to exclude portions of the PDB file that are not critical to the scenes you will generate. Generally, one can save just the alpha carbons of the protein residues, because Jmol can still represent the proteins in many forms, including cartoon, backbone, trace, and ribbons.

  • Ways to extract only the alpha carbons include:
    • Use the 'Pick Cα, backbone and side chain atoms' option at PDB Goodies.
    • Use a filter in the load command in the Jmol application to load only the alpha carbons, and then write a new file with the coordinates. Here are instructions.
    • An option for Windows users is to download Eric Martz's PDB Tools and use the included MS-DOS program, alphac.exe. This program keeps alpha carbons and phosphorus atoms from nucleic acids.
    • Michael Palmer's MakeMultimer has an option to return only backbone atoms (4 atoms per amino acid instead of a single alpha carbon atom). It also returns nucleic acid backbones (6 atoms per nucleotide).
  • For important residues, the full information can be added back to the reduced PDB file, by re-inserting portions of the original source PDB file using a text editor.
  • Appropriate names for such a file may include the letters ca and the alterations should be clearly noted in the description. You should also note the modifications on the actual Proteopedia page where the file is displayed, as the modifications may substantially affect possible views that other users may generate using your file.

See 'Visualizing large molecules' for additional tips and solutions for dealing with issues related to PDB file size.
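If none of the tools listed above is convenient, the alpha-carbon reduction can also be sketched in a few lines of Python. The sketch below assumes fixed-column PDB format and keeps only ATOM records whose atom name field (columns 13 to 16) is exactly " CA " or " P  ", which mirrors the alpha carbon and phosphorus selection described for alphac.exe but is not that program. The function name alpha_carbons_only is hypothetical. HETATM, ANISOU, and CONECT records are dropped, which is an assumption of this sketch, so any ligands you need must be added back by hand.

```python
# Atom name fields (columns 13-16) to keep. Matching the whole field
# distinguishes an alpha carbon (" CA ") from a calcium ion ("CA  ").
KEEP_NAMES = (" CA ", " P  ")

# Records removed entirely in this sketch.
DROP_RECORDS = ("HETATM", "ANISOU", "CONECT")


def alpha_carbons_only(pdb_text):
    """Return PDB text reduced to alpha carbons and nucleic acid phosphorus."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6]
        if record == "ATOM  ":
            if line[12:16] in KEEP_NAMES:
                kept.append(line)
        elif record not in DROP_RECORDS:
            kept.append(line)  # header, CRYST1, TER, END and similar
    return "\n".join(kept) + "\n"
```

As recommended above, a file produced this way should carry a name that includes ca, and the reduction should be noted in the upload summary and on the page that displays it.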

Proteopedia Page Contributors and Editors

Eric Martz, Wayne Decatur, Angel Herraez
