$\begingroup$

https://alphafold.com/entry/Q9Y499

Starting at line 918 of the downloadable mmCIF file, it looks like there might be coordinates in x, y, z:

ATOM 1 N N . TRP A 1 1 ? 12.549 -2.565 -21.811 1.0 96.06 ? 1 TRP A N 1 Q9Y499 UNP 1 W
ATOM 2 C CA . TRP A 1 1 ? 12.477 -2.290 -20.365 1.0 96.06 ? 1 TRP A CA 1 Q9Y499 UNP 1 W
ATOM 3 C C . TRP A 1 1 ? 11.221 -2.928 -19.812 1.0 96.06 ? 1 TRP A C 1 Q9Y499 UNP 1 W
ATOM 4 C CB . TRP A 1 1 ? 12.469 -0.785 -20.103 1.0 96.06 ? 1 TRP A CB 1 Q9Y499 UNP 1 W
ATOM 5 O O . TRP A 1 1 ? 10.206 -2.922 -20.501 1.0 96.06 ? 1 TRP A O 1 Q9Y499 UNP 1 W
ATOM 6 C CG . TRP A 1 1 ? 13.795 -0.114 -20.270 1.0 96.06 ? 1 TRP A CG 1 Q9Y499 UNP 1 W
ATOM 7 C CD1 . TRP A 1 1 ? 14.286 0.508 -21.369 1.0 96.06 ? 1 TRP A CD1 1 Q9Y499 UNP 1 W
ATOM 8 C CD2 . TRP A 1 1 ? 14.791 0.098 -19.228 1.0 96.06 ? 1 TRP A CD2 1 Q9Y499 UNP 1 W

I am a bioinformatics noob (took an intro class in my undergrad CS degree at a Canadian university) but know some Unity game engine with which I want to build a standalone visualization tool for users with VR headsets.

$\endgroup$
  • $\begingroup$ "I want to build a standalone visualization tool" - Why, exactly? There are protein viewers in many languages (C - RasMol, Java - Jmol, C/Python - PyMOL, JavaScript - NGLViewer (?), etc.) that can show the structure in all sorts of detail. $\endgroup$ Commented Aug 5, 2022 at 22:39
  • $\begingroup$ More helpfully, the mmCIF format is documented here: ww1.iucr.org/iucr-top/cif/mm/index.html. Also, yes, those look like the x/y/z coordinates (Cartn_x, Cartn_y, Cartn_z). Again, there are existing mmCIF readers: mmcif.wwpdb.org/docs/software-resources.html $\endgroup$ Commented Aug 6, 2022 at 14:43
  • $\begingroup$ @gilleain how about in VR? $\endgroup$ Commented Aug 6, 2022 at 14:48
  • $\begingroup$ Ok, that is slightly different, I see. For Jmol, at least, this was done a while ago - github.com/Jmol-OVR - in general, I would hope that some VR software could integrate with an external library. If you like, you could edit your question to focus on VR and I could make this comment into a proper answer $\endgroup$ Commented Aug 7, 2022 at 7:54
  • $\begingroup$ There are a few VR solutions on the market, e.g. Nanome. The rate-limiting step is the cost of the VR headset and user training, so only biotech startups have them! $\endgroup$ Commented Aug 8, 2022 at 9:22

1 Answer

$\begingroup$

Don't

A protein is not only a list of atomic coordinates. Except as a proof of concept (i.e. to learn to code in Unity VR), writing a viewer is an endless spiral of hurt in a field with a lot of competition, and were you to push a product you'd need to invest more man-hours in publicity than in coding.

Connectivity

Here is the short of it. Chemistry is classically a graph network: nodes=atoms, edges=bonds. The connectivity of the atoms is dictated by their residue's 3-letter name, which is either

  • assumed: software knows the 20 amino acids (and several modifications, e.g. SEP), the 4x2 bases and some ligands, e.g. HOH; or
  • given as a special entry in the PDB or mmCIF file (in PDB it's a CONECT entry), e.g. https://www.rcsb.org/ligand/CFF
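
As a rough illustration of the "assumed" case, here is a minimal Python sketch that turns residue names into graph edges. The `TEMPLATES` table and `build_bonds` function are hypothetical names made up for this example, the table covers only three residues (heavy atoms only), and the peptide bond is simply assumed between consecutively numbered residues in a chain, which real files will violate.

```python
# Hypothetical, deliberately tiny template table: heavy-atom bonds per residue name.
# A real viewer needs templates for every standard residue plus ligand definitions.
TEMPLATES = {
    "GLY": [("N", "CA"), ("CA", "C"), ("C", "O")],
    "ALA": [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB")],
    "HOH": [],  # water: a lone oxygen, no heavy-atom bonds
}


def build_bonds(atoms):
    """Return bonds as a set of (i, j) atom-index pairs with i < j.

    atoms: list of (chain, res_seq, res_name, atom_name) tuples.
    """
    # Look up an atom's index by where it sits in the structure.
    index = {
        (chain, res_seq, atom_name): i
        for i, (chain, res_seq, _res_name, atom_name) in enumerate(atoms)
    }
    # Collect each residue once, keeping file order.
    residues = {}
    for chain, res_seq, res_name, _atom_name in atoms:
        residues[(chain, res_seq)] = res_name

    bonds = set()

    def add(i, j):
        # Skip silently when an atom is missing from the file.
        if i is not None and j is not None:
            bonds.add((min(i, j), max(i, j)))

    for (chain, res_seq), res_name in residues.items():
        # Intra-residue bonds come from the template for that 3-letter name.
        for a, b in TEMPLATES.get(res_name, []):
            add(index.get((chain, res_seq, a)), index.get((chain, res_seq, b)))
        # Assumption: C of residue n bonds to N of residue n + 1 (peptide bond).
        add(index.get((chain, res_seq, "C")), index.get((chain, res_seq + 1, "N")))
    return bonds


atoms = [
    ("A", 1, "GLY", "N"), ("A", 1, "GLY", "CA"), ("A", 1, "GLY", "C"), ("A", 1, "GLY", "O"),
    ("A", 2, "ALA", "N"), ("A", 2, "ALA", "CA"), ("A", 2, "ALA", "C"), ("A", 2, "ALA", "O"),
    ("A", 2, "ALA", "CB"),
    ("A", 3, "HOH", "O"),
]
bonds = build_bonds(atoms)
```

Anything whose residue name is not in the table ends up with no bonds at all, which is exactly where the explicit CONECT-style entries come in.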

AlphaFold2 models do not have ligands; AlphaFill models do. But scientists (not PR specialists) use PDB entries or run their own modelling. The off-the-shelf AlphaFold2 models from EBI are rarely used: oligomers with ligands are where it's at, especially in a drug discovery campaign (and red biotech can and does afford VR headsets).

Format

mmCIF is a deposition standard, PDB is the workhorse standard. MMTF is a cool idea that is not catching on (cf. the XKCD comic strip about standard proliferation). Using PDB is easier: https://www.wwpdb.org/documentation/file-format-content/format33/v3.3.html
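
To show why PDB is the easier starting point, here is a minimal Python sketch that pulls coordinates out of ATOM/HETATM records using the fixed column positions given in the format document linked above. The names `Atom`, `parse_pdb_atoms` and `SAMPLE` are made up for this example, and the two sample lines are the first two atoms from the question retyped by hand into PDB layout; alternate locations, insertion codes and multiple models are ignored.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_pdb_atoms(text: str) -> list:
    """Read ATOM/HETATM records by fixed column position (PDB format v3.3)."""
    atoms = []
    for line in text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue  # skip headers, CONECT, TER, etc.
        atoms.append(Atom(
            serial=int(line[6:11]),       # columns 7-11
            name=line[12:16].strip(),     # columns 13-16
            res_name=line[17:20].strip(), # columns 18-20
            chain=line[21],               # column 22
            res_seq=int(line[22:26]),     # columns 23-26
            x=float(line[30:38]),         # columns 31-38
            y=float(line[38:46]),         # columns 39-46
            z=float(line[46:54]),         # columns 47-54
        ))
    return atoms


# Hand-written sample in PDB layout (values taken from the question's first two atoms).
SAMPLE = (
    "ATOM      1  N   TRP A   1      12.549  -2.565 -21.811  1.00 96.06           N\n"
    "ATOM      2  CA  TRP A   1      12.477  -2.290 -20.365  1.00 96.06           C\n"
)

for atom in parse_pdb_atoms(SAMPLE):
    print(atom.name, atom.x, atom.y, atom.z)
```

The records must be sliced by column rather than split on whitespace, because adjacent fields in a PDB line can run together without a space.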

Representation

Problems arise when offering representations: everyone has their favourite. See the NGL gallery for an array: https://nglviewer.org/ngl/gallery/index.html Surface display has additional issues: PyMOL simply and openly reuses the Adaptive Poisson-Boltzmann Solver (APBS) as it's tricky.

Dragons in the minefield

In a PDB entry not from AF2, you can have disulfides (SSBOND and/or CIZ entry), isopeptide bonds (LINK), missing atoms, the UNK, UNX and UNL residues, C-alpha traces, insertion sequences, alternate occupancy, gaps, non-standard usages of all of these, etc., and my favourite: implied proximity bonding.
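
To give a flavour of what implied proximity bonding means for a viewer, here is a minimal Python sketch that guesses disulfides purely from distance when no SSBOND record is present. The function name `infer_disulfides`, the labels and the 2.5 Å cutoff are assumptions for this example (a real S-S bond is roughly 2.05 Å long); established viewers apply their own, more careful rules.

```python
import math


def infer_disulfides(sg_atoms, cutoff=2.5):
    """Guess disulfide bonds from cysteine SG positions alone.

    sg_atoms: list of (label, (x, y, z)) tuples, coordinates in angstroms.
    cutoff: assumed maximum S-S distance to call a bond (illustrative value).
    """
    pairs = []
    for i in range(len(sg_atoms)):
        for j in range(i + 1, len(sg_atoms)):
            label_a, pos_a = sg_atoms[i]
            label_b, pos_b = sg_atoms[j]
            # math.dist (Python 3.8+) is the Euclidean distance between two points.
            if math.dist(pos_a, pos_b) <= cutoff:
                pairs.append((label_a, label_b))
    return pairs


# Made-up coordinates: the first two sulfurs are 2.04 angstroms apart.
sg = [
    ("A:CYS10", (0.0, 0.0, 0.0)),
    ("A:CYS50", (2.04, 0.0, 0.0)),
    ("A:CYS80", (10.0, 0.0, 0.0)),
]
found = infer_disulfides(sg)
```

The all-pairs loop is fine for a handful of cysteines, but the same idea applied to every atom in a structure would need a spatial grid or tree to stay fast.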

$\endgroup$
  • $\begingroup$ Okay, I get it. AlphaFold is not at all popular, heavily overhyped and has severe technical shortfalls. Good to know, thank you. $\endgroup$ Commented Aug 8, 2022 at 13:49
  • $\begingroup$ @M__ The algorithm (and its variants) is utterly-unutterably-undeniably phenomenal. It makes great complexes, thus making CoIP Westerns a thing of the stone age, does away with C-alpha traces in cryoEM and much more. It's the off-the-shelf EBI-AlphaFold2 models that are not as revolutionary as the publicity. The latest TrEMBL release is rather pointless in fact. Say one wants to see how a given pyrethrin-analogue drug binds in different arthropods? One would need to model the GABA channel as an oligomer. Norleucine-insensitive Eco MetK? It's a missense in a patent, not in TrEMBL. Etc. $\endgroup$ Commented Aug 8, 2022 at 14:28
  • $\begingroup$ Thanks @MatteoFerla, that's insightful. Pyrethrin analogues in insects I do understand; E. coli metK, dunno. CoIP Westerns being obsolete is a big statement. I did hear about the antibody binding prediction: I need to look into it in more detail and might forward this as a question in a month or so. Thanks again. $\endgroup$ Commented Aug 8, 2022 at 19:28
  • $\begingroup$ So basically "Don't bother for the above reasons" is what I'm hearing, is that correct? $\endgroup$ Commented Aug 8, 2022 at 20:57
  • $\begingroup$ Conditionally: if it's for learning, do go ahead but don't let feature creep get you! If it's to make a business that will compete against Nanome, yes, don't bother. $\endgroup$ Commented Aug 8, 2022 at 21:01
