Biocurious is a weblog about biology, quantified.

Protein unfolding as a wormlike chain?

by PhilipJ on 22 August 2006

Wormlike chains do a reasonably good job of describing the force-extension relation for dsDNA, where at low forces the model reduces to a Hooke’s law form and at high forces a non-linear inverse square term describes the rapid increase in force as you start reaching the molecule’s contour length. At low forces you’re basically just removing the configurational entropy that a floppy polymer can have, and at high forces you’re starting to do work on the bonds holding the molecule together, but in all cases the form of the molecule stays the same, that of an isotropic rod.
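For reference, the two regimes described above are usually captured by the Marko-Siggia interpolation formula for the wormlike chain. The following is a minimal Python sketch of that formula and its low-force limit; the function names and the value of kT (about 4.11 pN nm near room temperature) are assumptions made for illustration.

```python
KT_PN_NM = 4.11  # assumed thermal energy k_B*T near 298 K, in pN*nm


def wlc_force(extension_nm, contour_nm, persistence_nm, kT=KT_PN_NM):
    """Marko-Siggia interpolation: WLC force (pN) at a given extension."""
    x = extension_nm / contour_nm  # relative extension, must stay below 1
    if not 0.0 <= x < 1.0:
        raise ValueError("extension must be in [0, contour length)")
    # The inverse-square term dominates as x approaches 1.
    return (kT / persistence_nm) * (0.25 / (1.0 - x) ** 2 - 0.25 + x)


def hooke_limit(extension_nm, contour_nm, persistence_nm, kT=KT_PN_NM):
    """Low-force (entropic spring) limit: F = 3 kT x / (2 P L)."""
    return 1.5 * kT * extension_nm / (persistence_nm * contour_nm)
```

With an illustrative persistence length of 50 nm (a commonly quoted figure for dsDNA) and a 1000 nm contour length, `wlc_force(1, 1000, 50)` and `hooke_limit(1, 1000, 50)` agree closely, while `wlc_force(990, 1000, 50)` is larger by orders of magnitude because of the inverse square term.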

Given that this is the model used to derive the wormlike chain, I have never been able to understand, then, why the people doing protein unfolding use wormlike chains to fit their data as well. A folded protein is nothing like an isotropic curled up rod, and configurational entropy has basically nothing to do with the folded structure.

The reason I’m thinking about this now is because there is an interesting paper in this week’s PNAS from the Rief lab building on the work André described from this year’s Biophysical Society meeting on the deformation response of single GFP molecules unfolded from different pairs of amino acids by atomic force microscopy (subscription required). The directions accessible to them are shown here:

As stated in their conclusions, “[a] richness of behaviors with widely varying unfolding forces and various directional spring constants could be observed for the case of GFP”, and carrying out similar studies on the myriad of protein types will surely lead to further exciting results in the future. My only complaint is that the WLC fitting of the data is sometimes absolutely horrid,

(where the y axis is force in pN, and the x axis is extension in nm) and I’m wondering if this is just a case of good data being hard to come by, or the fact that the WLC model really isn’t applicable to protein unfolding. In the case of DNA, the model miserably fails at around 20-30 pN, while here the model is applied all the way up to (in some cases) 600 pN. Perhaps our local AFM expert can comment!

  1. hendrik    4349 days ago    #

    Many people have difficulty recognizing the regularity of force curves since unfolding forces fluctuate. That is why we overlaid the force-extension trace with an equally spaced lattice of WLC traces (the grey lines are not fits!) as a guide to the eye. The individual peaks in the experimental trace on a GFP polyprotein sometimes fall below or above the elasticity behaviour predicted by the WLC. This has several reasons: experimental fluctuations like tiny cantilever drifts, striae floating through the optical pathway, and so on. Additionally, the WLC does not describe polypeptide elasticity correctly. For instance, as you commented already, in order to reproduce the elasticity of unfolded protein chains in different force regimes one needs to use different values for the persistence length. The “persistence length” (if such a parameter exists) of a given real polypeptide of course does not change with force; it’s the model which is too simple and excludes, for instance, enthalpic contributions. In this case, we chose a persistence length of 0.5 nm, which is okay for the force regime 50-150 pN; for forces of 150-300 pN, 0.35 nm is fine, and above that even less.

    However, on average, these fluctuations cancel out and the regularity in the traces becomes obvious. Example: take a look at the fourth and the last peak: there the WLC lattice lies fairly well on top, while for the peaks in between the matching is worse. Nevertheless, we recognize that there are nine peaks in between and also nine closely matching WLC lines in between. The average contour length increase associated with an individual unfolding event is thus simply given by the spacing of the WLC lattice. It remains, of course, to interpret this average contour length increase. In the supporting information of the paper we show a figure of average contour length increases obtained by actually fitting the WLC model to individual peaks in the traces: one gets narrow and normally distributed histograms.
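To make the "WLC as scale bar" idea concrete, here is a minimal Python sketch of one way to turn individual peaks into contour lengths and then into contour length gains. It is not the authors' fitting procedure: it assumes the Marko-Siggia interpolation form of the WLC, a fixed persistence length of 0.5 nm (the value quoted above for 50-150 pN), kT of about 4.11 pN nm, and hypothetical function names. Each peak is a single (extension, force) pair.

```python
KT_PN_NM = 4.11  # assumed k_B*T near room temperature, in pN*nm


def wlc_reduced_force(z):
    """Dimensionless WLC force F*P/kT at relative extension z = x/L."""
    return 0.25 / (1.0 - z) ** 2 - 0.25 + z


def relative_extension(force_pN, persistence_nm, kT=KT_PN_NM):
    """Invert the WLC by bisection: find z = x/L in (0, 1) for a given force."""
    if force_pN <= 0.0:
        raise ValueError("force must be positive")
    target = force_pN * persistence_nm / kT
    lo, hi = 0.0, 1.0
    for _ in range(60):  # reduced force increases monotonically with z
        mid = 0.5 * (lo + hi)
        if mid < 1.0 and wlc_reduced_force(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def contour_length(extension_nm, force_pN, persistence_nm=0.5, kT=KT_PN_NM):
    """Contour length (nm) of the WLC curve passing through one data point."""
    return extension_nm / relative_extension(force_pN, persistence_nm, kT)


def contour_gains(peaks, persistence_nm=0.5):
    """Differences in contour length between successive (x, F) peaks."""
    lengths = [contour_length(x, f, persistence_nm) for x, f in peaks]
    return [b - a for a, b in zip(lengths, lengths[1:])]
```

Collecting the output of `contour_gains` over many traces would give the kind of histogram mentioned above. A real analysis would more likely fit the whole rising flank of each peak rather than a single point.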

    We thought (maybe erroneously) this would be an intuitive way to present the data. It should show us that the WLC model is nothing more and nothing less than a useful scale bar.
    Hendrik (one of the authors)

  2. PhilipJ    4348 days ago    #

    Hi Hendrik, thanks for the reply!

    That clears things up quite a bit, and the use of the WLC curves as mostly just a scale bar makes more sense to me now. Perhaps I should look at the supplemental materials next time!

    Getting back to the applicability of the WLC model to proteins, whenever there is some self-interaction (with, say, ssDNA or ssRNA that is able to form secondary structure with itself), the f-z curves display all kinds of interesting substructure and the entire data set can’t be fit to a single curve as the contour length has effectively increased or decreased if secondary structure is formed or pulled apart. I understand that this is the analysis you’re using for the entire data set as well, but even inside individual curves this should be happening as domains are coming apart. Are the changes in the contour length of the entire molecule short enough that you need not worry about this in the analysis?

    I guess another difficulty with proteins is that they’re all so different, and that a generalised model probably won’t exist that will be applicable to them all. Are there any alternatives to the WLC for protein data?

  3. Uncle Al    4348 days ago    #

    Protein is not dumped into vacuum and then allowed to find its minima in water. Protein is linearly extruded from its ribosome into water. There may be post-transcription editing.

    Uncle Al would fold proteins the way Mother Nature does it, starting from one hydrated end. The engineering approach, a holistic eigenvector diddle, is academically elegant but mostly irrelevant until after the fact. Biochemistry is not merely physical organic chemistry writ large. Biochemistry pulls 100 pounds of results from a 10 pound bag by being locally mechanistic with feedback.

  4. hendrik    4348 days ago    #

    Hi Philip,
    given that the single molecule we pull on has a modular structure, that is, a linear chain of covalently linked, folded protein domains (in this case just a chain of several folded GFP domains), we interpret the regular sawtooth data as follows: the data point at which the force peaks reflects the at-once loss of tertiary structure of a single folded protein domain. Such unfolding replaces the stiff folded domain with a much more compliant polypeptide spacer consisting of the number of amino acids involved in the unfolding event. These spacers account for the WLC-like elastic behaviour of the stretched polyprotein. In fact, there is always some (unfolded) polypeptide spacer in the molecules right after pick-up. Since the stiffness of the folded domains is approximately a factor of 1000 higher than the stiffness of the unfolded polypeptide spacer, all the stretching and deformation goes into the unfolded spacers (springs in series). That’s why we don’t worry about the actual composition of the molecule (that is, how many folded domains, how long the spacer is), since the elastic behaviour is entirely spacer-dominated. The presence of folded domains in the stretched molecule only becomes clear when we observe those sudden lengthening events at certain stretching forces. I’m not sure if this is what you meant in your first question.
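The springs-in-series argument can be checked with a few lines of Python. This is only a sketch with a hypothetical function name; the stiffness values are arbitrary apart from the factor of 1000 quoted above.

```python
def series_extension_shares(stiffnesses):
    """Fraction of the total extension taken up by each spring in series.

    Springs in series all carry the same force, so each one extends in
    proportion to its compliance 1/k.
    """
    compliances = [1.0 / k for k in stiffnesses]
    total = sum(compliances)
    return [c / total for c in compliances]


def series_stiffness(stiffnesses):
    """Effective stiffness of springs in series: 1/k_eff = sum of 1/k_i."""
    return 1.0 / sum(1.0 / k for k in stiffnesses)
```

For a folded domain 1000 times stiffer than the spacer, `series_extension_shares([1000.0, 1.0])` assigns about 99.9% of the extension to the spacer, and `series_stiffness([1000.0, 1.0])` is within about 0.1% of the spacer stiffness alone.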

    The loss of tertiary structure in such an unfolding event very often seems to be two-state-like. Only in very few cases can we detect partial unfolding of a folded protein domain (e.g. the traces on 3,212 linked GFP show partial unfolding along this axis). One may think that only the fully folded protein domain has mechanical stability, while secondary motifs alone do not. On the other hand, it may well be that after initiation (what is that?) of unfolding, all the subsequent structural rearrangements happen so fast that we cannot resolve them.

    Actually, I’m quite happy with the WLC model. I only use it to measure contour length gains. Once calibrated, one can infer structural information about the folded protein domains from those contour length gains.
    All the mechanical information about the folded domains resides in the forces at which unfolding events can be observed. With some modeling, one can extract barriers and potential widths of the folded protein domain. And currently, the protein domains investigated so far all behave mechanically quite differently (in terms of unfolding forces) and there is (to my knowledge) no picture at present which tells us why.
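The comment does not say which model is used to extract potential widths. One common choice for this kind of analysis is the Bell-Evans picture, in which the most probable unfolding force grows with the logarithm of the loading rate. The Python sketch below assumes that model, a constant loading rate, and kT of about 4.11 pN nm; the function and parameter names are hypothetical.

```python
import math

KT_PN_NM = 4.11  # assumed k_B*T near room temperature, in pN*nm


def most_probable_force(loading_rate, k0, dx_nm, kT=KT_PN_NM):
    """Bell-Evans most probable unfolding force (pN).

    loading_rate: force ramp in pN/s (assumed constant)
    k0:           unfolding rate at zero force, in 1/s
    dx_nm:        potential width (distance to the transition state), in nm
    """
    return (kT / dx_nm) * math.log(loading_rate * dx_nm / (k0 * kT))


def potential_width_from_slope(rate_a, force_a, rate_b, force_b, kT=KT_PN_NM):
    """Potential width (nm) from the slope of force versus ln(loading rate)."""
    slope = (force_b - force_a) / math.log(rate_b / rate_a)
    return kT / slope
```

Under these assumptions, measuring the most probable unfolding force at two or more loading rates gives the potential width from the slope, and the zero-force rate (and hence a barrier estimate) from the intercept.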

  5. ilya    4347 days ago    #

    Excellent discussion, gentlemen. I too saw the work presented at the biophysics conference.

    Correct me if I’m wrong, but isn’t the WLC model used because it offers the fewest adjustable parameters, all more complicated models being too underdetermined?

    Also, your work doesn’t account for possible twisting or torquing of the protein, is that correct?

    All in all, a very interesting paper and I am glad I sat in on that talk! I’m looking forward to seeing simultaneous fluorescence quenching studies on GFP under force unfolding.

  6. Andre    4347 days ago    #

    Hi Ilya,

    I think the reason for choosing the WLC is that it is often good enough, not that models incorporating an enthalpic contribution are necessarily underdetermined (the work on modelling DNA overstretching comes to mind).

    An important thing to remember is that even though it’s fairly simple, the WLC model has enough freedom that it looks “good enough” even when its inherent assumptions break down. This issue also came up several times during the same session at the Biophysical Society meeting, although not in relation to Hendrik’s work.

    A paper by Igor Kulic and friends addresses the case of looping and sliding during DNA extension and its effect on the apparent persistence length from WLC fits. Using reasonable parameters they find that results can be off by an order of magnitude. Here’s the paper:

  7. Bob Farley    2775 days ago    #

    I am trying to find the persistence length of a polymer given a set of data points, modeling the polymer as a WLC. I am trying to do this in Matlab and cannot get the right answer; can anyone help? So far I have been trying to use the Wikipedia formula <t(s2)·t(s1)> = exp(-s/A), where s = |s2 - s1| is the separation along the contour and A is the persistence length.
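A minimal sketch of the tangent-correlation approach asked about above, written in plain Python rather than Matlab. It assumes the data points are evenly spaced along the contour with a known bond length, that the chain is three-dimensional, and that a straight-line fit of ln<t·t> against s through the origin is adequate. All function names are hypothetical, and `synthetic_chain` exists only to generate test data.

```python
import math
import random


def unit_tangents(points):
    """Unit vectors along consecutive bonds of a chain of points."""
    tangents = []
    for p, q in zip(points, points[1:]):
        d = [b - a for a, b in zip(p, q)]
        norm = math.sqrt(sum(c * c for c in d))
        tangents.append([c / norm for c in d])
    return tangents


def tangent_correlation(tangents, max_lag):
    """Average of t(i) . t(i + lag) over the chain, for lag = 0..max_lag."""
    corr = []
    for lag in range(max_lag + 1):
        n = len(tangents) - lag
        total = sum(
            sum(a * b for a, b in zip(tangents[i], tangents[i + lag]))
            for i in range(n)
        )
        corr.append(total / n)
    return corr


def persistence_length(points, bond_length, max_lag=20):
    """Fit ln<t.t> = -s/A through the origin and return A."""
    corr = tangent_correlation(unit_tangents(points), max_lag)
    pairs = [(lag * bond_length, math.log(c))
             for lag, c in enumerate(corr) if lag > 0 and c > 0.0]
    slope = sum(s * y for s, y in pairs) / sum(s * s for s, _ in pairs)
    return -1.0 / slope


def synthetic_chain(n_bonds, bond_length, persistence, seed=0):
    """Discrete 3D chain whose tangent correlation decays roughly as exp(-s/A)."""
    rng = random.Random(seed)
    sigma = math.sqrt(bond_length / persistence)  # small-angle kick per bond
    point, tangent = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
    points = [tuple(point)]
    for _ in range(n_bonds):
        kicked = [c + rng.gauss(0.0, sigma) for c in tangent]
        norm = math.sqrt(sum(c * c for c in kicked))
        tangent = [c / norm for c in kicked]
        point = [p + bond_length * c for p, c in zip(point, tangent)]
        points.append(tuple(point))
    return points
```

Two common reasons for "not getting the right answer" are worth checking: the average must run over many starting positions (or many chains), not just one pair of tangents, and for a chain confined to a plane the correlation decays as exp(-s/(2A)), so the fitted value would need to be halved.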
