Wolfram Research, Inc.
Registered: Oct 2003
Manuscript Review of “Understanding SARS with the New Kind of Science”
My overall view of the paper is mixed, with a slight positive bent. The author’s presentation is very poor, and the conceptual foundation is largely unsubstantiated. However, some of the specific results are quite promising. This raised the possibility that they were an accidental fluke, but preliminary research gives reason to believe they are not.
Presentation and Style
The overall presentation of the paper is very poor, even when discounting the language barrier. No care is taken to ensure that either the conceptual or the practical components of the paper are clearly presented. Specifically:
- The author prominently relies on a self-coined notion called “Power-relationships”, without providing any explanation of the concept in the introduction of the paper. In the latter half the idea is better elucidated, yet the exposition remains ultimately lacking.
- One central component of the paper is the use of a cellular automaton to detect non-sequential homology between genomes. No complete explanation is given anywhere for the encoding of the DNA sequence into initial conditions. Only a partial explanation is given in the diagram legends section. This is non-trivial because the underlying theoretical foundation for why this CA works properly, particularly in relation to the author’s “Power-relationships” idea, is dependent on the encoding. I shall elaborate on this point in the Content section below.
The poor presentation of the paper is worth noting because it betrays the author’s inability to differentiate between trivial and non-trivial claims, and to recognize the need to carefully substantiate non-obvious claims.
Content
The situation is mixed in the content area, which required closer analysis that ultimately yielded interesting possibilities. The quality of the content varies substantially between the larger conceptual claims, which are poor, and the more concrete conclusions relating to non-sequential homology between SARS and Equine Rhinovirus (ER). On the latter point, my impression was quite positive and prompted further research.
Most of the problems in this area stem from the author’s poor and vague conceptualization of what is termed “Power-relationships”, an idea that I can best summarize as follows:
Different nucleotides in a genomic sequence stand for different “powers”, represented by white or black cells, with the distribution of cells determining how the power dynamic results in a final nested pattern. Aggregations of cells of one color cause rays to emerge, with more powerful rays (resulting from greater aggregations) destroying smaller rays of the opposing color. This interaction results in a distinctive nested pattern specific to a genome.
Below is a list of problems emerging from this poorly defined idea:
- No explanation is provided as to why this makes physicochemical sense. There is no reason to believe that nucleotides behave in this way, particularly ones that are quite distant on a DNA strand.
- The author contends that the method provides for non-sequential analysis of the DNA sequence, because DNA segments far apart can affect each other quite dramatically in the model. The idea is interesting, but again no physicochemical reasoning is provided as to why this specific “Power-relationship” model is the appropriate way to capture the non-localized interactions that are known to occur in genetic sequences.
- Exacerbating matters, the encoding of nucleotides into black/white cell sequences is not fully provided, making it unclear what distinguishes a black cell from a white one. I expect that the grouping is between purines and pyrimidines, the two biochemical classes of nucleotide bases. Even if this is the case, no reason is given for why this encoding makes biochemical sense.
- The CA studied results in three general patterns of behavior: nested, purely repetitive with left growth, and purely repetitive with right growth. All three possibilities are represented in the organisms under study. Based solely on this point, the author conjectures that because the three organisms come from phylogenetically different families, the high divergence of the CA behavior makes it a good categorization tool. There are many problems with this claim.
1. Three basic categories provide a very poor method for sorting viruses; there are far more than three viral families.
2. The author claims to have studied 63 viruses, when in reality only 4 different ones were studied. There are many variations of the basic 4 that result in the total figure of 63, but essentially only 4 genome families were considered. This is insufficient to suggest this method for phylogenetic analysis. It is quite possible that highly unrelated genomes would show similar behavior in this CA, and highly homologous ones would show very different behavior.
- The author further postulates, albeit not very explicitly, that the beginning and end of a nested triangle together represent a regulatory cycle. A white ray signifies the beginning of a protein production cycle, and a black ray signifies the end by closing off the triangle. As before, no reasoning or physical evidence is given as to why this is the case.
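To make the encoding concern concrete, here is a minimal sketch of what a purine/pyrimidine encoding followed by a CA evolution could look like. Everything in it is an assumption for illustration: the manuscript does not fully specify its encoding or its CA rule, so the mapping (purine to black, pyrimidine to white), the function names `encode`, `step` and `evolve`, the fixed white boundary, and the use of a two-color nearest-neighbor rule given by a rule number are all hypothetical and are not the author’s method.

```python
# Hypothetical encoding, NOT taken from the manuscript:
# purines (A, G) -> 1 (black cell), pyrimidines (C, T, U) -> 0 (white cell).
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def encode(seq):
    """Turn a nucleotide string into a list of 0/1 cells."""
    cells = []
    for base in seq.upper():
        if base in PURINES:
            cells.append(1)
        elif base in PYRIMIDINES:
            cells.append(0)
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return cells


def step(cells, rule):
    """Apply one step of a two-color, nearest-neighbor CA.

    `rule` is a rule number from 0 to 255. Cells beyond either end
    are treated as white (0); this boundary choice is an assumption.
    """
    n = len(cells)
    out = []
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0
        center = cells[i]
        right = cells[i + 1] if i < n - 1 else 0
        index = (left << 2) | (center << 1) | right
        out.append((rule >> index) & 1)
    return out


def evolve(cells, rule, steps):
    """Return the initial row followed by `steps` successive rows."""
    rows = [list(cells)]
    for _ in range(steps):
        rows.append(step(rows[-1], rule))
    return rows
```

Swapping the two sets in `encode`, or grouping the bases differently (for example A/T against G/C), changes the initial condition and therefore the pattern, which is why the missing encoding details matter for judging the “Power-relationships” claim.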
Conclusions on the specific claim of homology between SARS and ER
The author’s work in this area is much more interesting, and provides some evidence that his method may be generalizable to other organisms. It is not entirely clear whether the positive results were accidental, but initial analysis suggests they were not.
- The author claims that the studied CA suggests strong homology between SARS and Equine Rhinovirus, a surprising statement at first since the two viruses come from different families.
- The CA presented in diagram 3 is consistent with his claim. At first this suggested that the CA was not working properly, since the two viruses are quite divergent phylogenetically.
- Preliminary research found that a particular protein, the 3CL-PRO protease, is in fact shared between the two organisms, and that Pfizer is currently working on using a 3CL-PRO protease inhibitor that works on ER to suppress SARS.
- Diagram 4 gives a general sketch of the SARS genome and the various triangular structures that have analogues in ER’s genome. The paper provides no information as to whether the 3CL-PRO region in SARS actually maps to the corresponding region in ER.
- After further research, there is reason to suggest that this may in fact be the case. 3CL-PRO lies within the Replicase 1A protein, approximately at base 3500 and extending for 250 bases. The pattern of triangles produced in that region roughly maps to the same region of nesting that occurs in the ER genome.
The above points are non-trivial, and suggest that some non-sequential homology mapping is being done. The paper should have clearly included all of the above information to strengthen its point that 3CL-PRO is in fact the homologous region between the two genomes. This incompleteness notwithstanding, the results are still interesting and warrant further investigation.
It may prove fruitful to raise the following points with the author:
1. Specific issue of non-sequential homology, requesting more detailed analysis of the two genomes, including exact matching of the 3CL-PRO sequences.
2. More exhaustive comparisons beyond the 4 basic viruses, verifying whether the method is generalizable, and that observed results are not restricted to the genomes in question.
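As a rough illustration of the first point, the sketch below pulls out a region by start position and length and scores two regions against each other. It rests on assumptions: positions are taken to be 1-based, the helper names `region` and `similarity` are hypothetical, and the score is the generic string-similarity ratio from Python’s standard `difflib` module, which is only a crude stand-in for a proper sequence alignment.

```python
from difflib import SequenceMatcher


def region(genome, start, length):
    """Return `length` bases beginning at 1-based position `start`."""
    if start < 1 or start - 1 + length > len(genome):
        raise ValueError("region falls outside the genome")
    return genome[start - 1 : start - 1 + length]


def similarity(a, b):
    """Similarity ratio in [0, 1]; 1.0 means the strings are identical.

    This is difflib's generic ratio, not a biological alignment score.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()
```

With the approximate coordinates mentioned above, the SARS region would be `region(sars_genome, 3500, 250)`, where `sars_genome` is a placeholder for the full sequence string. The matching ER coordinates are not given in the paper and would have to come from the author.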