The Genetic Code and Proteins of the Other Covid-19 Vaccines

As a followup to Reverse Engineering the source code of the BioNTech/Pfizer SARS-CoV-2 Vaccine, here is a look at the genetic code behind some of the other vaccines. I recommend at least skimming the earlier post before delving into this one, unless you are already fluent in modified mRNA bases and protein expression mechanics.

To get an extremely full background on all vaccine work, I kindly refer you to excellent posts from Derek Lowe and Hilda Bastian. Derek and Hilda keep track of almost all relevant developments. Be sure to check their blogs for the latest updates.

In this post, I focus on the genetic differences between the major SARS-CoV-2 vaccines that are on or near the market.

Very briefly, there are four major vaccine kinds that are (nearly) ready: mRNA, viral vector, protein subunit and attenuated/inactivated virus.

The inactivated or attenuated variants basically contain a dead or harmless version of the SARS-CoV-2 virus, plus often an ‘adjuvant’ designed to scare our immune system into action. From a genetics standpoint, there is not a lot more to say about them, so I will leave this category to others.

The mRNA and viral vector vaccines both attempt to get some of our own cells to produce a small but vital part of the SARS-CoV-2 virus, the famous Spike protein. If this works, our bodies mount a full immune system response against both the S protein and crucially against key signs of cells producing this protein.

Protein subunit vaccines meanwhile inject the S protein directly, but formulated in such a way that it also leads to an immune system response.

On this page I will look at:

  • BioNTech/Pfizer/Fosun: BNT162b2, also known as Tozinameran, also known as Comirnaty. A modified mRNA-in-lipid-nanoparticle vaccine, expressing a modified S protein.
  • Moderna: mRNA-1273. A modified mRNA-in-lipid-nanoparticle vaccine, expressing a modified S protein.
  • CureVac: CVnCoV. An unmodified mRNA-in-lipid-nanoparticle vaccine, expressing a modified S protein.
  • Oxford/AstraZeneca: AZD1222, aka Oxford–AstraZeneca vaccine, Covishield, or ChAdOx1 nCoV-19. A viral vector vaccine, expressing the unmodified S protein.
  • Janssen/Johnson & Johnson: Ad26.COV2-S aka JNJ-78436735. A viral vector vaccine, expressing a modified S protein.
  • Gamaleya Research Institute of Epidemiology and Microbiology: Sputnik V aka Гам-КОВИД-Вак aka Gam-COVID-Vac. A viral vector vaccine using two different viruses, S protein modification unknown. But there IS a Twitter account!
  • Novavax: NVX-CoV2373. A protein subunit vaccine containing a doubly modified S protein, using a special ‘adjuvant’.

From this you can see that there are modified and unmodified mRNA vaccines, and unmodified and modified S proteins. From this list, CVnCoV, Ad26.COV2-S and NVX-CoV2373 have not received any marketing authorization. The rest are in wide or somewhat wide use.
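To keep the list above straight, here is a minimal Python sketch that records each vaccine's platform and how many S protein modifications the list mentions. The `Vaccine` class and its field names are made up for illustration, and counting "a modified S protein" as exactly one modification is a simplifying assumption on my part.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vaccine:
    name: str
    platform: str
    # Number of S protein modifications per the list above; None means unknown.
    spike_modifications: Optional[int]


VACCINES = [
    Vaccine("BNT162b2", "modified mRNA", 1),
    Vaccine("mRNA-1273", "modified mRNA", 1),
    Vaccine("CVnCoV", "unmodified mRNA", 1),
    Vaccine("AZD1222", "viral vector", 0),
    Vaccine("Ad26.COV2-S", "viral vector", 1),
    Vaccine("Sputnik V", "viral vector (two vectors)", None),
    Vaccine("NVX-CoV2373", "protein subunit", 2),
]

# Tally how many vaccines fall into each modification count.
tally = Counter(v.spike_modifications for v in VACCINES)

for count, how_many in tally.items():
    label = "unknown" if count is None else f"{count} modification(s)"
    print(f"{label}: {how_many} vaccine(s)")
```

Keeping "unknown" as `None` rather than `0` avoids silently lumping Sputnik V in with the unmodified Spike.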

mRNA vaccines: BioNTech, Moderna, CureVac

In the previous post, I described how the BioNTech/Pfizer vaccine uses modified mRNA to evade the cell’s own immune system. Derek Lowe also covers this aspect in RNA Vaccines And Their Lipids:

“There are also the modified bases like pseudouridine/1-methylpseudouridine that get read off at the ribosome like their native cousins (in this case, good ol’ uridine, U) but make the mRNA strand both more stable and less likely to set off an immune response against itself.”

Both the Moderna and BioNTech/Pfizer vaccines use such modified mRNA, although I think they don’t use the exact same modification. But in many ways, the Moderna and BioNTech vaccines are very alike. Sadly Moderna has not published the RNA sequence of its vaccine, so we can’t compare directly.
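The point in Derek Lowe's quote, that a modified base gets read at the ribosome just like plain U, can be sketched in a few lines of Python. This is a toy model under stated assumptions: `Ψ` is only a stand-in symbol for the modified base, the `translate` function is hypothetical, and the short RNA is an illustrative codon choice for the first few residues of the Spike protein, not the sequence of any actual vaccine.

```python
# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(rna: str) -> str:
    """Translate RNA to protein, reading the stand-in Ψ exactly like U."""
    plain = rna.replace("Ψ", "U")
    if len(plain) % 3 != 0:
        raise ValueError("RNA length must be a multiple of three")
    return "".join(CODON_TABLE[plain[i:i + 3]] for i in range(0, len(plain), 3))


# Illustrative codons only (assumption), encoding eleven residues.
plain_rna = "".join(
    ["AUG", "UUU", "GUU", "UUU", "CUU", "GUU", "UUA", "UUG", "CCA", "CUA", "GUC"]
)
# Swap every U for the modified stand-in, as a modified mRNA vaccine would.
modified_rna = plain_rna.replace("U", "Ψ")

print(translate(plain_rna))
print(translate(modified_rna))
```

Both calls print the same protein: the modification changes how the cell reacts to the RNA, not what the ribosome builds from it.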

CureVac’s CVnCoV vaccine candidate uses regular mRNA, but it has a rather specially designed tail. I wondered a bit about this and happened to chance on CureVac’s founder Ingmar Hoerr on LinkedIn, so I asked him.

He responded in public (which is very nice of him):

“For what reason you should have chemical modifications? This makes sense for gene expression for proteins avoiding immune responses as published by Kariko and Weismann. In terms of vaccines I am not convinced to have these modifications”

Now, he founded a whole mRNA company, so we should assume he knows what he is talking about.

There are patents covering modified mRNA bases, so it is possible that there are intellectual property reasons not to use modified RNA. In a Tweet, CureVac has denied IPR played a role:

“Hello, as pioneers in mRNA technology we always focused on optimizing unmodified mRNA, which in our hands allows for a potent fine-tuning of mRNA characteristics. Our technology is not directed by IP related topics but is a powerful approach offering a differentiated action mode.”

Regular mRNA frightens our immune system, and is therefore a potential adjuvant that can enhance the efficacy of a vaccine tremendously. But simultaneously, research appears to show that modified mRNA is two orders of magnitude better at expressing proteins. So I’m not sure if using plain mRNA is such a great idea. The performance of CVnCoV in animal tests is somewhat “meh” so far.

Viral vector vaccines

As noted in my previous post on the BioNTech/Pfizer SARS-CoV-2 vaccine, mRNA vaccines trick our bodies into creating (modified) S proteins by slipping some mRNA into our cells. This mRNA then dutifully gets read and S proteins are expressed.

These mRNA vaccines work very well, and this is likely because they not only generate S proteins, they do so in a very virus-like way. This is extremely educational for our immune system - it very much looks like the real thing.

Another way to generate a very virus-like experience is to use an actual (harmless) virus, but to first equip that virus with instructions to make the SARS-CoV-2 S protein.

The vaccine then infects you in a limited way with a harmless virus that then produces the S protein, much like the mRNA vaccines do. And this then sets the immune system in motion etc.

The Oxford/AstraZeneca, Janssen and Sputnik vaccines all use modified Adenoviruses as a ‘vector’. The Oxford one is called ‘ChAdOx1’, and is derived from two different adenoviruses, one human, one simian (ape):

“ChAdOx1 has been derived from a simian adenovirus (ChAd) serotype Y25 engineered by λ red recombination to exchange the native E4 orf4, orf6 and orf6/7 genes for those from human adenovirus HAdV-C5.”

The Janssen one is based on Ad26, and it may please you to know this one was first isolated from a 9 month old baby:

“The adenovirus type 26 (Ad26) wild type virus was first isolated in 1956 from an anal specimen of a 9-month-old male child”

Bet you needed to know this.

There are some important differences between the mRNA vaccines and the viral vector versions. Adenoviruses are sturdy double-stranded DNA viruses, whereas mRNA vaccines come with stringent cooling and handling requirements.

DNA is extremely stable compared to RNA. We routinely extract intact DNA from 50000 year old skeletons, for example. Viral vector vaccines do not need special cooling, and you can also shake them at will, or even do vaccinations in full sunlight. These are very important advantages.

The viruses have also been modified to be extremely efficient to produce. Although a gram of virus is already a stunning amount, Janssen reports they use 1000 liter vats now.

Now, a worry might be that if you receive a viral vector vaccine, you would develop antibodies both against the viral vector itself and against the Spike protein in there. This could then effectively mean you could use this technique exactly once per person.

But apparently this is not an issue - perhaps the viral vector is modified enough that it doesn’t impress our immune system much:

“[A] clear impact of natural- or vector-induced pre-existing immunity to Ad26 on vaccine immunogenicity has not been observed to date in clinical studies. Results from clinical trials assessing repeated administration of Ad26-based vaccination approaches showed that a second or subsequent dose of study vaccine was able to boost humoral and cellular immune responses” - Vaccines based on replication incompetent Ad26 viral vectors: Standardized template with key considerations for a risk/benefit assessment.

One crucial modification to the vector viruses is that they do not reproduce - this may be important for not generating an (effective) immune response against the vector itself.

The Oxford/AstraZeneca vaccine

What can I say. This vaccine has had a rather chaotic testing process. There were several different trial regimes, and the results of these have been combined. Part of the data may be spoiled because the vaccine dose was miscalibrated. There have also been some ugly lawsuits over side effects. It is all in all not a very attractive story.

That does not necessarily mean the vaccine is bad, but it is a somewhat messy situation.

On the 12th of January, AstraZeneca applied for conditional marketing authorization at the European Medicines Agency, and with some luck the data submitted will provide clarity. Various other governments have already approved the vaccine. From an early look at the data, it appears little is known about the efficacy of the vaccine for those aged 65 and over.

One thing does stand out - AZD1222 is the only vaccine to use the unmodified spike protein.

So why modify the protein at all?

If you look at a real SARS-CoV-2 particle, you can see the Spike protein as, well, a bunch of spikes:

SARS virus particles (Wikipedia)

This spike is involved in the ‘fusion’ process of a viral infection. I spoke a bit about this in my previous post, but I want to provide some more detail here.

These spikes are stable in the pre-fusion conformation (shape) as mounted on the outside of the SARS-CoV-2 virus. But if left alone, for example if generated by a vaccine that does not generate a whole virus, the conformation can easily change.

The way to see it is that if the Spike is to be successful at its fusion thing, the post-fusion state must be lower energy. Or seen another way, the pre-fusion spike is actually ‘high strung’. Perhaps compare it to a mouse trap that is set. Once the protein gets a chance to fuse, that tension (energy) is released into the fusion process.

Given this, it is no surprise that a freestanding Spike, not mounted on a virus body, might fall back to a lower energy state, much like a mouse trap that is jostled.

The problem now is that we would like our immune system to develop immunity against the pre-fusion Spike. But left alone, the Spike could collapse, and this might lead to a less useful set of antibodies - they are optimized for the wrong thing, a post-fusion Spike.

Now, the jury is out on how much of a problem this is. For some other viruses (like RSV), the impact of protein conformation has been huge.

Most SARS-CoV-2 vaccines have chosen to use a slightly modified S gene where two amino acids have been changed into Prolines, adding a lot of stability. In lab testing, this increased expression by a factor of fifty. Further ‘HexaPro’ modifications are even more impressive.
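To make the two-Proline change concrete, here is a small Python sketch that applies it to a short stretch of the protein. Some assumptions to flag: the post does not name the positions, but they are commonly reported as K986P and V987P; the ten-residue fragment and its starting position are quoted from memory of the reference Spike sequence and should be treated as illustrative; and `apply_point_mutations` is a hypothetical helper, not part of any library.

```python
# Illustrative fragment of the Spike protein (assumption: first residue is 982).
FRAGMENT_START = 982
WILD_TYPE = "SRLDKVEAEV"

# The '2P' change as commonly reported: (position, old residue, new residue).
TWO_P = [(986, "K", "P"), (987, "V", "P")]


def apply_point_mutations(fragment, fragment_start, mutations):
    """Return a copy of fragment with each point mutation applied.

    Positions use protein numbering; the old residue is checked first so a
    wrong offset fails loudly instead of silently mutating the wrong spot.
    """
    residues = list(fragment)
    for position, old, new in mutations:
        index = position - fragment_start
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the fragment")
        if residues[index] != old:
            raise ValueError(
                f"expected {old} at {position}, found {residues[index]}"
            )
        residues[index] = new
    return "".join(residues)


stabilised = apply_point_mutations(WILD_TYPE, FRAGMENT_START, TWO_P)
print(WILD_TYPE, "->", stabilised)
```

Only two letters out of well over a thousand change, which is why this counts as a "slightly modified" S gene.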

The Oxford/AstraZeneca vaccine contains the unmodified Spike, and we don’t know if that has been a factor in the somewhat disappointing performance as reported so far. We may wonder if intellectual property considerations have played a role in forgoing the use of the modification.

Note: AZD1222 might function well when administered intranasally. In this study, the nasal route outperformed injection.

The Janssen Ad26 vaccine does contain the ‘2P’ modification, and we anxiously await their numbers, supposedly due January 21st. Initial antibody numbers look promising.

It may be possible that the Janssen vaccine ends up as a very stable single shot solution, and that would simply be wonderful.

Finally, the Sputnik V vaccine uses not one but two different viral vectors, one derived from Ad5, one from Ad26. The trials of Sputnik V have been messy and some of the government approvals have been weird to say the least. It is also not well known what is in the vaccine, or if the S protein has been modified in any way. But, this may yet turn out to be a valuable vaccine.

Protein subunit

So this is somewhat of an odd one out on this page, as no DNA or RNA is part of the vaccine.

Whereas the other vaccines are somewhat of an IKEA assemble-it-yourself construct, NVX-CoV2373 from Novavax injects a stabilised S protein directly.

The proteins are part of a 27.2nm nanoparticle.

NVX-CoV2373 contains the modified S protein with the two Proline substitutions. On top of that, three amino acids are changed (682-RRAR-685 to 682-QQAQ-685) to protect the protein against proteases. This presumably allows the proteins to stick around long enough that the immune system has time to get to work.
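The protease-resistance change can be sketched the same way. In the Python below, the 682-RRAR-685 to 682-QQAQ-685 swap comes straight from the text, but the rest is assumption: the flanking residues of the fragment are quoted from memory of the reference sequence, and the `R..R` pattern is a deliberately crude stand-in for what a furin-like protease recognises, not a real cleavage predictor.

```python
import re

# Illustrative fragment around the cleavage site (assumption: starts at 679).
FRAGMENT_START = 679
WILD_TYPE = "NSPRRARSVAS"

# Crude stand-in for a furin-like recognition motif: Arg, any two, Arg.
FURIN_LIKE = re.compile(r"R..R")


def replace_span(fragment, fragment_start, first, last, expected, replacement):
    """Swap residues first..last (inclusive, protein numbering) for replacement."""
    lo = first - fragment_start
    hi = last - fragment_start + 1
    if fragment[lo:hi] != expected:
        raise ValueError(f"expected {expected}, found {fragment[lo:hi]}")
    if len(replacement) != len(expected):
        raise ValueError("replacement must keep the protein length unchanged")
    return fragment[:lo] + replacement + fragment[hi:]


resistant = replace_span(WILD_TYPE, FRAGMENT_START, 682, 685, "RRAR", "QQAQ")
changed = sum(a != b for a, b in zip(WILD_TYPE, resistant))

print(WILD_TYPE, "->", resistant)
print("residues changed:", changed)
print("motif in wild type:", bool(FURIN_LIKE.search(WILD_TYPE)))
print("motif after change:", bool(FURIN_LIKE.search(resistant)))
```

Four positions are rewritten, but the Alanine stays an Alanine, so only three amino acids actually change, and the toy motif no longer matches afterwards.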

Since no cells are actually getting infected, this vaccine needs an adjuvant to excite our immune system. This adjuvant goes by the exciting name of ‘Matrix-M’, and it is based on a saponin organic chemical, usually derived from a plant.

This combination also appears to work well. The actual S protein is in this case not made by our own cells, but by insect cells infected with a baculovirus.


There are lots of proven or promising vaccines. Some of them use unmodified RNA, some use modified RNA, some use adenovirus constructs.

All of them use the S protein, but one vaccine has no modifications, four vaccines have 1 modification, and one has 2 modifications. And for Sputnik V we don’t know, but we could perhaps ask its Twitter account.

We will shortly know how well some of the adenovirus candidates work, but some initial signs are already very promising. The same goes for NVX-CoV2373.