What Your 23andMe Data Can Actually Tell You About Peptides

Your 23andMe raw file is a survey of 650,000 known variants. Here is which peptide-response SNPs it reliably genotypes, what gaps exist across chip versions, and why the data is more useful than most realize.

11 min read · May 26, 2026
PeptidesDNA Research

Editorial Team

TL;DR

  1. Your 23andMe file covers 650,000 data points out of 3 billion in your genome. It's a targeted survey, not a full readout of your DNA.
  2. The four gene variants most useful for peptide selection are reliably covered by 23andMe, AncestryDNA, and MyHeritage.
  3. One key drug-interaction gene is missing from older 23andMe chips (2013-2017). You need the current version or a clinical test to get it.
  4. If you tested between 2013 and 2017, your file has weaker coverage for peptide-relevant genes than newer chips.
  5. Your existing DNA data gives a solid first pass on peptide genetics. It covers the biggest factors well, but it's not a clinical-grade test.

Your 23andMe raw data file has roughly 650,000 rows. Your genome has approximately 3 billion base pairs and an estimated 4 to 5 million variable positions. That means your raw data covers about 0.02% of your DNA. For most health questions, that slice is adequate because the variants that matter most are common enough to be on the chip. For peptide response prediction specifically, the question is which 0.02% your chip actually measured, and whether that set includes the variants relevant to your protocol. The answer, it turns out, is mostly yes for the high-impact axes and no for a targeted set of rare variants that matter in specific circumstances.
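
The coverage figures above are simple ratios. As a quick sketch of the arithmetic, using only the approximate numbers quoted in this article (the variable names are illustrative):

```python
# Approximate figures quoted in the article.
CHIP_SNPS = 650_000                          # rows in a typical raw data file
GENOME_BASE_PAIRS = 3_000_000_000            # approximate genome length
VARIABLE_POSITIONS = (4_000_000, 5_000_000)  # estimated range of variable sites


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


genome_coverage = percent(CHIP_SNPS, GENOME_BASE_PAIRS)
variable_coverage = tuple(percent(CHIP_SNPS, n) for n in VARIABLE_POSITIONS)

print(f"Share of all base pairs: {genome_coverage:.3f}%")
print(f"Share of variable positions: {variable_coverage[1]:.0f}% to {variable_coverage[0]:.0f}%")
```

Measured against the whole genome the chip samples about 0.02% of positions. Measured against the estimated variable positions alone, the same 650,000 rows are a much larger share, which is why a small slice can still answer many common-variant questions.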

650,000

Approximate number of SNPs in a typical 23andMe v5 raw data file, out of approximately 4 to 5 million variable positions in the human genome. A 2024 analysis in Frontiers in Pharmacology (Yin et al., PMID 38529185) confirmed that the Illumina Global Screening Array -- the chip behind 23andMe v5 -- covers 503 pharmacogenomic variants across 25 genes, with 99.48% inter-run concordance for the variants it does include.

The good news is structural: the four genetic axes most predictive of peptide response are all common variants that appear across every major consumer genotyping array. COMT, BDNF, SOD2, and MC4R variants are directly genotyped on 23andMe (v4 and v5), AncestryDNA, and MyHeritage. If you have a raw data file from any of these providers, you almost certainly have reliable calls for the variants that explain the most inter-individual variance in peptide response. The gaps are real, but they are concentrated in a specific set of drug-interaction loci and rare pathway variants that matter mainly in edge cases.

In plain English

Think of your 23andMe file as a questionnaire with 650,000 targeted questions about your DNA. The questionnaire was designed mostly for ancestry and common disease risk. Some of the most useful peptide questions happen to be on it, because those variants are common enough that every major study includes them. Others are not on the questionnaire at all. The skill is knowing which questions your existing file can already answer.

The chip version problem

Does your 23andMe version matter for peptide genetics?

Yes, and more than most people realize. 23andMe has deployed four meaningfully different genotyping arrays since 2007. The current chip (v5, in use since 2017) uses Illumina's Global Screening Array, which was designed with explicit pharmacogenomics coverage as a goal. The v4 chip (2013 to 2017) is the problem case: it dropped a significant number of SNPs relative to v3, including several pharmacogenomically relevant loci, which reduced its coverage of drug-metabolism and pathway genes. If you tested between 2013 and 2017, your file has structurally different coverage from someone who tested before or after that window.

The most common consumer assumption is that two people both "have 23andMe data" and are starting from equivalent inputs. They are not. Depending on when you tested, your raw data file may or may not directly genotype certain metabolizer variants. That is not a failure of the service -- the earlier chips were designed for different primary use cases -- but it is a material consideration when using the data for peptide protocol design.

| Version | Years active | SNPs on chip | PGx coverage | Peptide-relevant notes |
| --- | --- | --- | --- | --- |
| v3 | 2010 to 2013 | ~960,000 | Good | High coverage including many pharmacogenomics loci. Well-covered for core peptide axes. |
| v4 | 2013 to 2017 | ~580,000 | Reduced | Dropped many SNPs from v3 including some pharmacogenomic variants. CYP3A4 *22 absent or unreliable. COMT and SOD2 still present. |
| v5 | 2017 to present | ~640,000 | Strong | Illumina GSA design includes explicit PGx targets. Best consumer array for peptide-relevant coverage. CYP3A4 *22 directly genotyped. |
| v1/v2 | 2007 to 2010 | ~550,000 | Limited | Legacy arrays. Not suitable for reliable peptide genetics without clinical confirmation on key variants. |

How to find out which version you have

Log in to 23andMe, go to Settings, select "23andMe Data", then click "View" next to your profile and download your raw data. Your chip version appears in the file header: open the downloaded file and look at the comment lines at the top, which start with #. If you tested between 2013 and 2017, you almost certainly have v4. Testing between 2010 and 2013 means v3, and testing after 2017 means v5. Your version tells you whether the CYP3A4 *22 variant -- the most clinically important peptide-adjacent metabolizer variant -- was directly measured or was absent from the chip.
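
The check described above can be sketched in a few lines. This is a minimal illustration, not 23andMe tooling: the function names and the sample header are hypothetical, and the exact header wording can differ between exports, so the sketch falls back to the "years active" windows from the table when no version token is found.

```python
import io
import re

# Years active per chip version, from the table above (None means still current).
CHIP_YEARS = {
    "v1/v2": (2007, 2010),
    "v3": (2010, 2013),
    "v4": (2013, 2017),
    "v5": (2017, None),
}


def read_header(handle):
    """Collect the leading '#' comment lines of a raw data file."""
    header = []
    for line in handle:
        if not line.startswith("#"):
            break
        header.append(line.rstrip("\n"))
    return header


def version_from_header(header):
    """Return a token like 'v5' if a header line names one, else None."""
    for line in header:
        match = re.search(r"\bv([1-5])\b", line, flags=re.IGNORECASE)
        if match:
            return f"v{match.group(1)}"
    return None


def candidates_by_year(year):
    """Versions whose active window includes the test year (boundary years give two)."""
    return [
        version
        for version, (start, end) in CHIP_YEARS.items()
        if start <= year and (end is None or year <= end)
    ]


# Hypothetical header, for illustration only.
sample = io.StringIO("# Illustrative header line\n# chip version: v5\nrs4680\t22\t19951271\tAG\n")
header = read_header(sample)
print(version_from_header(header) or candidates_by_year(2015))
```

A boundary year such as 2013 returns two candidates, which is a prompt to confirm the version some other way rather than to guess.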

What it covers well

Which peptide-relevant SNPs does 23andMe reliably genotype?

The variants that matter most for peptide protocol design are common enough that all major consumer arrays include them. A 2024 pharmacogenomics workflow study (Yin et al., Frontiers in Pharmacology, PMID 38529185) independently validated that the GSA chip -- the foundation of 23andMe v5 -- covers 21 of the 35 Tier 1 "Very Important Pharmacogenes" classified by PharmGKB, with 99.48% inter-run concordance. For the SNPs below, that concordance figure means your genotype call is highly reliable.

COMT rs4680 (Val158Met): Directly genotyped on all major arrays. Determines catecholamine clearance rate in the prefrontal cortex. The primary genetic input for nootropic peptide selection, including semax, selank, and BPC-157 neurological protocols. Reliable on 23andMe v3/v4/v5, AncestryDNA, and MyHeritage.

BDNF rs6265 (Val66Met): Directly genotyped universally. Affects activity-dependent secretion of brain-derived neurotrophic factor. Met carriers have blunted neuroplasticity signaling and typically underperform on GH secretagogues alone. Reliable on all consumer arrays.

SOD2 rs4880 (Ala16Val): Directly genotyped on all major arrays. Val/Val genotype reduces mitochondrial antioxidant enzyme activity 30 to 40%. The primary predictor of GHK-Cu and SS-31 response magnitude. Reliable on 23andMe, AncestryDNA, and MyHeritage.

NOS3 rs1799983 (Glu298Asp) and MC4R rs17782313: Both directly genotyped on all major arrays. NOS3 predicts nitric oxide production capacity for BPC-157 response. MC4R affects satiety signaling downstream of GLP-1 receptor activation. Universal coverage across consumer platforms.

GLP1R rs6923761 -- the variant associated with an approximately 58% difference in the rate of weight loss on semaglutide between genotypes -- is where coverage becomes less certain. Its presence on the v5 chip manifest has not been confirmed in independent public sources, meaning it may require statistical imputation even on v5 data. A 2025 prospective study documented why this specific variant matters enormously:

The GLP1R rs6923761 A/A genotype was associated with a significantly higher monthly rate of weight loss (1.64% vs 1.04% in G-allele carriers), representing an approximately 58% difference in weight loss velocity on semaglutide 2.4 mg. These findings suggest GLP1R genotype should be considered a clinically meaningful predictor of GLP-1 receptor agonist outcomes.

Phan et al., Obesity (Wiley), 2025, PMID 40384505
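
The 58% figure follows directly from the two monthly rates in the quote. A short check of that arithmetic (the constant names are illustrative):

```python
# Monthly weight-loss rates quoted above (percent of body weight per month).
RATE_AA = 1.64         # A/A genotype
RATE_G_CARRIER = 1.04  # G-allele carriers

absolute_gap = RATE_AA - RATE_G_CARRIER
relative_difference = absolute_gap / RATE_G_CARRIER

print(f"Absolute gap: {absolute_gap:.2f} percentage points per month")
print(f"Relative difference: {relative_difference:.1%}")
```

The relative figure is large because the baseline rate is small; the absolute gap is 0.60 percentage points of body weight per month.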

When this variant is directly genotyped, the result is reliable. When it is imputed from nearby markers, it carries statistical uncertainty that the reported genotype does not make explicit. If you want a confirmed call on this specific variant, a dedicated peptide genetics panel or a clinical test gives you a direct measurement; data from pre-2017 23andMe chips or AncestryDNA is imputed, and direct coverage on v5 is unconfirmed. For the full picture of what GLP1R predicts about semaglutide response and which related variants also matter, the semaglutide genetics guide covers the pharmacogenomics in depth.
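
To make the imputation caveat concrete: imputation tools generally produce a probability for each possible genotype, and a reported two-letter call keeps only the most likely one. The sketch below assumes that representation; the probabilities are invented for illustration, and the 0.9 cutoff is an arbitrary example rather than a standard any provider is known to use.

```python
from typing import NamedTuple


class ImputedCall(NamedTuple):
    genotype: str
    probability: float
    confident: bool


def hard_call(probs: dict[str, float], threshold: float = 0.9) -> ImputedCall:
    """Pick the most probable genotype and flag whether it clears the threshold."""
    total = sum(probs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("genotype probabilities must sum to 1")
    genotype, probability = max(probs.items(), key=lambda kv: kv[1])
    return ImputedCall(genotype, probability, probability >= threshold)


# Hypothetical posterior probabilities for GLP1R rs6923761 (alleles G and A).
well_tagged = hard_call({"GG": 0.97, "AG": 0.03, "AA": 0.00})
poorly_tagged = hard_call({"GG": 0.55, "AG": 0.40, "AA": 0.05})

print(well_tagged)
print(poorly_tagged)
```

Both inputs produce the same reported genotype, GG, but only the first would be trustworthy. That difference is what is lost when a report shows the call without its probability.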

| Gene / SNP | 23andMe v5 | 23andMe v4 | AncestryDNA | MyHeritage |
| --- | --- | --- | --- | --- |
| COMT rs4680 | Direct | Direct | Direct | Direct |
| BDNF rs6265 | Direct | Direct | Direct | Direct |
| SOD2 rs4880 | Direct | Direct | Direct | Direct |
| NOS3 rs1799983 | Direct | Direct | Direct | Direct |
| MC4R rs17782313 | Direct | Direct | Direct | Direct |
| GLP1R rs6923761 | Uncertain (possibly imputed) | Imputed | Imputed | Imputed |
| CYP3A4 *22 (rs35599367) | Direct | Absent / unreliable | Some versions | Imputed |
| CYP2D6 copy number variants | Not detectable | Not detectable | Not detectable | Not detectable |

Where the gaps are

What does 23andMe miss for peptide genetics?

The false-positive nuance you need to understand

A landmark 2018 study in Genetics in Medicine (Tandy-Connor et al., PMID 29565420) found that 40% of medically classified variants reported by direct-to-consumer genetic tests were false positives when confirmed in a clinical laboratory. This number is real, but it applies almost entirely to rare pathogenic variants in cancer genes like BRCA1 and BRCA2 -- not to the common pharmacogenomic SNPs relevant to peptide response. For variants with a minor allele frequency above 1%, which includes all of COMT, BDNF, SOD2, and NOS3, the GSA chip's design accuracy is substantially higher. The 40% figure is not a reason to distrust your COMT genotype call. It is a reason to distrust any consumer array result for rare cancer variants, which is a completely different category of data.

The CYP metabolizer blind spots

CYP3A4 *22 (rs35599367) is the slow-metabolizer allele most relevant to drug interaction risk in peptide protocols. Someone combining research peptides with prescription medications needs to know if they are a CYP3A4 slow metabolizer, because reduced enzyme capacity raises serum concentrations of co-medications that clear through this pathway. On 23andMe v5, this variant is directly genotyped. On v4 chips, it is absent or unreliable. The coverage on AncestryDNA depends on which version of their array you tested on. If your protocol involves any prescription medications alongside peptides, knowing your CYP3A4 status from a v5 chip or a clinical test is a safety input, not just an optimization input. The CYP enzymes guide covers the full metabolizer phenotype spectrum and what each variant actually determines for a peptide protocol.

CYP2D6 is a more fundamental gap. The major star alleles that define ultrarapid and poor metabolizer phenotypes involve copy number variations -- gene duplications and deletions -- that no SNP genotyping chip can detect, because SNP arrays measure single-base changes, not structural gene rearrangements. A consumer array might detect a few CYP2D6 point mutations that suggest reduced activity, but it cannot confirm your full metabolizer phenotype without a gene dosage assay. For the peptides most commonly used in research protocols (BPC-157, TB-500, GHK-Cu, ipamorelin), this gap is less critical because these peptides clear primarily via proteolytic enzymes rather than CYP2D6. For anyone combining peptides with opioids, many antidepressants, or antipsychotics, a clinical CYP2D6 test provides information the consumer array cannot.

Rare tissue-repair pathway variants

The consumer genotyping arrays were designed to detect common variants present in at least 1% of the population. The tissue-repair pathway variants most relevant to healing peptide response -- including some MMP3, VEGF, IGF1R, and COL5A1 loci -- have some common SNPs on the arrays, but the rarer functional variants with the clearest experimental evidence are often below the chip's design frequency. What this means practically: for BPC-157 and TB-500 response, your 23andMe data gives a first-pass signal on the common variants in these pathways, but it cannot rule out functionally relevant rare variants that the chip never tested for. This is not a reason to dismiss the common-variant signal. It is a reason to understand what confidence level that signal carries.

Using your raw file

How do you actually extract peptide insights from your raw data?

The raw data file is a plain text file with roughly 650,000 rows. Each row has four tab-separated fields: an rsID, a chromosome, a position, and a two-letter genotype call (for example, AA, AG, or GG). No interpretation is included. Extracting peptide insights requires matching your rsIDs against known variant associations. You can do this manually for a handful of specific variants or through an analysis service that processes all relevant rsIDs together.
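
As a minimal sketch of reading that layout, assuming tab-separated rows, `#` comment lines, and `--` for a position the chip could not call: the sample rows below are illustrative, not real data, and `parse_raw` is a hypothetical helper name.

```python
import csv
import io

# Illustrative excerpt in the 23andMe-style layout; genotypes are made up.
SAMPLE = """\
# Illustrative excerpt, not real data
# rsid\tchromosome\tposition\tgenotype
rs4680\t22\t19951271\tAG
rs6265\t11\t27679916\tCC
rs4880\t6\t160113872\t--
"""


def parse_raw(handle):
    """Map rsID -> (chromosome, position, genotype), skipping '#' comment lines."""
    calls = {}
    data_lines = (line for line in handle if not line.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) != 4:
            continue  # ignore blank or malformed lines
        rsid, chromosome, position, genotype = row
        calls[rsid] = (chromosome, int(position), genotype)
    return calls


calls = parse_raw(io.StringIO(SAMPLE))
print(calls["rs4680"])
```

A dictionary keyed by rsID keeps each later lookup to a single step, and the full file fits comfortably in memory at this size.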

Step 1: Download your raw data. Log in to 23andMe. Go to Settings, then "23andMe Data". Click "View" next to your profile. Select "All DNA Raw Data" and download. The file arrives as a compressed .txt file. Given 23andMe's 2025 bankruptcy and acquisition, downloading your data now protects your access regardless of future platform changes.

Step 2: Check your chip version. Open the file header, which contains lines starting with #. Your chip version appears in the metadata. v5 means strong peptide-relevant coverage including CYP3A4 *22. v4 (tested 2013 to 2017) means that specific metabolizer variant may be absent. Factor this into your confidence level on drug-interaction results specifically.

Step 3: Look up the five key rsIDs manually. For the five highest-priority peptide variants (COMT rs4680, BDNF rs6265, SOD2 rs4880, GLP1R rs6923761, NOS3 rs1799983), open the text file and search for each rsID. Your two-letter genotype call corresponds to a functional phenotype for each variant. This manual lookup covers the primary axes without needing an analysis service.

Step 4: Use a dedicated analysis for the full picture. Manual lookup covers the primary axes but misses cross-variant interactions, evidence-tiered weighting, and CYP metabolizer phenotyping. A dedicated peptide genetics service processes all relevant rsIDs, applies functional annotation, and ranks peptide candidates by predicted response for your full genetic profile.
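
The manual lookup in Step 3 can be sketched as a single pass over the file. This assumes the tab-separated 23andMe layout with `--` as the no-call marker; `lookup_key_variants` is a hypothetical helper, and the sample rows are invented. It reports genotypes only and does not attempt any phenotype interpretation.

```python
import io

# The five priority variants named in Step 3.
KEY_RSIDS = {
    "rs4680": "COMT",
    "rs6265": "BDNF",
    "rs4880": "SOD2",
    "rs6923761": "GLP1R",
    "rs1799983": "NOS3",
}


def lookup_key_variants(handle, wanted=KEY_RSIDS):
    """Return {rsid: status}: a genotype, 'no-call', or 'not on chip'."""
    found = {rsid: "not on chip" for rsid in wanted}
    for line in handle:
        if line.startswith("#"):
            continue
        fields = line.strip().split("\t")
        if len(fields) == 4 and fields[0] in wanted:
            found[fields[0]] = "no-call" if fields[3] == "--" else fields[3]
    return found


# Invented rows for illustration; a real file has several hundred thousand.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs4680\t22\t19951271\tAG\n"
    "rs6265\t11\t27679916\t--\n"
    "rs4880\t6\t160113872\tAA\n"
)

result = lookup_key_variants(io.StringIO(SAMPLE))
for rsid, gene in KEY_RSIDS.items():
    print(f"{gene} {rsid}: {result[rsid]}")
```

With a real download, the same function can be given an open file object instead of the in-memory sample. An rsID that comes back as "not on chip" was never measured directly, so any value a report shows for it is imputed.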

AncestryDNA and MyHeritage raw files use the same rsID-based structure as 23andMe but have different header formats and slightly different SNP coverage. The analysis logic is identical. If you have data from any of these providers, the same lookup process applies. The differences that matter are the coverage gaps described above, particularly for GLP1R and CYP3A4. See the DNA-first decision framework for a step-by-step walkthrough of how to apply the four genetic axes once you have your variant calls in hand.
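
Because the layouts differ, a small normalization step lets the same lookup run on any of the three files. The column layouts below are assumptions, not documented guarantees: tab-separated with split `allele1` and `allele2` columns for AncestryDNA, and comma-separated quoted fields for MyHeritage. Check them against the header of your own file before relying on this. `normalize_rows` is a hypothetical helper.

```python
import csv
import io

PROVIDERS = ("23andme", "ancestrydna", "myheritage")


def normalize_rows(handle, provider):
    """Yield (rsid, genotype) pairs from a provider-specific raw file.

    Column layouts are assumptions; verify them against your own file.
    """
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    delimiter = "," if provider == "myheritage" else "\t"
    min_columns = 5 if provider == "ancestrydna" else 4
    lines = (ln for ln in handle if ln.strip() and not ln.startswith("#"))
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < min_columns or row[0].lower() == "rsid":
            continue  # skip the column-name row and malformed lines
        if provider == "ancestrydna":
            yield row[0], row[3] + row[4]  # join allele1 and allele2
        else:
            yield row[0], row[3]


# Invented examples of each assumed layout.
ancestry = "#AncestryDNA sample\nrsid\tchromosome\tposition\tallele1\tallele2\nrs4680\t22\t19951271\tA\tG\n"
myheritage = '# MyHeritage sample\nRSID,CHROMOSOME,POSITION,RESULT\n"rs4680","22","19951271","AG"\n'

ancestry_calls = dict(normalize_rows(io.StringIO(ancestry), "ancestrydna"))
myheritage_calls = dict(normalize_rows(io.StringIO(myheritage), "myheritage"))
print(ancestry_calls, myheritage_calls)
```

Once every provider's rows are reduced to rsID and genotype pairs, the downstream lookup does not need to know where the file came from.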

The bankruptcy question

Is your 23andMe data still accessible after the 2025 bankruptcy?

23andMe filed for Chapter 11 bankruptcy on March 23, 2025. In June 2025, a US bankruptcy court approved a $305 million sale of 23andMe's genetic platform and data to TTAM Research Institute, a nonprofit organization co-founded by 23andMe co-founder Anne Wojcicki (source: Bloomberg Law, June 2025; NPR, June 2025). The sale closed in July 2025. Raw data downloads remained available through the 23andMe customer portal as of early 2026, but the ownership transition and the absence of federal genetic data protections in US bankruptcy proceedings created legitimate questions about long-term access.

The data security context matters separately. The 2023 23andMe credential-stuffing attack exposed profile data for approximately 7 million users. PIPEDA Findings #2025-001 (January 2025), a joint investigation by Canada's Privacy Commissioner and the UK's Information Commissioner's Office, concluded that 23andMe failed to adequately safeguard user data in that breach. The combination of the bankruptcy sale and the breach investigation underscores a practical recommendation: if you tested with 23andMe and have not downloaded your raw data file, do so now. The file is yours by right, and having a local copy preserves your access to your own genetic data regardless of what happens to the platform under new ownership.

15 min

The time it takes to download your 23andMe raw data file once you locate the option in account settings (Settings then "23andMe Data"). The file is roughly 20 MB compressed. Given the ongoing ownership transition, downloading your file is the single most practical step you can take to preserve access to your genetic data independent of platform changes.

The GHK-Cu peptide page includes the SOD2 variant context that is likely already in your 23andMe file, providing a concrete example of what a genetic match looks like once raw data is processed against a specific peptide's biological mechanism.

Calibrating your confidence

How reliable is a consumer-data peptide recommendation?

The honest answer depends on which peptide you are asking about. For semaglutide and GLP-1 class compounds, the key predictive variant (GLP1R rs6923761) may or may not be directly genotyped on your specific chip version -- but the related MC4R rs17782313 variant that completes the picture is reliably present. For nootropic peptides, the COMT variant is directly genotyped everywhere. For antioxidant-modulating peptides like GHK-Cu and SS-31, the SOD2 variant is covered on all major consumer arrays. These are not approximations. They are the same variant calls that pharmacogenomics researchers use in published studies, delivered through a consumer interface rather than a clinical lab.

Reliability drops at specific edges: CYP drug-interaction risk requires v5 or clinical testing for the most relevant loci, rare tissue-repair pathway variants are not on consumer chips, and imputation accuracy degrades for non-European ancestries because the reference panels used to fill in missing data are less complete for those populations. If you are in a population underrepresented in the 1000 Genomes or gnomAD reference panels, your imputed variants carry meaningfully more uncertainty than the same variants imputed for a European-ancestry individual. This is a structural limitation of consumer genomics broadly.

The appropriate framing is: your consumer genomic data gives a reliable, direct measurement for the common variants most predictive of peptide response. It gives a statistically inferred estimate for some less common variants. It gives nothing for structural variants and rare loci that were never on the chip. Knowing which category each relevant variant falls into is exactly what distinguishes a calibrated first pass from a false sense of precision.
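
One way to keep those categories straight is to encode the coverage table from earlier in the article as data and group it by provider. This is a sketch: the statuses are transcribed from that table, and the provider keys and the `coverage_report` helper are illustrative names.

```python
PROVIDERS = ("23andme_v5", "23andme_v4", "ancestrydna", "myheritage")

# Rows transcribed from the provider coverage table earlier in the article.
COVERAGE = {
    "COMT rs4680": ("direct",) * 4,
    "BDNF rs6265": ("direct",) * 4,
    "SOD2 rs4880": ("direct",) * 4,
    "NOS3 rs1799983": ("direct",) * 4,
    "MC4R rs17782313": ("direct",) * 4,
    "GLP1R rs6923761": ("uncertain", "imputed", "imputed", "imputed"),
    "CYP3A4 *22 (rs35599367)": ("direct", "absent/unreliable", "some versions", "imputed"),
    "CYP2D6 copy number variants": ("not detectable",) * 4,
}


def coverage_report(provider):
    """Group variants by coverage category for one provider column."""
    column = PROVIDERS.index(provider)  # raises ValueError for unknown providers
    report = {}
    for variant, statuses in COVERAGE.items():
        report.setdefault(statuses[column], []).append(variant)
    return report


report = coverage_report("23andme_v4")
for category, variants in report.items():
    print(f"{category}: {', '.join(variants)}")
```

For a v4 file this yields five direct calls, one imputed variant, one absent or unreliable variant, and one that no SNP array can detect, which is the calibration the paragraph above describes.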

What this means for you

Your existing DNA file is a solid starting point, not the whole picture.

The genetic variants most predictive of peptide response, specifically COMT, BDNF, SOD2, NOS3, and MC4R, are directly genotyped on every major consumer array. Your existing file almost certainly contains reliable data for these variants regardless of which provider you used. The meaningful gaps are CYP3A4 *22 on pre-2017 23andMe chips, GLP1R rs6923761 which may require imputation, and CYP2D6 structural variants that no SNP chip can detect. Download your raw data now if you have not already, check your chip version, and upload it to get a ranked peptide report built from your genetic profile at PeptidesDNA. If you do not yet have genetic data, order a saliva kit for complete coverage including the variants consumer arrays miss.

Frequently asked questions

Can I use my 23andMe data to choose peptides?

Yes, with some important caveats about which variants your specific chip version genotypes. The common variants most predictive of peptide response -- COMT rs4680, BDNF rs6265, SOD2 rs4880, NOS3 rs1799983, and MC4R rs17782313 -- are directly genotyped on 23andMe v5, v4, AncestryDNA, and MyHeritage. Your existing raw data file gives you reliable first-pass information on these variants. CYP3A4 *22 (relevant to drug interactions) is on v5 but absent or unreliable on v4. GLP1R rs6923761 (semaglutide response predictor) may require imputation even on v5.

Does 23andMe version matter for peptide genetics?

Yes. The v4 chip (2013 to 2017) has fewer pharmacogenomically relevant SNPs than v3 or the current v5 chip. CYP3A4 *22, the slow-metabolizer variant relevant to drug interactions, is absent or unreliable on v4. The four primary peptide-response axes (COMT, BDNF, SOD2, MC4R) are covered on both versions, but some metabolizer variants require v5. Check the header of your raw data file to identify your chip version.

What SNPs does 23andMe genotype that are relevant to peptides?

Reliably covered on all major versions: COMT rs4680, BDNF rs6265, SOD2 rs4880, NOS3 rs1799983, MC4R rs17782313. Covered on v5 (unreliable on v4): CYP3A4 *22 (rs35599367). GLP1R rs6923761 (semaglutide response) may require imputation even on v5. CYP2D6 structural variants (gene duplications and deletions defining ultrarapid and poor metabolizer phenotypes) are not detectable on any consumer array.

Is my 23andMe data accurate for pharmacogenomics?

For common variants (minor allele frequency above 1%), 23andMe v5 has high analytical accuracy -- a 2024 validation study (Yin et al., PMID 38529185) found 99.48% inter-run concordance for the GSA chip. The 40% false-positive rate cited in some critiques (Tandy-Connor et al., 2018, PMID 29565420) applies almost entirely to rare pathogenic cancer variants, not to the common SNPs relevant to peptide response. For COMT, SOD2, BDNF, and similar high-frequency variants, your consumer array result is reliable.

What happened to my 23andMe data after the 2025 bankruptcy?

23andMe filed for Chapter 11 bankruptcy in March 2025. In June 2025, a US bankruptcy court approved the sale of the company to TTAM Research Institute (a nonprofit co-founded by Anne Wojcicki) for $305 million. Raw data downloads remained available through the customer portal as of early 2026. Privacy regulators in Canada and the UK found (PIPEDA Findings #2025-001) that 23andMe failed to adequately safeguard user data in the 2023 breach. Downloading your raw data file now is the practical way to preserve your own access regardless of future platform changes.

Can AncestryDNA data be used for peptide genetics?

Yes, for the primary peptide-relevant axes. AncestryDNA directly genotypes COMT, BDNF, SOD2, NOS3, and MC4R. GLP1R rs6923761 is imputed rather than directly measured, adding statistical uncertainty to semaglutide-response predictions. CYP3A4 *22 coverage depends on which version of the AncestryDNA array you tested on. For most peptide selection decisions (nootropics, antioxidant-modulating compounds, healing peptides), AncestryDNA data is sufficient for a first-pass genetic profile.

What is the difference between directly genotyped and imputed variants?

Directly genotyped means your DNA was physically measured at that position -- the chip's probe hybridized to that exact location in your genome and recorded a signal. Imputed means the variant was statistically inferred from the pattern of nearby directly genotyped variants, using a reference population as the inference model. Imputed results carry a confidence score that is usually not shown in consumer reports. For common variants with good linkage disequilibrium in your ancestry group, imputation is accurate. For less common variants or non-European ancestries, imputation accuracy can degrade.

This article is for informational and educational purposes only. It is not medical advice and does not diagnose, treat, cure, or prevent any disease. Consult a qualified healthcare professional before starting any peptide protocol. Individual results vary.
