reginald_burke t1_ishxsag wrote
Reply to comment by Ferociousfeind in When it's said 99.9% of human DNA is the same in all humans, is this referring to only coding DNA or both coding and non-coding DNA combined? by PeanutSalsa
Don’t we have good definitions for this, such as the Levenshtein edit distance? For your example, Levenshtein would say 24 edits (via 24 additions).
Ferociousfeind t1_isib75n wrote
Single mutations can also involve the copying or deletion of large chunks of DNA. Levenshtein would overcount by 23 edits here, because only one mutation event was involved in inserting that single 24-base segment. Levenshtein distance is simple to calculate, but it misses some of the behavior of mutation, and so misses a bit of the picture. The more true-to-life version is more complex, more nuanced, a bit more up to interpretation, and less capable of giving a single concrete percentage.
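To make the gap between edit count and event count concrete, here is a minimal Python sketch of the standard dynamic-programming Levenshtein distance. The sequences `reference` and `chunk` are made-up placeholders, not real genomic data; the point is only that one inserted 24-base segment is scored as 24 separate edits.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical example: one mutation event inserts a 24-base chunk.
reference = "ACGTACGTAC"
chunk = "TTAGGC" * 4  # 24 bases, invented for illustration
mutated = reference[:5] + chunk + reference[5:]

edits = levenshtein(reference, mutated)
events = 1  # biologically, a single insertion event
print(edits, events)
```

The distance cannot be lower than the 24-character length difference, and 24 insertions suffice, so the metric reports 24 even though only one event occurred.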
light24bulbs t1_isi9wop wrote
Truly that's just a count of the number of differing base pairs, which makes complete sense. This isn't that complicated. I'm sure you could argue it isn't the most RELEVANT figure that a geneticist would be concerned with, but, I think it's fair to say that's what they would take it to mean. I'd love to know if I'm wrong about that.
It's binary data, run a diff and give me the count. Since we are talking about the number 24, if 24 base pairs out of the total are different, it's just 24 / total = variance ratio.
Likewise, the average is simply: take any two people and count the number of base pairs that differ between them or are present in one and not the other. Do that many times between different people, and that's the average.
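A rough Python sketch of that averaging procedure, under a big simplifying assumption: the sequences are already aligned to equal length, with `-` marking a base present in one person and missing in the other. The `samples` list is invented toy data, and the function names are hypothetical.

```python
from itertools import combinations


def differing_positions(a: str, b: str) -> int:
    """Count aligned positions where the two sequences differ (a gap '-' vs a base counts)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_difference(seqs: list[str]) -> float:
    """Average, over every pair of sequences, of the fraction of positions that differ."""
    fractions = [
        differing_positions(a, b) / len(a)
        for a, b in combinations(seqs, 2)
    ]
    return sum(fractions) / len(fractions)


# Toy, made-up aligned sequences standing in for different people.
samples = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTAC-TAC"]
print(mean_pairwise_difference(samples))
```

Producing the alignment in the first place is the hard part, and that is where the question of counting a large insertion as one event or as many bases comes back in.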