Submitted by terrykrohe t3_zyq3x7 in dataisbeautiful
Comments
Frogmarsh t1_j27e131 wrote
Why would one think missing persons are random?
terrykrohe OP t1_j27gjbq wrote
... the missing persons data for the fifty states shows random character: the t-test p-value of 0.96 indicates that the difference between the means of the Rep and Dem states can be attributed to random fluctuations
... when this data is compared with data for GDP, suicide, life expectancy, etc (see post 14Apr), the contrast is impressive; and the wonder of it all is "why is there a Rep and Dem difference in the data?" (note: Rep states are always on the negative side of the comparison: less GDP, more obese, more infant mortality, etc)
Frogmarsh t1_j27hgt0 wrote
Your t-tests show there is only a Rep-Dem difference with respect to rural-urban, but not missing persons. Is that the point you’re trying to highlight?
terrykrohe OP t1_j27j6ge wrote
... the t-test (0.96) quantifies only the missing persons Rep and Dem data
... there is no t-test reported for the 'rural-urban' metric (it would be very small because the means are very different)
... the point is that missing persons data is very different in character from the data of other metrics (e.g. 'rural-urban' is shown here, but GDP and others, posted previously, would be similar to 'rural-urban')
Frogmarsh t1_j27jrzf wrote
What is the upper right plot?
terrykrohe OP t1_j27z5ou wrote
... the upper right plot shows the 'rural-urban' values of the fifty states, ranked from more rural to more urban. This uses a definition described in the "sources" comment below.
... the bottom plot relates the top two plots: for each state its ('rural-urban', missing persons) coordinate is plotted. Is there a relationship? Do more missing persons come from rural or urban states? The plot indicates that there is little (essentially no) relationship: tell me a state's 'rural-urban' value and I cannot tell you anything about that state's missing persons.
Frogmarsh t1_j29my6n wrote
So, those upper plots do not accompany a t-test?
terrykrohe OP t1_j2a47t4 wrote
from "sources" comment below:
Missing persons and 'rural-urban' metrics: note that missing persons t-test indicates that data fluctuations are probably "random" in character. (t-test = 0.96)
The large difference of 'rural-urban' means (> 1 SD) for Rep and Dem states indicates that Rep and Dem states are different Sample populations. (un-reported t-test = 0.000126)
more about the t-test:
https://en.wikipedia.org/wiki/Student%27s_t-test
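For readers who want to try the comparison themselves, here is a minimal sketch of a two-sample t-test in Python. It is not the OP's Mathematica code. It assumes the unequal-variance (Welch) form, which the thread does not specify, and the `rep` and `dem` lists are made-up numbers, not NamUs data. The 0.96 and 0.000126 figures quoted in the thread read like p-values; this sketch stops at the t statistic and degrees of freedom, since turning those into a p-value needs the t-distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Assumption: Welch's unequal-variance form; the thread does not say
    which variant of the t-test was used.
    """
    # Squared standard error of each sample mean (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical missing-persons rates, for illustration only
rep = [7.1, 5.4, 6.8, 6.0, 5.9]
dem = [6.3, 6.9, 5.5, 6.1, 6.7]

t, df = welch_t(rep, dem)
print(f"t = {t:.3f}, df = {df:.1f}")
```

A t statistic near zero relative to its degrees of freedom corresponds to a large p-value, i.e. a difference in means that is easily explained by chance.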
Frogmarsh t1_j2acvs7 wrote
I’m guessing English isn’t your first language.
TangerineDream82 t1_j27rbg4 wrote
What, in layman's English is this data attempting to conclude?
terrykrohe OP t1_j27zsfh wrote
... is there a relationship between a state's rural/urban character and its missing persons, yes or no? Answer: No.
Note that this is a different answer than for a state's GDP: Dem states' GDP increases with increasing urban character; Rep states' GDP decreases with increasing urban character (posted 30Dec2021). This Rep/Dem differentiation repeats with all other metrics (suicide rate, obesity, infant mortality, etc) except for missing persons.
... I should have added a few other plots for comparison
terrykrohe OP t1_j279xgo wrote
sources
– missing persons https://namus.nij.ojp.gov
The National Missing and Unidentified Persons System (NamUs), US Census Bureau 2020 Population Data
– population density https://www.states101.com/populations (2014 population estimates)
– agriculture income https://data.ers.usda.gov/reports.aspx?ID=17839#P9dd070795569412d9525def18d45bde2_4_185iT0R0x0
method for "rural-urban" metric
– population density and agriculture income data values were converted to "standard scores", aka "z-scores": z-score = (data value − mean)/SD (see Wikipedia, "Standard score")
– the z-scores were added and divided by 2; result = the 'rural-urban' metric z-score
– note1: 'urban' means "increasing population density"
'rural' means "increasing agriculture income as % of state GDP"
for the 'rural' metric to denote a "rural to urban" value,
the z-scores for agriculture income were 'reversed' by multiplying by "−1"
before adding to the population density z-scores
– note2: "NCE" is "normal curve equivalent" (see Wikipedia, "Normal curve equivalent")
tool: Mathematica
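The "rural-urban" method described above can be sketched in a few lines. This is a Python illustration, not the original Mathematica code. The function names and the three-state input values are hypothetical, and the use of the population SD (rather than the sample SD) is an assumption, since the method does not say which was used. The `nce` helper uses the standard normal curve equivalent formula, NCE = 50 + 21.06 × z.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores: (data value - mean) / SD."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population SD; the source does not specify
    return [(v - m) / sd for v in values]


def rural_urban(density, ag_income_pct):
    """Average the density z-scores with the sign-reversed agriculture z-scores."""
    z_density = z_scores(density)
    # Reverse the agriculture z-scores so larger values mean "more urban"
    z_ag_reversed = [-z for z in z_scores(ag_income_pct)]
    return [(d + a) / 2 for d, a in zip(z_density, z_ag_reversed)]


def nce(z):
    """Normal curve equivalent of a z-score."""
    return 50 + 21.06 * z


# Made-up inputs for three hypothetical states, most rural first
density = [10.0, 100.0, 400.0]   # people per square mile
ag_income_pct = [6.0, 2.0, 0.5]  # agriculture income as % of state GDP

scores = rural_urban(density, ag_income_pct)
for s in scores:
    print(f"z = {s:+.2f}, NCE = {nce(s):.1f}")
```

With this sign convention, negative scores are the more rural states and positive scores the more urban ones.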
***************
top two plots
Missing persons and 'rural-urban' metrics: note that missing persons t-test indicates that data fluctuations are probably "random" in character.
The large difference of 'rural-urban' means (> 1 SD) for Rep and Dem states indicates that Rep and Dem states are different Sample populations.
the bottom plot
– Missing persons VS 'rural-urban' predictor metric: the r-value of -0.11 indicates that the data is essentially "noise" about the best-fit line.
– Note that purple is used for best-fit line, mean, and SD because the Rep and Dem states data are NOT different Sample populations.
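As a rough illustration of what the r-value measures, here is a small Python sketch of the Pearson correlation coefficient. The function name and the five data points are hypothetical, not the plotted state values. For reference, an r of -0.11 gives r² ≈ 0.012, so a straight-line fit would account for only about 1% of the variation in missing persons.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    # Sum of cross-products of deviations from the means
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Square roots of the summed squared deviations
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Made-up ('rural-urban', missing persons) pairs, for illustration only
rural_urban_z = [-1.2, -0.5, 0.0, 0.4, 1.3]
missing_rate = [6.1, 7.0, 5.2, 6.6, 5.9]

r = pearson_r(rural_urban_z, missing_rate)
print(f"r = {r:.2f}, r^2 = {r * r:.3f}")
```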
terrykrohe OP t1_j27a6ot wrote
other comments for missing persons VS 'rural-urban'
i) The missing persons metric is the only metric which can be described as "random":
thus, it provides contrast for non-random metrics
– the non-random character of other metrics is emphasized when visualized against the missing persons visual
ii) the Alaska outlier point is curious: probably due to boating and winter incidents for which no bodies are found.