construct's People

Contributors

andrjohns, cooplab, gbradburd, grillino, karolisr, petrelharp


construct's Issues

Resuming an unfinished CV run?

Hi there,

I've been attempting to run a rather long CV (~10k SNPs, K = 1:7, n.reps = 10), and during the last run the power cut out right as the final replicate was completing its final non-spatial analysis. Is there any way to pick up a CV run from where it left off, using the files that are saved during the process?

Thanks,
Sean

Edit: Alternatively, nine replicates had finished (spatial and non-spatial) before the CV procedure was interrupted. Is there a way to just incorporate those nine replicates in a CV analysis, assuming that the 10th cannot be picked up where it left off?

conStruct analysis

I cannot understand this error while running conStruct: "Error in if (any(args[["geoDist"]] < 0)) { :
missing value where TRUE/FALSE needed". I have checked that everything is fine. Please help.
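
That if (any(...)) construction fails with exactly this message when the test evaluates to NA, so missing values in the distance matrix are the usual suspect. A quick check (a sketch, assuming your matrix is named geoDist):

# NA/NaN entries in geoDist make any(geoDist < 0) return NA,
# which if() cannot handle -- hence the error above.
any(is.na(geoDist))                     # TRUE means missing distances
which(is.na(geoDist), arr.ind = TRUE)   # locate the offending pairs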

Installing covariance_fix issue

Hi,

Having some weird issues installing the covariance_fix branch with install_github("gbradburd/conStruct", ref = "covariance_fix"). When this runs, I'm linked out of R to a CNET article about installing command line tools.

I've updated my Xcode command line tools etc through the Mac AppStore. Any ideas?
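
One quick diagnostic worth trying (a sketch, not from the thread): pkgbuild is what devtools uses to decide whether a compiler toolchain is available, so asking it directly can show why the install bails out:

# Check whether R can actually find working build tools (the Xcode
# command line tools on macOS); debug = TRUE prints what was looked for.
install.packages("pkgbuild")
pkgbuild::has_build_tools(debug = TRUE)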

[screenshot: Screen Shot 2021-03-01 at 8.01.43 pm]

Cheers,
James

In cross-validation, two best-performing replicates of the same model show very different ancestry

[screenshot: Screenshot 2023-08-03 at 2.43.17 PM]

Dear Prof. Bradburd,
The left column shows the two best-performing replicates from the non-spatial model at K = 2, and the right column from the spatial model at K = 6; the MCMC is 1e5 iterations with 10 replicates for each model (19 populations, 20K SNPs).

[image: 00001b]

After keeping the population orders the same, the two replicates don't show similar ancestry assignments. So for data interpretation, which one should I use?

Best,
Dahn-young

Piemap plotting query

Hi Gideon,

Thanks again for a fantastic method. This is a question rather than a bug report, in the vain hope that you have either solved this issue yourself or are aware of someone else having solved it.

We're doing up some final figures for an upcoming paper that uses conStruct, and I'm having issues getting a publication-quality piemap-like plot. Specifically, we'd like to plot pie charts of the admixture proportions on a country-border-style map of Europe. I'm only really conversant in the extended ggplot universe flavour of graphics, so I'm using ggmap for the base layer and ggforce to generate a "piemap".

The issue is that these two plot components have fundamentally different coordinate spaces: a pie chart should be circular, but the "circle" of radius 1° around a point has a range of sizes, shapes and circularities across any geodetic coordinate system. And conversely, plotting a map on the Cartesian plane leads to weird artefacts, e.g. a very fat northern Europe. We've also tried plotting them separately and combining in Illustrator or inkscape, but with no better results.

Do you have any thoughts, tips, or other gems of wisdom you can share?

Best,
Kevin

Examples:

  1. WGS84 coordinate space for the map, but with squished pie charts in the correct locations

[image: k3-piemap]

  2. The pie charts themselves with Cartesian coordinates, which are now circular, but in the wrong locations. If the map were plotted underneath this it would be stretched very wide.

[image: 02_Hpa_cs_k3-justpies]
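
One workaround worth trying (a sketch, not from the thread): project both the basemap and the pie centres into a metric CRS such as ETRS89-LAEA, so map and circles share a single Cartesian space. The object and column names below (admix_df with lon, lat, and layer columns L1-L3) are assumptions:

library(sf)
library(ggplot2)
library(scatterpie)      # geom_scatterpie() for the pies
library(rnaturalearth)   # country borders (needs rnaturalearthdata)

europe <- ne_countries(scale = "medium", continent = "Europe",
                       returnclass = "sf")
europe_laea <- st_transform(europe, crs = 3035)  # metric, Europe-wide

# Project the pie centres into the same CRS as the map.
pts <- st_as_sf(admix_df, coords = c("lon", "lat"), crs = 4326)
pts <- st_transform(pts, crs = 3035)
admix_laea <- cbind(st_drop_geometry(pts), st_coordinates(pts))

# Both layers now live in projected metres, so the pies stay circular
# and land in the right places; r = 50000 draws 50 km pies.
ggplot() +
  geom_sf(data = europe_laea) +
  geom_scatterpie(aes(x = X, y = Y, r = 50000), data = admix_laea,
                  cols = c("L1", "L2", "L3")) +
  coord_sf(crs = 3035, datum = NA)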

about the conStruct result

Hi Dr. Bradburd,
Thank you for your detailed introduction to conStruct. I am trying to run it, and I would like to ask two questions about the results.

  1. In the cross-validation results, predictive accuracy seems to stabilize at K = 2 and reach its maximum at K = 3. However, in the layer-contribution statistics, the additional layer's contribution is very low. Can I conclude that the optimal K value is 2?
    cv.pdf
    spLayerContribution.pdf

  2. How do I get the αD value to see if there is isolation by distance? (See the sketch below.)

Thank you,
G
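
A sketch for question 2; the element names below are assumptions based on how the results object is accessed elsewhere in these threads, so inspect it with str() first:

# The spatial decay parameter (alphaD) sits somewhere inside the
# per-layer parameters of the MAP estimate; confirm the exact names:
str(conStruct.results$chain_1$MAP$layer.params, max.level = 2)
# then pull it out with something along the lines of:
conStruct.results$chain_1$MAP$layer.params$Layer_1$alphaD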

High Rhat and ESS issues for K>1 with spatial analyses

Hi there,

I'm encountering the same issues described in threads #25 & #31. Namely, when I run spatial models with k=1 (iter = 5000, chains = 3), the models work well and the diagnostic plots look good. No warnings.

When I increase k to 2, even when I increase iterations to 10,000 and chains to 5, I get these warnings:

Warning messages:
1: There were 1230 divergent transitions after warmup...
2: Examine the pairs() plot to diagnose sampling problems
3: The largest R-hat is 1.69, indicating chains have not mixed...
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable...
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable...

In thread #31 , Gideon wrote, "warnings - as long as the trace plots look ok, and you're getting consistent results across independent runs, I wouldn't worry that much about the warnings. Good mixing makes things more efficient, but your results aren't necessarily suspect if the mixing is poor. It's hard for me to eyeball the traceplots you sent, but it looks like things are broadly consistent (similar log posterior probabilities and parameter estimates), so I wouldn't worry too much about inefficient mixing in any given run."

To my eye, it looks like my trace plots are ok and I'm getting consistent results across independent runs. Can I confirm that these models are trustworthy despite the warnings?

sp_K2_iter10000_chains5_trace.plots.chain_3.pdf
sp_K2_iter10000_chains5_trace.plots.chain_4.pdf
sp_K2_iter10000_chains5_trace.plots.chain_5.pdf
sp_K2_iter10000_chains5_trace.plots.chain_1.pdf
sp_K2_iter10000_chains5_trace.plots.chain_2.pdf

(also attaching one chain's model fit and one layer cov curve plot because they're pretty similar among chains)
sp_K2_iter10000_chains5_model.fit.CIs.chain_1.pdf
sp_K2_iter10000_chains5_layer.cov.curves.chain_1.pdf

Lastly, I'm running a non-spatial analysis with k=2 and am expecting more warnings. If the model output is similar to that described above (i.e., trace plots are ok, consistent results across independent runs...) despite more of the same warnings, can I trust those results as well?

Thanks so much!
Sophie
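
For anyone who wants the numbers behind those warnings rather than eyeballing trace plots, a sketch (assuming the saved model.fit object is an rstan stanfit; the file and object names follow the prefixes above and may differ):

library(rstan)

load("sp_K2_iter10000_chains5_model.fit.Robj")  # object name may differ
class(model.fit)                  # expect "stanfit"
check_hmc_diagnostics(model.fit)  # divergences, treedepth, E-BFMI
summ <- summary(model.fit)$summary
range(summ[, "Rhat"],  na.rm = TRUE)   # per-parameter R-hat
range(summ[, "n_eff"], na.rm = TRUE)   # effective sample sizes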

Error when running conStruct: cannot open file 'test_trace.plots.chain_1.pdf'

Hello,
I have run several conStruct analyses testing different numbers of iterations, but every time a run completes I get the following error: Error in grDevices::pdf(file = paste0(prefix, "trace.plots.chain", chain.no, :
cannot open file 'test_trace.plots.chain_1.pdf'
The conStruct run also does not automatically run "make.all.the.plots", so I tried running it manually after my conStruct run was complete and I get the same error: it cannot open the file 'test_trace.plots.chain_1.pdf'.

Any help would be appreciated, and I can provide any information that is needed to clarify or assist.

Thank you!
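
A few quick checks worth running (a sketch; the usual culprits for this error are an unwritable working directory or the PDF being locked by an open viewer):

getwd()                                    # where conStruct writes its plots
file.access(getwd(), mode = 2) == 0        # TRUE if the directory is writable
pdf("write_test.pdf"); plot(1); dev.off()  # does a bare pdf() call succeed?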

Cross-validation error: check.data.partitions.arg()

Hi again,

I'm having trouble getting a cross-validation started. I've tried both the regular and parallelized options, and the error message continues to be:

Error in check.data.partitions.arg(args <- as.list(environment())) : 
there must be more loci in each partition than there are samples

However, I'm not sure what the issue is. I have almost 1000 loci and 256 samples. I've pasted my xval code, as well as the heads of my allele frequency and geoDist data, below. Thanks so much!

x.validation(train.prop = 0.9,
             n.reps = 10,
             K = 1:5,
             freqs = allele_frqs,
             data.partitions = NULL,
             geoDist = geoDist,
             coords = coords,
             prefix = "pilot",
             n.iter = 10000,
             make.figs = TRUE,
             save.files = TRUE,
             parallel = FALSE,
             n.nodes = NULL)
head(allele_frqs)
                [,1] [,2] [,3] [,4] [,5] [,6] [,7]
1.2_1A10_sorted    0    1  1.0  0.0    0    0    0
[... console output truncated: the row shown ("1.2_1A10_sorted") continues across 972 loci, with allele frequencies in {0, 0.5, 1} ...]
head(geoDist)
          [,1]      [,2]      [,3]         [,4]
[1,]  0.000000 13.727390  9.892015 2.381399e+01
[2,] 13.727390  0.000000  9.937058 1.106926e+01
[3,]  9.892015  9.937058  0.000000 1.626971e+01
[... console output truncated: the three rows shown continue across 256 columns of pairwise distances ...]

covariance matrix is somehow not positive definite

@davidcannatella has a dataset (reported over in #20; see the link to the data there) for which the covariance matrix is not positive definite. The problem arises at the step of setting the diagonal to 1/4: in this example, the diagonal values are all between 0.2512437 and 0.2512690, but setting the diagonal to 0.25 drops the smallest eigenvalue from 9.788625e-04 to -0.0002842281 (trouble!).

I'll need to remind myself of where the math here is coming from to figure out what to do about it.
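
A minimal illustration of that failure mode (obsCov stands in for the sample covariance matrix in question):

# Smallest eigenvalue with the observed diagonal (~0.25124): positive.
min(eigen(obsCov, symmetric = TRUE, only.values = TRUE)$values)
# Forcing the diagonal to exactly 1/4 shaves ~0.0012 off each diagonal
# entry, which here pushes the smallest eigenvalue below zero.
diag(obsCov) <- 0.25
min(eigen(obsCov, symmetric = TRUE, only.values = TRUE)$values)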

number of samples is not consistent across entries

Hi,

I'm getting the following error when trying to execute conStruct:

conStruct(spatial = TRUE, K = 2, freqs = taftmat, geoDist = pop_dists, coords = pop_coords, prefix = "spK2")
checking data.block

Error in validate.n.samples(data.block) :
the number of samples is not consistent
across entries in the data.block

I'm attaching my distance matrix, coordinates, and allele frequency matrix:
geoDist.txt
pop_coords.txt
taftmat.txt

Thanks for any advice!
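
A sanity-check sketch for this error (object names follow the call above): every data.block entry has to agree on the number of samples.

nrow(taftmat)      # allele frequencies: one row per sample
dim(pop_dists)     # geoDist: should be square, n.samples x n.samples
nrow(pop_coords)   # coords: one row per sample
# All three sample counts must match for validate.n.samples() to pass.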

Predictive accuracy increasing over all values of K from X-Val

Hi,

I performed the cross-validation for a dataset of 71 individuals and ~88k SNPs. It seems that the predictive accuracy steadily increases until it approaches 0 at the highest value of K used in the analysis.

I had previously checked this dataset in ADMIXTURE and DAPC for population structure and both suggest K=1, and when I check the layer contributions in conStruct, it supports K=1. So, what could be the reason that I am not seeing the cross-validation figure plateauing at K=1?

Thank you for your help!

Alex
[screenshots: Screen Shot 2021-11-01 at 2.02.55 PM; Screen Shot 2021-11-01 at 2.03.13 PM]

Error in check.data.partitions.covmats(args)

Hi,

When I try to run the cross-validation analysis, the following error pops up:
Error in check.data.partitions.covmats(args) :
you have specified an invalid data partition "data" element is not positive definite.

Here is the code:
my.xvals <- x.validation(train.prop = 0.9, n.reps = 10, K = 1:5,
                         freqs = freqs,
                         data.partitions = NULL,
                         geoDist = geoDist, coords = coords,
                         prefix = "Crau_K1_5", n.iter = 1000,
                         make.figs = FALSE, save.files = TRUE,
                         parallel = FALSE, n.nodes = 1)

Please, can you help me figure out what is going wrong here? Thank you in advance,

r

error creating the vignettes in conStruct

Hi,

I am trying to install the development version of conStruct in R 4.1 on Ubuntu 20.04 LTS.

However, it is not possible to create the vignettes included in the package.

This is the output of the installation:

E  creating vignettes (10m 16.1s)
   --- re-building ‘format-data.Rmd’ using rmarkdown
   Warning: The vignette title specified in \VignetteIndexEntry{} is different from the title in the YAML metadata. The former is "format-data", and the latter is "How to format data for a conStruct analysis". If that is intentional, you may set options(rmarkdown.html_vignette.check_title = FALSE) to suppress this check.
   --- finished re-building ‘format-data.Rmd’
   
   --- re-building ‘model-comparison.Rmd’ using rmarkdown
   Warning: The vignette title specified in \VignetteIndexEntry{} is different from the title in the YAML metadata. The former is "model-comparison", and the latter is "How to compare conStruct model runs". If that is intentional, you may set options(rmarkdown.html_vignette.check_title = FALSE) to suppress this check.
   --- finished re-building ‘model-comparison.Rmd’
   
   --- re-building ‘run-conStruct.Rmd’ using rmarkdown
   Warning: The vignette title specified in \VignetteIndexEntry{} is different from the title in the YAML metadata. The former is "run-conStruct", and the latter is "How to run a conStruct analysis". If that is intentional, you may set options(rmarkdown.html_vignette.check_title = FALSE) to suppress this check.
   --- finished re-building ‘run-conStruct.Rmd’
   
   --- re-building ‘visualize-results.Rmd’ using rmarkdown
   Quitting from lines 209-219 (visualize-results.Rmd) 
   Error: processing vignette 'visualize-results.Rmd' failed with diagnostics:
   there is no package called 'maps'
   --- failed re-building ‘visualize-results.Rmd’
   
   SUMMARY: processing the following file failed:
     ‘visualize-results.Rmd’
   
   Error: Vignette re-building failed.
   Execution halted
Error: Failed to install 'conStruct' from GitHub:
  System command 'R' failed, exit status: 1, stdout & stderr were printed

I tried to install the package without the vignettes and it works fine. The problem seems to be the differing names of the vignettes.

Is there a solution to this error?

Thanks in advance.
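
A guess based on the error text rather than the thread: the only hard failure in the log above is the missing 'maps' package (the title mismatches are only warnings), so installing it first may let the vignettes build:

install.packages("maps")  # vignette 'visualize-results.Rmd' needs it
devtools::install_github("gbradburd/conStruct", build_vignettes = TRUE)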

Is parallelization possible?

Is it possible to parallelize the models? I'm very new to parallelization and am trying to get my bearings.

For example, I'd like to parallelize this to run on multiple cores:

conStruct(spatial = FALSE,
          K = 2,
          freqs = allele_frqs,
          geoDist = NULL,
          coords = coords,
          prefix = "nsp_K2_iter5000_chains5",
          n.chains = 5,
          n.iter = 5000)

Thanks so much,
Sophie
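
A sketch of one route (an assumption, not confirmed in this thread): x.validation(), used elsewhere in these issues, exposes parallel and n.nodes arguments, and registering a parallel backend beforehand is the usual pattern for enabling them:

library(doParallel)   # also attaches foreach and parallel

cl <- makeCluster(4)  # one worker per core you want to use
registerDoParallel(cl)

my.xvals <- x.validation(train.prop = 0.9, n.reps = 8, K = 1:3,
                         freqs = allele_frqs, data.partitions = NULL,
                         geoDist = geoDist, coords = coords,
                         prefix = "xval", n.iter = 5000,
                         make.figs = FALSE, save.files = TRUE,
                         parallel = TRUE, n.nodes = 4)

stopCluster(cl)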

divergent transitions - change adapt_delta

Hi everyone,

First, thanks for such a great package with great documentation. I appreciate it.

I just had one 'hopefully' quick question. I am able to successfully run conStruct with my data, but I get the following warning message:

Warning messages:
1: There were 193 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See
http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
2: Examine the pairs() plot to diagnose sampling problems

I checked the website the warning message recommends, and it says that adapt_delta should be a parameter in any of the stan functions.

I just wanted to check with you whether there is a quick way of doing this from your functions, or should I clone the package and try to change it internally?

Thank you!
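
An untested sketch: in rstan itself the sampler option travels as control = list(adapt_delta = ...) to rstan::sampling(); if conStruct() forwards extra arguments through to the sampler (worth checking with args(conStruct) before cloning the package), something like this may work:

my.run <- conStruct(spatial = TRUE, K = 2,
                    freqs = freqs, geoDist = geoDist, coords = coords,
                    prefix = "adelta99",
                    n.chains = 1, n.iter = 5000,
                    # rstan sampler option; only reaches the sampler if
                    # conStruct() passes `...` on to rstan::sampling()
                    control = list(adapt_delta = 0.99))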

element that is not positive definite, xvalidation

Hi,
I'm having an issue that I see others have had: when trying to run a cross-validation analysis, I get the error message "Error in check.data.partitions.covmats(args) : you have specified an invalid data partition "data" element that is not positive definite". I ran this dataset through the regular conStruct analyses with no issues.
I realize one solution is to remove samples with lots of missing data, but I'm at the point of removing samples with less than 50% missing data and am wondering if there might be some other issue at play. This is a large dataset (3,564 loci).
Any help is greatly appreciated!
Pg_xvalid.const.RData.zip

thom

View admixture proportions with associated sample names

Hello,

Thank you for this great program! I'd like to investigate a few of the individuals that are more genetically similar to a different site than where they were sampled. I can view the admixture proportions, but they aren't associated with a sample name. I was able to sort the structure plot by my sample names, so they must be associated in the back somewhere, but I'm not sure how to link this information. Is there a way I can view the admixture proportions with the sample ID for different K with the spatial and non-spatial models?

I appreciate your help,
Quinn
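
A sketch that may do what's being asked (the element names and the row-order assumption come from how the results object is used elsewhere in these threads, so verify both):

# MAP admixture proportions; rows are assumed to follow the row order
# of the allele-frequency matrix passed to conStruct().
admix <- conStruct.results$chain_1$MAP$admix.proportions
rownames(admix) <- rownames(allele_frqs)  # attach your sample IDs
head(admix)  # one row per sample, one column per layer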

Chain Does Not Update from Iteration 1

Hey Gideon,

I'm excited to try out conStruct! I've got a 3RAD dataset of 3297 "unlinked" SNPs (one SNP per RAD locus) from 230 snakes. As usual, I want to determine for a range of Ks whether a spatial model or non-spatial model will best describe the data. (I suspect spatial will be better, as IBD is a strong feature in this dataset.)

My plan is to run conStruct(n.chains = 1, K = 5) (K = 5 was the most likely K using Structure) to figure out the number of iterations needed for the MCMC to converge, then run x.validation for a range of Ks, with 20 repetitions per K, using that number of iterations.

Unfortunately, when running the initial conStruct command, it seems to hang at the first iteration. Here's the output from running conStruct:

k5 <- conStruct(spatial = TRUE,
                K = 5,
                freqs = construct.data,
                geoDist = geoDist,
                coords = coords,
                prefix = "spK5-1snplocus",
                n.chains = 1,
                n.iter = 1000,
                make.figs = TRUE,
                save.files = TRUE)

checking data.block

	reading 230 samples
	reading 2883 loci

checking specified model


user has specified a spatial model with 5 layer(s)


SAMPLING FOR MODEL 'space_multiK' NOW (CHAIN 1).
Chain 1: 
Chain 1: Gradient evaluation took 0.156347 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1563.47 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1: 
Chain 1: 
Chain 1: Iteration:   1 / 1000 [  0%]  (Warmup)

1563 seconds is ~26 minutes, but even after an hour nothing has happened. The iteration is still at step 1/1000, and the percentage has not increased. This happens with different subsets of the dataset (up to 12k SNPs), different individuals, and regardless of whether I choose 1k iterations or 25k. With 25k iterations, I left it running for 8 hours, but still the iteration counter did not increase. I'm running conStruct on R version 4.1.0 on a Mac.

Any idea why this might be? I'm happy to send data/scripts along if necessary.

I hope you and the family are well, Gideon!

-Alex

Using multiallelic data

Dear Dr. Bradburd,

Your manuscript states that conStruct can be used with biallelic SNPs. I have a dataset with 1000 multiallelic (range: 2-20) microhaplotypes. Is it possible to use this type of data in your software?
Thank you,
Matt Hopken

extract log-likelihoods from unfinished cross-validation

Dear Gideon,

thanks a lot for this wonderful program. I have a request, or rather a question, not an issue to report. I am running conStruct with >250 population samples and 10,000 marker SNPs in 10-fold replication with K = 1:10 to identify the optimal number of spatial layers. As you can imagine, this takes some time, particularly since both the sp and nsp models are calculated consecutively before the log-likelihood table is produced.

Thus, I wanted to ask whether it is possible either to output the sp likelihood table before the calculation of the nsp model starts, or to manually calculate the likelihoods from the model.fit objects.

Thanks for your help!

Best, Martin

structure2conStruct error

Hi Gideon, I am having an issue using the "structure2conStruct" file conversion function. I have two species I want to run conStruct on, one with 40 specimens and another with over 200. The one with more specimens works without errors, but I get a failure when I attempt to run the following command (the only differences between this command and the one that worked are the file prefixes):
structure2conStruct(infile = "brevirostris-nohead.str",
                    onerowperind = TRUE,
                    start.loci = 3,
                    missing.datum = 0,
                    outfile = "brev-conStuct-input")
and I received this error message: "Error in sample.int(length(x), size, replace, prob) : invalid first argument construct". I've used this function before and not had any issues once I got the flags specifying where loci start, etc., correct. The only difference I can think of between the file producing errors and the other files is that it happens to have far fewer specimens than any other.

I have attached the .structure files in question and please let me know if you have any follow-up questions:
brevirostris-nohead.str.txt
distichus-nohead.str.txt

Thanks,
Tanner

Error in validate.n.samples(data.block)

Hi,
When I try to run conStruct, after I have built myConStructData, an error pops up: checking data.block
Error in validate.n.samples(data.block): the number of samples is not consistent across entries in the data.block.
Here are my files:
denovo30.txt
coords.txt
I do not know what is wrong with the files.
Thank you in advance,
r

Error Running ConStruct

Hi, I'm getting the following error as I work through the vignette on how to run conStruct. It is the same when I copy and paste the code for the spatial and non-spatial models. I am running RStudio 1.1.423 and R 3.4.3.

Thanks for your help. I appreciate what you have done.
Dawn

Error in compileCode(f, code, language = language, verbose = verbose) :
Compilation ERROR, function(s)/method(s) not created! Warning message:
running command 'make -f "C:/PROGRA1/R/R-341.3/etc/x64/Makeconf" -f "C:/PROGRA1/R/R-341.3/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="file2bc05c09657a.dll" WIN=64 TCLBIN=64 OBJECTS="file2bc05c09657a.o"' had status 127
In addition: Warning messages:
1: running command '"C:/PROGRA1/R/R-341.3/bin//R" CMD config CXX' had status 1
2: running command 'C:/PROGRA1/R/R-341.3/bin/x64/R CMD SHLIB file2bc05c09657a.cpp 2> file2bc05c09657a.cpp.err.txt' had status 1


Comparing structure model runs

Hi Gideon,

I'm running into an error while trying to compare model runs using x.validation. If I set parallel and n.nodes to FALSE and NULL (respectively), I run into this issue:
Error in if (args[["parallel"]] & args[["n.nodes"]] == 1) { :
argument is of length zero
and if I set parallel to FALSE and n.nodes = 1, I run into this issue:
Error in check.data.partitions.covmats(args) :
you have specified an invalid data partition "data" element is not positive definite

I don't want to run this in parallel or on multiple nodes (my dataset isn't large enough to warrant it), so how would I best go about getting this to run with parallel and n.nodes set to their default values?

Thanks,
James

Entire code below:
my.xvals <- x.validation(train.prop = 0.9, n.reps = 8, K = 1:5,
                         freqs = AiconStruct.data, data.partitions = NULL,
                         geoDist = geoDist, coords = Aipoints,
                         prefix = "mdl", n.iter = 1e3,
                         make.figs = TRUE, save.files = FALSE,
                         parallel = FALSE, n.nodes = NULL)

Questions about results interpretation

Hello again, thank you/apologies in advance for the number of questions. I greatly appreciate any guidance you can provide!

  1. The paper says to use unlinked loci; I have ddRAD snp data and retained only a single snp per locus. Should I consider pruning for LD? The Admixture manual suggests pruning data for LD using plink, removing snps where r^2 > 0.50
    a. Number of snps before pruning: 29556
    b. Number of snps after pruning: 22966

  2. Layer contributions
    a. When K = 3, the third layer has an extremely small contribution (~0.1%). However, at K = 4, the fourth layer contributes
    moderately. Is this worth considering or should I ‘stop’ at the first value of K where layers begin to contribute marginally?
    (see Picture 1)
    b. I’m seeing increasing subdivision among layers at increasing values of K in my spatial model, but less so in the non-spatial
    model. How might I interpret this? The paper addressed spurious clusters in non-spatial models because it’s trying to
    partition clinal differences into discrete groups, but I’m less sure how to interpret the opposite case, which I seem to have
    (see Pictures 1 and 2)
    c. I reviewed the phi values for the spatial models at K = 3 and K = 4, based on an earlier question
    (conStruct.results$chain_1$MAP$layer.params$layer_k) (#48)
    ------> Phi values for K = 3 spatial model: K = 1: 2.41 x e-4, K = 2: 7.53 x e-5, K = 3: 5.07 x e-4
    ------> Phi values for K = 4 spatial model: K = 1: 1.64 x e-5, K = 2: 4.09 x e-5, K = 3: 6.58 x e-5, K = 4: 9.36 x e-5

    d. Are the individual layers consistent across values of K for a specific model?
    ------> For example, the contribution of layer 3 (when K = 3 and K = 4) is ~0.1%, while layer 4 contributes 27% when K = 4
    ------> I used the function match.layers.x.runs prior to plotting the layer contributions, by doing so, I take it that, yes,
    layer 3 is the same across all values of K (for which it was assessed)

  3. Layer covariance curves
    a. Is this mostly useful to visualize isolation by distance among my layers? Should I be identifying other important patterns?
    b. Looking at the graph of allelic covariance against distance, it looks like this spurious 3rd layer at K = 3 is driven by a single
    datapoint (Picture 3). Spurious in that it has such a marginal contribution. Combined with the small contribution of this
    layer, it seems that K = 3 is not biologically meaningful to describe my data; to point 2a above, should I give credence to K
    = 4 then?
    ------> What matrix is used to build this graph? Can I look up what data point is driving the third layer? I see a point at
    approximately the same value in each of your figures as well. Is this the allelic covariance of the sample with itself? l
    looked up one potentially applicable matrix (conStruct.results$chain_1$MAP$par.cov), and each sample, when
    compared with itself, had a value approximately at this value; I’m not sure if this is what is being graphed, however
    c. If my layer covariance curves are mostly overlapping, how should I interpret this (Picture 3; layers 1 and 2)?
    d. Additionally, although the CV predictive accuracy of the spatial model outperformed the non-spatial model (indicative of
    IBD; Picture 4), this was marginal at K = 2. I don’t see a decay in correlation with distance in the layer covariance
    curves (Picture 3). Is this indicative of IBD in my system, then?
    e. Can I edit the output of the graphs? In the paper, for example, the data points on the layer covariance curves are color-
    coded

  4. Compare two runs
    a. A naive R question, I think: how can I load two results datasets? I ran into trouble trying to ‘load’ results and assign the
    object a name, so I’ve only been able to load a single object at a time (see the sketch after this list)

  5. How is variability among replicates for each value of K summarized?
    a. Ex., pie charts for a model at the same value of K look different among replicates; the code to graph pie charts (from the
    vignette) uses the ‘results’ object and MAP to plot the pie chart. If I understand correctly, this graphs the results with the
    maximum posterior probability across the replicates?
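
Re: question 4a, here is the kind of thing I'm after (a minimal sketch, assuming each run was saved with conStruct's default file naming and each .Robj file contains an object named conStruct.results; the prefixes here are hypothetical). Does this approach seem right?

# load() restores objects under the names they were saved with, so two
# results files would clobber each other; loading each into its own
# environment lets you pull the objects out under new names.
env.sp <- new.env()
env.nsp <- new.env()
load("spK3_conStruct.results.Robj", envir = env.sp)
load("nspK3_conStruct.results.Robj", envir = env.nsp)
results.sp <- env.sp$conStruct.results
results.nsp <- env.nsp$conStruct.results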

Please let me know if you need any clarification on what I've asked above, and thanks again. I've also attached these questions in a Word document in case the formatting doesn't transfer well.

Questions_conStruct_Results.docx

Picture1.pdf
Picture2.pdf
Picture3.pdf
Picture4.pdf

Messy data with lots of errors..

Hello Gideon and Peter,

Apologies in advance for the long message. I'm hoping to give all the context to help understand the issues I'm having.

Background: I'm working with ddRAD-Seq data where I have approx. 490 SNPs for ~200 plants from 18 varieties sampled across 70 locations in Southwest USA. I know this is very few SNPs for a very large number of individuals. I exported SNPs from Stacks in a STRUCTURE-format file and converted them to conStruct file format successfully. I've used these SNPs in STRUCTURE and adegenet. The STRUCTURE results look OK, with most individuals clustering according to the location they were sampled at. However, PCA in adegenet didn't show the best clustering, with a lot of sites/varieties overlapping with one another and the first PC explaining only 10% of the variation.

My first issues arise when I try to run conStruct with both spatial and non-spatial analysis.

"Error in pos.def.check(obsCov)".

Fair enough, I have a lot of individuals with missing data and not that many SNPs. I filtered my dataset down to 70 individuals with at most 25% missing data each, and I still receive this error. I can't identify any single individual or pair of individuals that is causing this.

As for the next issue, I've only been able to run conStruct with a subset of 40 individuals at a time. After running conStruct I receive the following warning messages:

"There were 306 divergent transitions after warmup." I don't think I've ever had a run with less than 280 divergences
R-hat too large (2.23)
bulk and tail ESS being too low.

When I run with K = 1 the only warning I get is for tail ESS being too small

I've looked into your other threads for these topics and have tried the following with no luck
-increasing n.iter to 100000
-increasing adapt_delta to 0.99
I have attached the trace plots; I'm not sure I'm interpreting them correctly, but the chains don't seem comparable to one another.

Last issue! When I am able to run conStruct, the dimensions of my allele frequency matrix are 41 x 489; however, the data block reports 185 loci. My assumption is that it's dropping ~300 of my loci as invariant, but when I run unique(allele_freqs, MARGIN=2) I retain 440 loci. Is this likely because of missing data? I have attached my subset data as well.
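
A check along these lines might explain the discrepancy (a sketch; allele_freqs is my loaded matrix): unique() with MARGIN = 2 treats columns with different missing-data patterns as distinct, so it can count loci as variable that conStruct drops as invariant.

# Count loci that are invariant once NAs are ignored.
is.invariant <- apply(allele_freqs, 2, function(x) {
    x <- x[!is.na(x)]
    length(unique(x)) < 2
})
sum(!is.invariant)   # loci that would survive an invariance filter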

Please let me know if there is any other information that I can provide. Thank you for your diligence on this forum!

_trace.plots.chain_1.pdf
_trace.plots.chain_2.pdf
_trace.plots.chain_3.pdf

subset_allele_freq.Robj.zip

Extracting all iterations from model.fit object

Hello,

Following the instructions from this closed issue (#16), I was trying to access all iterations of the MCMC chain in order to look at the entire likelihood curve, rather than just the 250 samples. However, from what I can tell, the model.fit object only contains those 250 samples, and not the full run information. Am I missing something? Is there a different way to access the entire run? In brief, here's my code:

load('out_model.fit.Robj')
df_of_draws <- as.data.frame(model.fit)
nrow(df_of_draws)

[1] 250

I was playing around with the rstan extract function and was unable to make any progress, so any suggestions would be very much appreciated.
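
For what it's worth, this is the sort of call I was attempting (a sketch; extract() with permuted = FALSE returns an iterations x chains x parameters array, and inc_warmup = TRUE includes warmup). Because conStruct thins internally before saving, the stanfit only ever stores the thinned draws, which is presumably why I see 250 rows:

library(rstan)
load("out_model.fit.Robj")
# All stored draws, warmup included: iterations x chains x parameters
draws <- rstan::extract(model.fit, permuted = FALSE, inc_warmup = TRUE)
dim(draws)
lp <- draws[, 1, "lp__"]   # stored trace of the log posterior, chain 1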

Thanks in advance for your help!

Error freqs using RADseq data

Dear Gideon,

I'm trying to run conStruct with SNPs obtained using RADseq. I have 180 samples with 1249 loci, and I got the following error:

"you have specified an invalid data partition "data" element that is not positive definite"

I tried the solutions that you recommended in previous issues; however, I still get the same error. Do you have any further recommendations? I attach my freqs table in case you have time to give it a look.

Thanks a lot,

Fabian

freqs.txt

AF sometimes 1

I am running conStruct with alternate-allele frequencies computed at the population level (keeping the alternate allele consistent across populations), and at some loci the frequency is exactly 1.
This throws an error. Is there any way to get around that?
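
One workaround sketch (freqs is my populations x loci frequency matrix): drop loci where the alternate allele is fixed or absent in every population before running conStruct, since those loci carry no information anyway. Would that be a reasonable approach?

# Drop loci fixed (1) or absent (0) across all populations with data;
# loci fixed in only some populations are kept.
fixed <- apply(freqs, 2, function(x) {
    x <- x[!is.na(x)]
    length(x) == 0 || all(x == 1) || all(x == 0)
})
freqs <- freqs[, !fixed, drop = FALSE]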

Error when running conStruct analysis

Hello Gideon,

I am running into an issue when I'm finally ready to begin the conStruct analysis. I successfully ran structure2conStruct to generate the allele frequency matrix and I have both matrices with matching row numbers for the geographic sampling coordinates as well as the geographic distances (calculated with rdist.earth). However, when I run the complete function, I receive the following error:

"Error in process.freq.data(freqs) :

After dropping invariant loci, one or more pairs of samples have no genotyped loci in common, so relatedness between them cannot be assessed."

Have you come across this before? Any tips?
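
In case it helps to diagnose, a sketch of a check for the offending pairs (freqs here is the samples x loci matrix returned by structure2conStruct), applying an invariance filter like the one the error describes and then counting shared genotyped loci per pair:

# Drop loci invariant among the non-missing calls, then count, for each
# pair of samples, the loci genotyped in both; a zero triggers the error.
keep <- apply(freqs, 2, function(x) length(unique(x[!is.na(x)])) > 1)
geno <- !is.na(freqs[, keep, drop = FALSE])
shared <- geno %*% t(geno)
which(shared == 0, arr.ind = TRUE)   # sample pairs with no loci in common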

Thank you so much!

Clara

(Non-Issue) Question about parallelization in cross-validation analysis...

Hi there,

I've noticed that with built-in parallelization functions for the cross-validation analysis, each replicate is treated as a "thread" and is assigned a single worker. If I were to run a CVA with 10 replicates but I have access to 30 cores, would it be possible to parallelize the task so that each replicate is conducted across three cores in parallel?

Thanks!

  • Sean

trace plot limited to 250 iterations

The number of iterations shown on the trace plots that are automatically generated during a run stops at 250, even when the number of iterations is much higher. There doesn't seem to be a way to manually change this.

The code I used:
conStruct(spatial = TRUE, K = 2, freqs = freqs, geoDist = geoDist, coords = coords, prefix = "spK2_100k", n.chains = 1, n.iter = 100000)

The resulting plot:
spK2_100k_trace.plots.chain_1.pdf

Issue compiling stan models

Hi folks,

There's something funny going on with compiling stan model blocks on our HPC system. Namely, when compiling while running parallel R instances (using either multiple Rscripts with GNU parallel, or with foreach & doFuture), conStruct/rstan non-deterministically fails to compile the stan model, or makes a stan model that "has no samples", or just quits with exit status 1 and no output (no files except the _data.block.Robj are created). Running each model serially does work. There doesn't seem to be any rhyme or reason to which individual inputs cause this to happen.

As an aside, have you considered something along the lines of mvuorre/bmlm#3 ?

If you think that would help my current issue, I'm happy to have a go doing that if you think it would be relatively simple. The stan model blocks don't change with inputs do they? it's always just the same 4 (one/multiK & spatial/nonspatial)?
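
For concreteness, the pattern I have in mind is the compile-once idiom (a sketch, not the current conStruct internals; space_multiK_code and data.block are hypothetical stand-ins):

library(rstan)
library(parallel)
# Compile each model block exactly once, up front...
compiled <- stan_model(model_code = space_multiK_code)
# ...then every worker reuses the compiled model via sampling(), so no
# compilation happens inside the parallel section.
fits <- mclapply(1:4, function(i) {
    sampling(compiled, data = data.block, chains = 1, iter = 5e3)
}, mc.cores = 4)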

Cheers,
K

Most differentiated samples grouped in the same layer

Dear Gideon,

first of all, thanks for the wonderful software and documentation. I'm facing a situation where the spatial model groups the most genetically differentiated samples/populations into the same layer. You described this possibility in the original publication as well, mentioning that it can result from shared demographic history but also from certain values of the layer-specific parameters.

In my case, shared demographic history for these grouped populations would be very surprising. What would be the best way to further explore and understand what is causing this result and whether it really has biological significance?

I'll attach details of my analyses below:

171 individuals, ~52 000 SNPs
Spatial and non-spatial analysis with train.prop = 0.9, n.reps = 4, K = 1:4, n.iter = 5000. Trace plots look OK.

sp_nsp_cross-validation

Spatial analysis supporting K=2, which groups the most northern and southern samples to the same layer (even with K=3 or K=4 these samples don't split out to different layers):
layer_contr_sp sp_k2

Non-spatial analysis with K=3, reflecting the known, quite continuous genetic differentiation (likely mostly caused by IBD) in the samples:
layer_contr_nsp nsp_k3

R installation failed

When following your readme instructions to install conStruct using install_github() [on R 3.2.4], I got the following error: Error: Does not appear to be an R package (no DESCRIPTION).

(pedantry, feel free to ignore) Large files in repo

Hi Gideon,

I'm not sure if this was intentional, but it looks like there are still large files in the git history of the repo. Even though the files no longer exist when checking out master, the whole repo (hidden history and all) is around 700 MB. If you want, you can run the following, which will remove the ./data, ./sims and ./writeup directories from all history, leaving only about 10 MB of stuff in the code directory (which is now the root directory of the repo). Git's a complete mindbender at times!

# Create tracking branches of all branches
for remote in `git branch -r | grep -v /HEAD`; do git checkout --track $remote ; done

git checkout master

# Remove the  $removeme directory from all commits, then remove the refs to the old commits
# (repeat these two commands for as many directories that you want to remove)
for removeme in data sims writeup
do
    git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch ./$removeme" --prune-empty --tag-name-filter cat -- --all                                                                                                  
    git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
done

# Ensure all old refs are fully removed
rm -Rf .git/logs .git/refs/original

# Perform a garbage collection to remove commits with no refs
git gc --prune=all --aggressive

Then, after checking all's well, you can force-push to github.

I've run this on a local copy of the repo, and the history seems to have been preserved perfectly

Hope that's useful,
Kevin

Installation error

Hi,
I am trying to install conStruct using the instructions provided. I am using the following command:

library(devtools)
install_github("gbradburd/conStruct",build_vignettes=TRUE)

However, after it looks like it finished compiling, I get the following error:

** testing if installed package can be loaded
Error: package or namespace load failed for ‘conStruct’ in .doLoadActions(where, attach):
 error in load action .__A__.1 for package conStruct: is(module, "character"): object 'm' not found
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Library/Frameworks/R.framework/Versions/3.5/Resources/library/conStruct’
Installation failed: Command failed (1)

I am using R version 3.5.1

Do you have any idea what could be causing this issue, and how to go about correctly installing conStruct?
Thanks!

Divergent transition warning

Hello,
I'm hoping you might have some advice for diagnosing this problem. I have a data set with 9 populations and ~1600 SNPs. DAPC analysis finds 5-7 clusters. When I run conStruct with K > 1 I always get the warning about divergent transitions. I've increased the adapt_delta parameter without success. Using the parcoord() function suggested in the Stan manual, I don't identify any one parameter that is clearly associated with divergences. That said, in most analyses I've run, the ancestry proportions tend to bounce between 0 and 1, with very few samples at intermediate proportions. I can send more details (output, Stan diagnostic plots, etc.) if needed.
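
For reference, my divergence check was along these lines (a sketch using bayesplot's equivalent of the parcoord() approach; model.fit is the saved stanfit from the run):

library(bayesplot)
# Overlay divergent iterations on parallel coordinates of the posterior
# draws to look for a parameter implicated in the divergences.
np <- nuts_params(model.fit)
mcmc_parcoord(as.array(model.fit), np = np)
sum(subset(np, Parameter == "divergent__")$Value)   # total divergences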

Thanks very much for any help,
Brian D.

High Rhat across replicate runs

Hi @oscaredd - I think this is an important point so I'm splitting it out into its own issue so other users can find it more easily. I'm quoting your original post below, and then I'll respond to it in its own thread.

Hi there!

I would like to get your advice for a problem I have with the mixing of chains in my analysis. My dataset contains ~600 SNPs (those have shown nice results with traditional clustering methods) with 64 "samples", with allele frequencies calculated from one or several individuals.
For both the discrete and spatial methods in conStruct, I keep getting both R-hat (it has reached ~4.5 in some cases) and "divergent transitions after warmup" warnings. So far I've tried increasing iterations from 10K to 100K, increasing adapt_delta up to 0.99, and running 2 or 4 chains. In all cases I've run 2 independent analyses for each K (1-8). At this point, I can only think of two possibilities: 1) I need more SNPs (I could go up to ~43K SNPs, but then some of those will probably be linked), and 2) I need to reduce complexity by reducing the number of samples.

Thanks in advance for your help!

Originally posted by @oscaredd in #17 (comment)

r-hat and Tail and Bulk Effective Sample Size Warnings

Hi,
I'm able to run my conStruct analysis, but I'm struggling to eliminate the following warnings:

3: The largest R-hat is 1.08, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
4: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
5: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess

The solution would seem to be increasing the number of iterations, but I've found that these warnings do not go away consistently as I increase the iterations. Some drop out as I try more iterations, but then come back when I try an even larger number. I've gone as high as 20,000 iterations and still got the R-hat and tail ESS warnings. So far, I haven't been able to run an analysis without some of these issues. Any suggestions on how to address this? Is this even something to be especially concerned about?
Thanks for the help!

Error with mc.cores > 1

Hi Gideon,

I came across a weird issue today running the tutorial dataset on a fresh install of conStruct & rstan after setting options(mc.cores = 4).

> library(conStruct)
> options(mc.cores = parallel::detectCores())
> data("conStruct.data")
> conStruct(spatial = TRUE, 
+           K = 3, 
+           freqs = conStruct.data$allele.frequencies, 
+           geoDist = conStruct.data$geoDist, 
+           coords = conStruct.data$coords, 
+           prefix = "spK3", 
+           n.chains = 10, 
+           n.iter = 10000, 
+           make.figs = TRUE, 
+           save.files = TRUE)

checking data.block

	reading 36 samples
	reading 10000 loci

checking specified model


user has specified a spatial model with 3 layer(s)

starting worker pid=15009 on localhost:11794 at 13:47:35.298
starting worker pid=15024 on localhost:11794 at 13:47:35.565
starting worker pid=15039 on localhost:11794 at 13:47:35.841
starting worker pid=15054 on localhost:11794 at 13:47:36.108
Error in checkForRemoteErrors(val) : 
  4 nodes produced errors; first error: the compiled object from C++ code for this model is invalid, possible reasons:
  - compiled with save_dso=FALSE;
  - compiled on a different platform;
  - does not exist (created from reading csv files).

Re-running with options(mc.cores=1) (and n.chains=1 to conStruct) succeeds and produces reasonable output (I think). Subsequently running with options(mc.cores=4) after this initial serial execution doesn't work either.

I couldn't find any documentation suggesting this is expected, but forgive me if it is.

Is there any way the models can be compiled ahead of time, so that independent chains may be run in parallel?

Cheers,
Kevin

Error: Incorrect number of subscripts

Hi Gideon,

I seem to be running into a problem running spatial analyses, with both my dataset and also the example dataset provided in the vignette.

The output is:
SAMPLING FOR MODEL 'space_multiK' NOW (CHAIN 1).
Chain 1:
Chain 1: Gradient evaluation took 0.000278 seconds
Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.78 seconds.
Chain 1: Adjust your expectations accordingly!
Chain 1:
Chain 1:
[1] "Error in sampler$call_sampler(args_list[[i]]) : " " c++ exception (unknown reason)"
error occurred during calling the sampler; sampling not done
Stan model 'space_multiK' does not contain samples.
Stan model 'space_multiK' does not contain samples.
Stan model 'space_multiK' does not contain samples.
Error in par.cov[, i, j] <- rstan::extract(model.fit, pars = my.par, inc_warmup = TRUE, :
incorrect number of subscripts

Cheers,
James

High Rhat and ESS issues for K>1 with spatial analysis

Hello,

I am currently running conStruct with a fairly small dataset of 74 individuals and 525 SNPs. I cannot use my full dataset of 1113 neutral SNPs because there seems to be too much missing data; I could only get the analysis to run after filtering it down this much.

I have run conStruct at K = 1, n.chains = 3, and n.iter = 1e4 without any tail or bulk ESS issues or R-hat problems. Whenever I run K > 1 I get warnings. I tried finding a solution in other issues with the same problem (for example here and here), but so far nothing seems to have worked.

My latest run took all weekend with K = 1:2, and K = 2 still gives me these warnings:
The largest R-hat....
Bulk Effective Sampling Sizes (ESS)...
Tail Effective Sampling Sizes (ESS)...

Here is the code I used and associated trace save files.

k01sp <- conStruct(spatial = TRUE, 
                 K = 1, 
                 freqs = constructdata,
                 geoDist = geo_dist_con, 
                 coords = xy_construct,
                 prefix = "k01sp",
                 n.chains= 6,
                 n.iter = 3e4,
                 save.files = T,
                 control = setNames(list(0.95),"adapt_delta")) # use if R-hat is misbehaving, values of 0.9-0.99
k02sp <- conStruct(spatial = TRUE, 
                 K = 2, 
                 freqs = constructdata,
                 geoDist = geo_dist_con, 
                 coords = xy_construct,
                 prefix = "k2sp",
                 n.chains= 6,
                 n.iter = 3e4,
                 save.files = T,
                control = setNames(list(0.95),"adapt_delta")) # use if R-hat is misbehaving, values of 0.9-0.99

k01sp_trace.plots.chain_1.pdf
k01sp_trace.plots.chain_2.pdf
k01sp_trace.plots.chain_3.pdf
k01sp_trace.plots.chain_4.pdf
k01sp_trace.plots.chain_5.pdf
k01sp_trace.plots.chain_6.pdf

k2sp_trace.plots.chain_1.pdf
k2sp_trace.plots.chain_2.pdf
k2sp_trace.plots.chain_3.pdf
k2sp_trace.plots.chain_4.pdf
k2sp_trace.plots.chain_5.pdf
k2sp_trace.plots.chain_6.pdf

The trace plots for K=2 don't seem to agree for any iteration, but they look pretty good for K=1.

I would like to test higher values of K (up to 8), but at ... n.chains = 6, n.iter = 3e4, ... it would take a long time. I have run a non-spatial analysis in the past with these data, but that gave me similar warnings. I am mainly interested in the spatial analysis, because so far there seems to be an effect of IBD in these data. I have run a hierarchical AMOVA that indicates a panmictic population, with a small percentage of the variation coming from one of my stratifications. My STRUCTURE analysis fails to find a solution (multiple "peaks" of deltaK, all with relatively low magnitude).

I do wonder, could the poor mixing and ESS warnings at K > 1 be because there is no ancestral K value greater than 1 in my data? Is this an unreasonable assumption, or can I move forward ignoring those warnings and see what the cross-validation and layer contributions say?

Let me know what other information or data you may need. I appreciate any assistance you can provide.

Make thin value adjustable?

Hi!

In my opinion, it would be really cool if we could easily tune the value of the thin parameter used in the rstan::sampling function (inside the conStruct command). The default ifelse(n.iter/500 > 1, floor(n.iter/500), 1) is sensible, but I very often get a warning that my ESS value is too small (despite visually reaching a "fuzzy caterpillar" stage, albeit with only 250 values), and I think changing the thin parameter could help.
I think I could write the code if needed; it should not be a lot of work from what I see in the function.
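
Roughly what I have in mind (a sketch of a hypothetical thin argument, not the current conStruct signature; user.thin, stan.model, and data.block stand in for the internals):

# Hypothetical: expose thin in conStruct() and forward it to sampling(),
# keeping the current default when the user doesn't set it.
thin <- if (is.null(user.thin)) ifelse(n.iter/500 > 1, floor(n.iter/500), 1) else user.thin
model.fit <- rstan::sampling(object = stan.model, data = data.block,
                             chains = n.chains, iter = n.iter, thin = thin)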

What do you think?

structure2conStruct

Hello,

I'm trying to use conStruct to re-analyze my data, as the reviewers at Molecular Ecology suggested for my paper.
I used SPIDER to convert my VCF file to STRUCTURE format, and Stacks can also output a STRUCTURE data format.
The problem is that the arrangement is two rows per individual and one column per locus. As you said in the manual, there need to be two columns per locus and one row per individual, and I do not know how to do this. How did you get your STRUCTURE data arranged like that?
Is there another way to convert data for use with your program, for example directly from a VCF?
I would appreciate it a lot if you can help me with this issue; a sketch of the reshaping I'm after is below.
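
A minimal base-R sketch of the conversion (note: structure2conStruct may handle two-rows-per-individual input directly via an onerowperind-style argument; I'm not certain, so treat that as an assumption to check against ?structure2conStruct):

# Collapse a two-rows-per-individual STRUCTURE genotype matrix into one
# row per individual with two adjacent columns per locus.
# str.mat: numeric matrix with 2*N rows (consecutive row pairs are the
# two allele copies of one individual) and L locus columns.
collapse.str.rows <- function(str.mat) {
    stopifnot(nrow(str.mat) %% 2 == 0)
    n.loci <- ncol(str.mat)
    hap1 <- str.mat[seq(1, nrow(str.mat), by = 2), , drop = FALSE]
    hap2 <- str.mat[seq(2, nrow(str.mat), by = 2), , drop = FALSE]
    out <- matrix(NA, nrow = nrow(hap1), ncol = 2 * n.loci)
    out[, seq(1, 2 * n.loci, by = 2)] <- hap1   # allele copy 1 at each locus
    out[, seq(2, 2 * n.loci, by = 2)] <- hap2   # allele copy 2 at each locus
    return(out)
}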
Thanks
