Answer by Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087), 2 hours ago

Using bioalcidaejdk (lindenb.github.io/jvarkit/BioAlcidaeJdk.html) with the following script:

// Per-sample tally of genotype types (HOM_REF, HET, HOM_VAR, ...).
final Map<String,Counter<GenotypeType>> sample2count = new HashMap<>();
// stream() is the stream of variants supplied by bioalcidaejdk.
stream().flatMap(V->V.getGenotypes().stream()).forEach(G->{
    final String sn = G.getSampleName();
    Counter<GenotypeType> c = sample2count.get(sn);
    if(c==null) {
        c = new Counter<GenotypeType>();
        sample2count.put(sn,c);
    }
    c.incr(G.getType());
});


// Header: one tab-separated column per genotype type.
out.print("#name");
for(GenotypeType gt:GenotypeType.values()) out.print("\t"+gt.name());
out.println();

// One row per sample with the count of each genotype type.
for(final String sn: sample2count.keySet()) {
    out.print(sn);
    Counter<GenotypeType> c = sample2count.get(sn);
    for(GenotypeType gt:GenotypeType.values()) out.print("\t"+c.count(gt));
    out.println();
}

Usage (with the script saved as biostar.code):

java -jar dist/bioalcidaejdk.jar -f biostar.code src/test/resources/rotavirus_rf.vcf.gz

This prints, for each sample, the number of genotypes of each type:

#name  NO_CALL  HOM_REF  HET  HOM_VAR  UNAVAILABLE  MIXED
S3     0        30       7    8        0            0
S4     0        31       7    7        0            0
S5     0        37       0    8        0            0
S1     0        36       7    2        0            0
S2     0        30       7    8        0            0
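
For readers who want to see the counting logic outside of bioalcidaejdk, here is a minimal plain-Java sketch of the same per-sample tally. It does not use htsjdk or jvarkit: the `GType` enum, the `Call` class and the sample data in `main` are hypothetical stand-ins for illustration only (the enum simply reuses the column names from the header above), and the class name `GenotypeTally` is my own.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GenotypeTally {
    // Hypothetical stand-in for the genotype types; same names and order as the header above.
    enum GType { NO_CALL, HOM_REF, HET, HOM_VAR, UNAVAILABLE, MIXED }

    // Hypothetical (sample, genotype type) observation, one per genotype in the VCF.
    static final class Call {
        final String sample;
        final GType type;
        Call(String sample, GType type) { this.sample = sample; this.type = type; }
    }

    // Count how many times each genotype type occurs for each sample.
    static Map<String, Map<GType, Integer>> tally(List<Call> calls) {
        Map<String, Map<GType, Integer>> counts = new LinkedHashMap<>();
        for (Call c : calls) {
            counts.computeIfAbsent(c.sample, k -> new EnumMap<GType, Integer>(GType.class))
                  .merge(c.type, 1, Integer::sum);
        }
        return counts;
    }

    // Render the tally as a tab-separated table: header line, then one row per sample.
    static String format(Map<String, Map<GType, Integer>> counts) {
        StringBuilder sb = new StringBuilder("#name");
        for (GType t : GType.values()) sb.append('\t').append(t.name());
        sb.append('\n');
        for (Map.Entry<String, Map<GType, Integer>> e : counts.entrySet()) {
            sb.append(e.getKey());
            for (GType t : GType.values()) {
                // Types never seen for this sample are reported as 0.
                sb.append('\t').append(e.getValue().getOrDefault(t, 0));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up observations, not taken from rotavirus_rf.vcf.gz.
        List<Call> calls = List.of(
            new Call("S1", GType.HOM_REF),
            new Call("S1", GType.HET),
            new Call("S2", GType.HOM_VAR),
            new Call("S1", GType.HOM_REF));
        System.out.print(format(tally(calls)));
    }
}
```

The sketch keeps samples in first-seen order by using a `LinkedHashMap`. The script above uses a `HashMap`, which is why its rows come out in no particular order (S3, S4, S5, S1, S2).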


