Using bioalcidaejdk (lindenb.github.io/jvarkit/BioAlcidaeJdk.html)
with the following script:
final Map<String,Counter<GenotypeType>> sample2count = new HashMap<>();
stream().flatMap(V->V.getGenotypes().stream()).forEach(G->{
    final String sn = G.getSampleName();
    Counter<GenotypeType> c = sample2count.get(sn);
    if(c==null) {
        c = new Counter<GenotypeType>();
        sample2count.put(sn,c);
    }
    c.incr(G.getType());
});
out.print("#name");
for(GenotypeType gt:GenotypeType.values()) out.print("\t"+gt.name());
out.println();
for(final String sn: sample2count.keySet()) {
    out.print(sn);
    Counter<GenotypeType> c = sample2count.get(sn);
    for(GenotypeType gt:GenotypeType.values()) out.print("\t"+c.count(gt));
    out.println();
}
Usage:
java -jar dist/bioalcidaejdk.jar -f biostar.code src/test/resources/rotavirus_rf.vcf.gz
This prints, for each sample, the number of genotypes of each type:
#name NO_CALL HOM_REF HET HOM_VAR UNAVAILABLE MIXED
S3 0 30 7 8 0 0
S4 0 31 7 7 0 0
S5 0 37 0 8 0 0
S1 0 36 7 2 0 0
S2 0 30 7 8 0 0
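
For readers who want to see the counting logic on its own, here is a minimal sketch in plain Java that needs neither jvarkit nor htsjdk. The names GenotypeTally, GType and Call are hypothetical stand-ins (GType mirrors the columns printed above, Call replaces a real Genotype object), and the jvarkit Counter is swapped for an EnumMap filled with Map.merge, so this illustrates the idea rather than what bioalcidaejdk does internally.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GenotypeTally {
    // Hypothetical stand-in for htsjdk's GenotypeType, so no external jar is needed.
    public enum GType { NO_CALL, HOM_REF, HET, HOM_VAR, UNAVAILABLE, MIXED }

    // Hypothetical (sample name, genotype type) pair standing in for a Genotype.
    public record Call(String sample, GType type) {}

    // Count how many times each genotype type occurs for each sample.
    public static Map<String, Map<GType, Integer>> tally(List<Call> calls) {
        Map<String, Map<GType, Integer>> counts = new LinkedHashMap<>();
        for (Call c : calls) {
            // Create the per-sample map on first sight, then add 1 to the type's count.
            counts.computeIfAbsent(c.sample(), k -> new EnumMap<>(GType.class))
                  .merge(c.type(), 1, Integer::sum);
        }
        return counts;
    }

    // Render a tab-separated table: a header row, then one row per sample.
    public static String format(Map<String, Map<GType, Integer>> counts) {
        StringBuilder sb = new StringBuilder("#name");
        for (GType t : GType.values()) sb.append('\t').append(t.name());
        sb.append('\n');
        for (Map.Entry<String, Map<GType, Integer>> e : counts.entrySet()) {
            sb.append(e.getKey());
            // Types never seen for this sample are reported as 0.
            for (GType t : GType.values()) sb.append('\t').append(e.getValue().getOrDefault(t, 0));
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Call> calls = List.of(
            new Call("S1", GType.HOM_REF), new Call("S2", GType.HET),
            new Call("S1", GType.HET), new Call("S2", GType.HET));
        System.out.print(format(tally(calls)));
    }
}
```

The sketch uses a LinkedHashMap so that samples are printed in the order they are first seen. The script above uses a HashMap, which is why its samples do not come out sorted (S3, S4, S5, S1, S2).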