Created Object Name | Type | Description |
---|---|---|
10158.6.trim150.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.trim150.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.ftrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.ftrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.ktrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.ktrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.atrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.atrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.aqbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.aqbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.aqtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.aqtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.qbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.qbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.qtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.qtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.bb1.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.bb1.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.bb2.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.bb2.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.bb3.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.bb3.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.bb4.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.bb4.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.bb5.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.bb5.MEGAHIT.assembly | Assembly | Assembled contigs |
10158.6.bb6.fq_reads | PairedEndLibrary | Imported Reads |
10158.6.bb6.MEGAHIT.assembly | Assembly | Assembled contigs |

Created Object Name | Type | Description |
---|---|---|
9117.8.trim150.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.trim150.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.ftrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.ftrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.ktrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.ktrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.atrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.atrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.aqbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.aqbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.aqtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.aqtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.qbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.qbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.qtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.qtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.bb1.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.bb1.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.bb2.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.bb2.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.bb3.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.bb3.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.bb4.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.bb4.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.bb5.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.bb5.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.8.bb6.fq_reads | PairedEndLibrary | Imported Reads |
9117.8.bb6.MEGAHIT.assembly | Assembly | Assembled contigs |

Created Object Name | Type | Description |
---|---|---|
9108.2.trim150.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.trim150.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.ftrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.ftrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.ktrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.ktrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.atrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.atrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.aqbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.aqbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.aqtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.aqtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.qbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.qbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.qtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.qtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.bb1.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.bb1.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.bb2.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.bb2.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.bb3.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.bb3.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.bb4.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.bb4.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.bb5.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.bb5.MEGAHIT.assembly | Assembly | Assembled contigs |
9108.2.bb6.fq_reads | PairedEndLibrary | Imported Reads |
9108.2.bb6.MEGAHIT.assembly | Assembly | Assembled contigs |

10158.6.MEGAHIT.assembly |
10158.6.QC.MEGAHIT.assembly |
10158.6.trim150.MEGAHIT.assembly |
10158.6.ftrimmed.MEGAHIT.assembly |
10158.6.ktrimmed.MEGAHIT.assembly |
10158.6.atrimmed.MEGAHIT.assembly |
10158.6.aqbtrimmed.MEGAHIT.assembly |
10158.6.aqtrimmed.MEGAHIT.assembly |
10158.6.qbtrimmed.MEGAHIT.assembly |
10158.6.qtrimmed.MEGAHIT.assembly |
10158.6.bb1.MEGAHIT.assembly |
10158.6.bb2.MEGAHIT.assembly |
10158.6.bb3.MEGAHIT.assembly |
10158.6.bb4.MEGAHIT.assembly |
10158.6.bb5.MEGAHIT.assembly |
10158.6.bb6.MEGAHIT.assembly |
9117.8.MEGAHIT.assembly |
9117.8.QC.MEGAHIT.assembly |
9117.8.trim150.MEGAHIT.assembly |
9117.8.ftrimmed.MEGAHIT.assembly |
9117.8.ktrimmed.MEGAHIT.assembly |
9117.8.atrimmed.MEGAHIT.assembly |
9117.8.aqbtrimmed.MEGAHIT.assembly |
9117.8.aqtrimmed.MEGAHIT.assembly |
9117.8.qbtrimmed.MEGAHIT.assembly |
9117.8.qtrimmed.MEGAHIT.assembly |
9117.8.bb1.MEGAHIT.assembly |
9117.8.bb2.MEGAHIT.assembly |
9117.8.bb3.MEGAHIT.assembly |
9117.8.bb4.MEGAHIT.assembly |
9117.8.bb5.MEGAHIT.assembly |
9117.8.bb6.MEGAHIT.assembly |
9108.2.MEGAHIT.assembly |
9108.2.QC.MEGAHIT.assembly |
9108.2.trim150.MEGAHIT.assembly |
9108.2.ftrimmed.MEGAHIT.assembly |
9108.2.ktrimmed.MEGAHIT.assembly |
9108.2.atrimmed.MEGAHIT.assembly |
9108.2.aqbtrimmed.MEGAHIT.assembly |
9108.2.aqtrimmed.MEGAHIT.assembly |
9108.2.qbtrimmed.MEGAHIT.assembly |
9108.2.qtrimmed.MEGAHIT.assembly |
9108.2.bb1.MEGAHIT.assembly |
9108.2.bb2.MEGAHIT.assembly |
9108.2.bb3.MEGAHIT.assembly |
9108.2.bb4.MEGAHIT.assembly |
9108.2.bb5.MEGAHIT.assembly |
9108.2.bb6.MEGAHIT.assembly |

Created Object Name | Type | Description |
---|---|---|
9117.7.trim150.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.trim150.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.ftrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.ftrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.ktrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.ktrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.atrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.atrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.aqbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.aqbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.aqtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.aqtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.qbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.qbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.qtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.qtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.bb1.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.bb1.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.bb2.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.bb2.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.bb3.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.bb3.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.bb4.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.bb4.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.bb5.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.bb5.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.7.bb6.fq_reads | PairedEndLibrary | Imported Reads |
9117.7.bb6.MEGAHIT.assembly | Assembly | Assembled contigs |

Created Object Name | Type | Description |
---|---|---|
11306.3.trim150.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.trim150.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.ftrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.ftrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.ktrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.ktrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.atrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.atrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.aqbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.aqbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.aqtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.aqtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.qbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.qbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.qtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.qtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.bb1.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.bb1.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.bb2.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.bb2.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.bb3.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.bb3.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.bb4.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.bb4.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.bb5.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.bb5.MEGAHIT.assembly | Assembly | Assembled contigs |
11306.3.bb6.fq_reads | PairedEndLibrary | Imported Reads |
11306.3.bb6.MEGAHIT.assembly | Assembly | Assembled contigs |

Created Object Name | Type | Description |
---|---|---|
9117.4.trim150.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.trim150.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.ftrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.ftrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.ktrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.ktrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.atrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.atrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.aqbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.aqbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.aqtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.aqtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.qbtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.qbtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.qtrimmed.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.qtrimmed.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.bb1.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.bb1.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.bb2.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.bb2.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.bb3.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.bb3.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.bb4.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.bb4.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.bb5.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.bb5.MEGAHIT.assembly | Assembly | Assembled contigs |
9117.4.bb6.fq_reads | PairedEndLibrary | Imported Reads |
9117.4.bb6.MEGAHIT.assembly | Assembly | Assembled contigs |

#MAG counts were correlated with read counts of trimmed and decontaminated reads
#MAG counts were correlated with read counts of raw reads at alpha = 0.5
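
The two statements above describe correlating MAG (bin) counts with read counts. As a minimal sketch of that idea, the block below computes a Pearson correlation and a least-squares line for the 16 read sets of sample 10158.6, using values copied from `td_tMreads` and `td_bins` in the cell that follows. The helper names `pearson_r` and `fit_line` and the variable names are hypothetical, and the sketch uses plain NumPy rather than the `statsmodels` `ols` interface imported by the notebook, so it illustrates the calculation and is not the notebook's own analysis.

```python
import numpy as np

# Values copied from td_tMreads[:16] and td_bins[:16] (sample 10158.6).
# Variable names here are illustrative, not taken from the notebook.
ten_million_reads = np.array([
    36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862,
    34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964,
    30.615651, 35.3058416, 34.4731868, 34.2527702,
])
bin_counts = np.array([78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83])


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same length")
    return float(np.corrcoef(x, y)[0, 1])


def fit_line(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    slope, intercept = np.polyfit(np.asarray(x, dtype=float),
                                  np.asarray(y, dtype=float), 1)
    return float(slope), float(intercept)


r = pearson_r(ten_million_reads, bin_counts)
slope, intercept = fit_line(ten_million_reads, bin_counts)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The same two helpers could be applied to any pair of the `td_` or `r_` lists, provided both lists are in the same sample order.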
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
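# Data
# td_* lists: 16 read sets for each of 6 samples (raw, qc, and 14 trimmed/decontaminated variants).
# r_* lists: raw read sets only.
# *_tMreads: read count in units of ten million (36.0129894 = 360,129,894 reads).
# *_Bbases: total bases in billions (360,129,894 reads x 151 bases = 54.379613994).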
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({
    'td_tMreads': td_tMreads,
    'td_Bbases': td_Bbases,
    'td_bins': td_bins,
    'td_Mean_Completeness': td_Mean_Completeness,
    'td_Mean_Contamination': td_Mean_Contamination,
    'td_good_bins': td_good_bins,
    'td_good_Mean_Completeness': td_good_Mean_Completeness,
    'td_good_Mean_Contamination': td_good_Mean_Contamination,
    'Mix_Group': Mix_Group,
})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({
    'r_tMreads': r_tMreads,
    'r_Bbases': r_Bbases,
    'r_bins': r_bins,
    'r_Mean_Completeness': r_Mean_Completeness,
    'r_Mean_Contamination': r_Mean_Contamination,
    'r_good_bins': r_good_bins,
    'r_good_Mean_Completeness': r_good_Mean_Completeness,
    'r_good_Mean_Contamination': r_good_Mean_Contamination,
})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_bins ~ tMreads", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_bins ~ tMreads', data=df).fit()
model2 = ols('r_bins ~ tMreads', data=df2).fit()
#adj. R^2 is the coefficient of determination adjusted for the number of predictors
#with a single predictor, the unadjusted R^2 equals the square of Pearson's r
#for model2 (r_bins ~ tMreads): sqrt(adj. R^2) = sqrt(0.146) = 0.382
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'tMreads', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'tMreads', fig=fig2)
#post estimation: refit the same mixed model by maximum likelihood (ML) instead of REML
model3 = smf.mixedlm("td_bins ~ tMreads", data=df, groups=df["Mix_Group"])
modelf3 = model3.fit(reml=False)
print(modelf3.summary())
         Mixed Linear Model Regression Results
=======================================================
Model:            MixedLM Dependent Variable: td_bins
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              27.4833
Min. group size:  16      Likelihood:         -306.4298
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
-------------------------------------------------------
           Coef.   Std.Err.   z   P>|z|  [0.025  0.975]
-------------------------------------------------------
Intercept   9.712   12.537  0.775 0.439 -14.860  34.284
tMreads     2.095    0.337  6.222 0.000   1.435   2.755
Group Var 231.903   29.013
=======================================================

                            OLS Regression Results
==============================================================================
Dep. Variable:                td_bins   R-squared:                       0.077
Model:                            OLS   Adj. R-squared:                  0.068
Method:                 Least Squares   F-statistic:                     7.888
Date:                Sun, 28 Mar 2021   Prob (F-statistic):            0.00605
Time:                        22:25:31   Log-Likelihood:                -392.51
No. Observations:                  96   AIC:                             789.0
Df Residuals:                      94   BIC:                             794.1
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     39.3912     13.602      2.896      0.005      12.384      66.399
tMreads        1.1758      0.419      2.809      0.006       0.345       2.007
==============================================================================
Omnibus:                       35.897   Durbin-Watson:                   0.385
Prob(Omnibus):                  0.000   Jarque-Bera (JB):                9.857
Skew:                           0.506   Prob(JB):                      0.00724
Kurtosis:                       1.800   Cond. No.                         297.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

                            OLS Regression Results
==============================================================================
Dep. Variable:                 r_bins   R-squared:                       0.185
Model:                            OLS   Adj. R-squared:                  0.146
Method:                 Least Squares   F-statistic:                     4.771
Date:                Sun, 28 Mar 2021   Prob (F-statistic):             0.0404
Time:                        22:25:31   Log-Likelihood:                -103.89
No. Observations:                  23   AIC:                             211.8
Df Residuals:                      21   BIC:                             214.1
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     21.9212     25.577      0.857      0.401     -31.269      75.111
tMreads        1.6644      0.762      2.184      0.040       0.080       3.249
==============================================================================
Omnibus:                        1.524   Durbin-Watson:                   2.039
Prob(Omnibus):                  0.467   Jarque-Bera (JB):                1.220
Skew:                           0.536   Prob(JB):                        0.543
Kurtosis:                       2.652   Cond. No.                         178.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

         Mixed Linear Model Regression Results
=======================================================
Model:            MixedLM Dependent Variable: td_bins
No. Observations: 96      Method:             ML
No. Groups:       6       Scale:              27.1879
Min. group size:  16      Likelihood:         -308.9553
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
-------------------------------------------------------
           Coef.   Std.Err.   z   P>|z|  [0.025  0.975]
-------------------------------------------------------
Intercept   9.925   12.192  0.814 0.416 -13.970  33.820
tMreads     2.088    0.334  6.251 0.000   1.433   2.743
Group Var 191.725   22.176
=======================================================
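The REML summary for `td_bins ~ tMreads` reports a between-group variance (Group Var) of 231.903 and a residual variance (Scale) of 27.4833. One way to read those two numbers together is the intraclass correlation, the share of total variance attributable to `Mix_Group`. The following is a minimal sketch, not part of the original analysis: the function name is hypothetical, and the two values are hard-coded from the printed summary rather than read from the fitted result.

```python
def intraclass_correlation(group_var: float, residual_var: float) -> float:
    """Return the share of total variance attributable to the grouping factor."""
    return group_var / (group_var + residual_var)


# Values copied from the REML summary for td_bins ~ tMreads (assumed to be
# the random-intercept variance and the residual variance, respectively).
icc = intraclass_correlation(group_var=231.903, residual_var=27.4833)
print(f"ICC = {icc:.3f}")
```

A value this close to 1 would mean most of the variation in bin counts lies between mix groups rather than within them, which is consistent with the low Durbin-Watson statistic (0.385) of the pooled OLS fit on the same 96 rows. With a fitted result in hand, the same quantities should be readable from `modelf.cov_re` and `modelf.scale`, though that is worth confirming against the statsmodels version in use.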
#MAG counts were correlated with base counts of trimmed and decontaminated reads
#MAG counts were correlated with base counts of raw reads at alpha = 0.05
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({
    'td_tMreads': td_tMreads,
    'td_Bbases': td_Bbases,
    'td_bins': td_bins,
    'td_Mean_Completeness': td_Mean_Completeness,
    'td_Mean_Contamination': td_Mean_Contamination,
    'td_good_bins': td_good_bins,
    'td_good_Mean_Completeness': td_good_Mean_Completeness,
    'td_good_Mean_Contamination': td_good_Mean_Contamination,
    'Mix_Group': Mix_Group,
})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({
    'r_tMreads': r_tMreads,
    'r_Bbases': r_Bbases,
    'r_bins': r_bins,
    'r_Mean_Completeness': r_Mean_Completeness,
    'r_Mean_Contamination': r_Mean_Contamination,
    'r_good_bins': r_good_bins,
    'r_good_Mean_Completeness': r_good_Mean_Completeness,
    'r_good_Mean_Contamination': r_good_Mean_Contamination,
})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_bins ~ Bbases", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_bins ~ Bbases', data=df).fit()
model2 = ols('r_bins ~ Bbases', data=df2).fit()
#adj. R^2 is the coefficient of determination adjusted for the number of predictors
#with a single predictor, the unadjusted R^2 equals the square of Pearson's r
#for model2 (r_bins ~ Bbases): sqrt(adj. R^2) = sqrt(0.146) = 0.382
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'Bbases', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'Bbases', fig=fig2)
         Mixed Linear Model Regression Results
=======================================================
Model:            MixedLM Dependent Variable: td_bins
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              21.6539
Min. group size:  16      Likelihood:         -296.4177
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
-------------------------------------------------------
           Coef.   Std.Err.   z   P>|z|  [0.025  0.975]
-------------------------------------------------------
Intercept  14.715    9.563  1.539 0.124  -4.028  33.458
Bbases      1.320    0.154  8.565 0.000   1.018   1.622
Group Var 226.343   31.822
=======================================================

                            OLS Regression Results
==============================================================================
Dep. Variable:                td_bins   R-squared:                       0.101
Model:                            OLS   Adj. R-squared:                  0.092
Method:                 Least Squares   F-statistic:                     10.61
Date:                Sun, 28 Mar 2021   Prob (F-statistic):            0.00157
Time:                        22:25:41   Log-Likelihood:                -391.24
No. Observations:                  96   AIC:                             786.5
Df Residuals:                      94   BIC:                             791.6
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     36.3272     12.685      2.864      0.005      11.142      61.513
Bbases         0.8649      0.266      3.257      0.002       0.338       1.392
==============================================================================
Omnibus:                       46.989   Durbin-Watson:                   0.336
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               10.364
Skew:                           0.498   Prob(JB):                      0.00562
Kurtosis:                       1.736   Cond. No.                         413.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

                            OLS Regression Results
==============================================================================
Dep. Variable:                 r_bins   R-squared:                       0.184
Model:                            OLS   Adj. R-squared:                  0.146
Method:                 Least Squares   F-statistic:                     4.749
Date:                Sun, 28 Mar 2021   Prob (F-statistic):             0.0409
Time:                        22:25:41   Log-Likelihood:                -103.90
No. Observations:                  23   AIC:                             211.8
Df Residuals:                      21   BIC:                             214.1
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     22.0775     25.566      0.864      0.398     -31.090      75.245
Bbases         1.0994      0.505      2.179      0.041       0.050       2.149
==============================================================================
Omnibus:                        1.517   Durbin-Watson:                   2.039
Prob(Omnibus):                  0.468   Jarque-Bera (JB):                1.218
Skew:                           0.535   Prob(JB):                        0.544
Kurtosis:                       2.647   Cond. No.                         268.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
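The comments in the cell above take the square root of the adjusted R-squared to obtain a correlation. For a one-predictor least-squares fit, it is the unadjusted R-squared that equals the square of Pearson's r, so the two can be checked against each other. The sketch below does that for the raw-read data using only NumPy. It is an illustration under stated assumptions: the variable names `r`, `r_squared_ols`, and `adj_r_squared` are mine, and the two lists are copied from `r_Bbases` and `r_bins` in the cell above.

```python
import numpy as np

# Raw-read data copied from the cell above (23 libraries).
r_Bbases = np.array([
    54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294,
    56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407,
    41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292,
    32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926,
    50.541420494, 42.590795146, 55.8244584,
])
r_bins = np.array([
    65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71,
    95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52,
], dtype=float)

# Pearson correlation between base counts and bin counts.
r = np.corrcoef(r_Bbases, r_bins)[0, 1]

# R^2 of the least-squares line r_bins ~ Bbases, computed from its residuals.
slope, intercept = np.polyfit(r_Bbases, r_bins, deg=1)
residuals = r_bins - (intercept + slope * r_Bbases)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((r_bins - r_bins.mean()) ** 2)
r_squared_ols = 1.0 - ss_res / ss_tot

# Adjusted R^2 for n observations and p = 1 predictor.
n, p = len(r_bins), 1
adj_r_squared = 1.0 - (1.0 - r_squared_ols) * (n - 1) / (n - p - 1)

print(f"Pearson r = {r:.3f}, r^2 = {r ** 2:.3f}")
print(f"OLS R^2 = {r_squared_ols:.3f}, adjusted R^2 = {adj_r_squared:.3f}")
```

If the printed values agree with the `r_bins` OLS summary, the correlation to report for raw reads is the square root of R-squared rather than of the adjusted value, which is always somewhat smaller.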
#Average MAG completeness was not correlated with read counts of trimmed and decontaminated reads
#Average MAG completeness was not correlated with read counts of raw reads at alpha = 0.05
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_Mean_Completeness ~ tMreads", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_Mean_Completeness ~ tMreads', data=df).fit()
model2 = ols('r_Mean_Completeness ~ tMreads', data=df2).fit()
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'tMreads', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'tMreads', fig=fig2)
Mixed Linear Model Regression Results
==================================================================
Model:            MixedLM Dependent Variable: td_Mean_Completeness
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              5.9304
Min. group size:  16      Likelihood:         -226.7723
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
------------------------------------------------------------------
               Coef.   Std.Err.     z      P>|z|   [0.025   0.975]
------------------------------------------------------------------
Intercept     57.240      4.077   14.039   0.000   49.249   65.232
tMreads       -0.049      0.125   -0.393   0.694   -0.294    0.196
Group Var      1.826      0.610
==================================================================

OLS Regression Results
================================================================================
Dep. Variable:     td_Mean_Completeness   R-squared:                       0.001
Model:                              OLS   Adj. R-squared:                 -0.010
Method:                   Least Squares   F-statistic:                   0.06334
Date:                  Sun, 28 Mar 2021   Prob (F-statistic):              0.802
Time:                          22:28:24   Log-Likelihood:                -230.60
No. Observations:                    96   AIC:                             465.2
Df Residuals:                        94   BIC:                             470.3
Df Model:                             1
Covariance Type:              nonrobust
================================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept     55.0264      2.518     21.850      0.000      50.026      60.027
tMreads        0.0195      0.078      0.252      0.802      -0.134       0.173
================================================================================
Omnibus:                        0.800   Durbin-Watson:                   1.696
Prob(Omnibus):                  0.670   Jarque-Bera (JB):                0.887
Skew:                          -0.119   Prob(JB):                        0.642
Kurtosis:                       2.593   Cond. No.                         297.
================================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

OLS Regression Results
===============================================================================
Dep. Variable:     r_Mean_Completeness   R-squared:                       0.010
Model:                             OLS   Adj. R-squared:                 -0.037
Method:                  Least Squares   F-statistic:                    0.2155
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):              0.647
Time:                         22:28:24   Log-Likelihood:                -68.304
No. Observations:                   23   AIC:                             140.6
Df Residuals:                       21   BIC:                             142.9
Df Model:                            1
Covariance Type:             nonrobust
===============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------
Intercept     53.8117      5.443      9.887      0.000      42.493      65.131
tMreads        0.0753      0.162      0.464      0.647      -0.262       0.412
===============================================================================
Omnibus:                        0.192   Durbin-Watson:                   1.901
Prob(Omnibus):                  0.908   Jarque-Bera (JB):                0.159
Skew:                           0.159   Prob(JB):                        0.923
Kurtosis:                       2.745   Cond. No.                         178.
===============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
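The raw-read OLS summary for r_Mean_Completeness ~ tMreads can be cross-checked by hand, because in a one-predictor regression the slope's t statistic, the F statistic and R-squared are tied together (F = t², t = r·sqrt((n-2)/(1-r²))). The following is a minimal sketch using only the Python standard library, not the statsmodels routine this notebook calls. The helper name `simple_ols` and the constant `T_CRIT_DF21` are illustrative; 2.080 is the two-sided 5% critical value of Student's t with 21 degrees of freedom, and alpha = 0.05 is an assumption.

```python
import math

# Raw-read values copied from r_tMreads and r_Mean_Completeness in this notebook.
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]


def simple_ols(x, y):
    """Return slope, intercept, R-squared, t and F statistics for y ~ x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    # With a single predictor, t follows from R-squared and F equals t squared.
    t_stat = math.copysign(math.sqrt(r2 * (n - 2) / (1 - r2)), slope)
    f_stat = t_stat ** 2
    return slope, intercept, r2, t_stat, f_stat


slope, intercept, r2, t_stat, f_stat = simple_ols(r_tMreads, r_Mean_Completeness)

# Assumed threshold: two-sided critical t for alpha = 0.05 with n - 2 = 21 df.
T_CRIT_DF21 = 2.080
significant = abs(t_stat) > T_CRIT_DF21

print(f"slope={slope:.4f} intercept={intercept:.4f} R2={r2:.3f} t={t_stat:.3f} F={f_stat:.4f}")
print("significant at alpha = 0.05:", significant)
```

Comparing the printed slope, t and F values with the coef, t and F-statistic entries of the statsmodels summary is a quick way to confirm that the data lists in the cell are the ones the model was actually fitted on.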
#Average MAG completeness was not correlated with base counts of trimmed and decontaminated reads (mixed model p = 0.615; pooled OLS p = 0.886)
#Average MAG completeness was not correlated with base counts of raw reads at alpha = 0.05 (OLS p = 0.645)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_Mean_Completeness ~ Bbases", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_Mean_Completeness ~ Bbases', data=df).fit()
model2 = ols('r_Mean_Completeness ~ Bbases', data=df2).fit()
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'Bbases', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'Bbases', fig=fig2)
Mixed Linear Model Regression Results
==================================================================
Model:            MixedLM Dependent Variable: td_Mean_Completeness
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              5.9308
Min. group size:  16      Likelihood:         -227.2845
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
------------------------------------------------------------------
               Coef.   Std.Err.     z      P>|z|   [0.025   0.975]
------------------------------------------------------------------
Intercept     57.331      3.383   16.945   0.000   50.700   63.962
Bbases        -0.035      0.070   -0.503   0.615   -0.173    0.102
Group Var      1.774      0.587
==================================================================

OLS Regression Results
================================================================================
Dep. Variable:     td_Mean_Completeness   R-squared:                       0.000
Model:                              OLS   Adj. R-squared:                 -0.010
Method:                   Least Squares   F-statistic:                   0.02058
Date:                  Sun, 28 Mar 2021   Prob (F-statistic):              0.886
Time:                          22:28:41   Log-Likelihood:                -230.62
No. Observations:                    96   AIC:                             465.2
Df Residuals:                        94   BIC:                             470.4
Df Model:                             1
Covariance Type:              nonrobust
================================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------
Intercept     55.3173      2.380     23.240      0.000      50.591      60.043
Bbases         0.0071      0.050      0.143      0.886      -0.092       0.106
================================================================================
Omnibus:                        0.847   Durbin-Watson:                   1.695
Prob(Omnibus):                  0.655   Jarque-Bera (JB):                0.917
Skew:                          -0.117   Prob(JB):                        0.632
Kurtosis:                       2.583   Cond. No.                         413.
================================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

OLS Regression Results
===============================================================================
Dep. Variable:     r_Mean_Completeness   R-squared:                       0.010
Model:                             OLS   Adj. R-squared:                 -0.037
Method:                  Least Squares   F-statistic:                    0.2185
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):              0.645
Time:                         22:28:41   Log-Likelihood:                -68.302
No. Observations:                   23   AIC:                             140.6
Df Residuals:                       21   BIC:                             142.9
Df Model:                            1
Covariance Type:             nonrobust
===============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------
Intercept     53.7971      5.438      9.893      0.000      42.489      65.105
Bbases         0.0502      0.107      0.467      0.645      -0.173       0.273
===============================================================================
Omnibus:                        0.191   Durbin-Watson:                   1.902
Prob(Omnibus):                  0.909   Jarque-Bera (JB):                0.158
Skew:                           0.158   Prob(JB):                        0.924
Kurtosis:                       2.745   Cond. No.                         268.
===============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
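The MixedLM summaries report a z statistic and a P>|z| value for each fixed effect; the two are linked through the standard normal distribution, so the printed p-values can be recomputed from the printed z values. Below is a small sketch of that arithmetic using `statistics.NormalDist` from the standard library. The helper `two_sided_p` and the `ALPHA` threshold of 0.05 are illustrative assumptions, not part of the notebook's own code.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# (z, printed P>|z|) for the slope rows of the MixedLM summaries in this notebook.
reported = {
    "td_Mean_Completeness ~ tMreads": (-0.393, 0.694),
    "td_Mean_Completeness ~ Bbases": (-0.503, 0.615),
}

ALPHA = 0.05  # assumed significance level

for formula, (z, p_printed) in reported.items():
    p = two_sided_p(z)
    print(f"{formula}: z={z:+.3f} p={p:.3f} (printed {p_printed}) reject={p < ALPHA}")
```

Because the printed z values are rounded to three decimals, the recomputed p-values should agree with the summaries only to about three decimals.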
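#Average MAG contamination was not correlated with read counts of trimmed and decontaminated reads once Mix_Group was modeled as a random effect (mixed model p = 0.136); the pooled OLS fit, which ignores grouping, was significant (p = 8.72e-05)
#Average MAG contamination was not correlated with read counts of raw reads at alpha = 0.05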
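The mixed model and the pooled OLS fit reported for td_Mean_Contamination ~ tMreads disagree (P>|z| = 0.136 versus Prob (F-statistic) = 8.72e-05). One plausible reason is that the six Mix_Group samples differ in both read count and contamination. A rough way to probe this, sketched here with the standard library only, is to compare the pooled slope with the slope obtained after subtracting each group's mean from both variables. This is a within-group (fixed-effects style) check, not the REML random-intercept fit that `smf.mixedlm` performs; the helper names `slope` and `center_within_groups` are illustrative.

```python
# Values copied from td_tMreads and td_Mean_Contamination in this notebook;
# each of the six Mix_Group labels covers 16 consecutive rows.
groups = [g for g in ['10158.6', '9117.8', '9108.2', '9117.7', '11306.3', '9117.4'] for _ in range(16)]
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]


def slope(x, y):
    """Least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def center_within_groups(values, labels):
    """Subtract each group's mean from the values belonging to that group."""
    totals, counts = {}, {}
    for v, g in zip(values, labels):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, labels)]


# Slope that ignores grouping (what the pooled OLS fit estimates).
pooled_slope = slope(td_tMreads, td_Mean_Contamination)
# Slope using only variation inside each Mix_Group.
within_slope = slope(
    center_within_groups(td_tMreads, groups),
    center_within_groups(td_Mean_Contamination, groups),
)

print(f"pooled slope: {pooled_slope:.4f}")
print(f"within-group slope: {within_slope:.4f}")
```

If the within-group slope comes out well below the pooled slope, most of the pooled association reflects differences between samples rather than the effect of trimming within a sample.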
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_Mean_Contamination ~ tMreads", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_Mean_Contamination ~ tMreads', data=df).fit()
model2 = ols('r_Mean_Contamination ~ tMreads', data=df2).fit()
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'tMreads', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'tMreads', fig=fig2)
               Mixed Linear Model Regression Results
===================================================================
Model:            MixedLM Dependent Variable: td_Mean_Contamination
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              48.3367
Min. group size:  16      Likelihood:         -331.4206
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
-------------------------------------------------------------------
            Coef.   Std.Err.    z    P>|z|  [0.025  0.975]
-------------------------------------------------------------------
Intercept   47.905    15.522  3.086  0.002  17.482  78.329
tMreads      0.660     0.443  1.492  0.136  -0.207   1.528
Group Var  217.245    20.703
===================================================================

                            OLS Regression Results
==============================================================================
Dep. Variable:   td_Mean_Contamination   R-squared:                     0.152
Model:                             OLS   Adj. R-squared:                0.143
Method:                  Least Squares   F-statistic:                   16.82
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):         8.72e-05
Time:                         22:28:56   Log-Likelihood:              -393.32
No. Observations:                   96   AIC:                           790.6
Df Residuals:                       94   BIC:                           795.8
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
               coef    std err        t    P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
Intercept   13.3043     13.717    0.970    0.335    -13.931     40.540
tMreads      1.7316      0.422    4.102    0.000      0.893      2.570
==============================================================================
Omnibus:          13.222   Durbin-Watson:        0.517
Prob(Omnibus):     0.001   Jarque-Bera (JB):    14.931
Skew:              0.963   Prob(JB):          0.000573
Kurtosis:          3.151   Cond. No.              297.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

                            OLS Regression Results
==============================================================================
Dep. Variable:    r_Mean_Contamination   R-squared:                     0.115
Model:                             OLS   Adj. R-squared:                0.073
Method:                  Least Squares   F-statistic:                   2.733
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):            0.113
Time:                         22:28:56   Log-Likelihood:              -101.59
No. Observations:                   23   AIC:                           207.2
Df Residuals:                       21   BIC:                           209.4
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
               coef    std err        t    P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
Intercept   31.1563     23.136    1.347    0.192    -16.957     79.270
tMreads      1.1395      0.689    1.653    0.113     -0.294      2.573
==============================================================================
Omnibus:           1.877   Durbin-Watson:        1.151
Prob(Omnibus):     0.391   Jarque-Bera (JB):     0.610
Skew:              0.223   Prob(JB):             0.737
Kurtosis:          3.662   Cond. No.              178.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
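The mixed model and the pooled OLS fit above disagree on the tMreads slope (p = 0.136 versus p = 8.72e-05). One way to see why is to check how much of the variance sits between Mix_Group levels. The sketch below computes an intraclass correlation from the two variance terms in the MixedLM summary, assuming that "Group Var" is the random-intercept variance and "Scale" is the residual variance. The function name `intraclass_correlation` is illustrative and not part of the original notebook.

```python
def intraclass_correlation(group_var: float, residual_var: float) -> float:
    """Share of total variance attributable to differences between groups."""
    return group_var / (group_var + residual_var)


# Values copied from the MixedLM summary for td_Mean_Contamination ~ tMreads.
group_var = 217.245  # "Group Var": assumed to be the random-intercept variance
scale = 48.3367      # "Scale": assumed to be the residual (within-group) variance

icc = intraclass_correlation(group_var, scale)
print(f"ICC = {icc:.3f}")
```

A value close to 1 would mean that most of the spread in contamination comes from which Mix_Group a sample belongs to, which a pooled OLS fit cannot separate from the effect of read count.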
#Average MAG contamination was not correlated with base counts of trimmed and decontaminated reads (mixed model, p = 0.211)
#Average MAG contamination was not correlated with base counts of raw reads at alpha = 0.05 (OLS, p = 0.114)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_Mean_Contamination ~ Bbases", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_Mean_Contamination ~ Bbases', data=df).fit()
model2 = ols('r_Mean_Contamination ~ Bbases', data=df2).fit()
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'Bbases', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'Bbases', fig=fig2)
               Mixed Linear Model Regression Results
===================================================================
Model:            MixedLM Dependent Variable: td_Mean_Contamination
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              48.5993
Min. group size:  16      Likelihood:         -332.4019
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
-------------------------------------------------------------------
            Coef.   Std.Err.    z    P>|z|  [0.025  0.975]
-------------------------------------------------------------------
Intercept   55.584    12.528  4.437  0.000  31.029  80.139
Bbases       0.288     0.230  1.251  0.211  -0.163   0.738
Group Var  224.759    21.324
===================================================================

                            OLS Regression Results
==============================================================================
Dep. Variable:   td_Mean_Contamination   R-squared:                     0.136
Model:                             OLS   Adj. R-squared:                0.127
Method:                  Least Squares   F-statistic:                   14.83
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):         0.000215
Time:                         22:29:16   Log-Likelihood:              -394.19
No. Observations:                   96   AIC:                           792.4
Df Residuals:                       94   BIC:                           797.5
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
               coef    std err        t    P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
Intercept   19.1996     13.079    1.468    0.145     -6.770     45.169
Bbases       1.0544      0.274    3.851    0.000      0.511      1.598
==============================================================================
Omnibus:          12.843   Durbin-Watson:        0.551
Prob(Omnibus):     0.002   Jarque-Bera (JB):    14.479
Skew:              0.950   Prob(JB):          0.000718
Kurtosis:          3.108   Cond. No.              413.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

                            OLS Regression Results
==============================================================================
Dep. Variable:    r_Mean_Contamination   R-squared:                     0.115
Model:                             OLS   Adj. R-squared:                0.073
Method:                  Least Squares   F-statistic:                   2.722
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):            0.114
Time:                         22:29:16   Log-Likelihood:              -101.59
No. Observations:                   23   AIC:                           207.2
Df Residuals:                       21   BIC:                           209.5
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
               coef    std err        t    P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
Intercept   31.2555     23.121    1.352    0.191    -16.827     79.338
Bbases       0.7528      0.456    1.650    0.114     -0.196      1.702
==============================================================================
Omnibus:           1.868   Durbin-Watson:        1.152
Prob(Omnibus):     0.393   Jarque-Bera (JB):     0.604
Skew:              0.222   Prob(JB):             0.739
Kurtosis:          3.658   Cond. No.              268.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
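Whether a slope counts as a correlation here depends on the significance threshold it is compared against. The sketch below applies two thresholds to the Bbases slope p-values reported above: the conventional 0.05 (an assumption about the intended cutoff) and 0.5. The names `slope_p_values` and `significant` are illustrative and do not appear in the original notebook.

```python
# Slope p-values copied from the three summaries above (predictor: Bbases).
slope_p_values = {
    "mixed model, trimmed/decontaminated reads": 0.211,
    "pooled OLS, trimmed/decontaminated reads": 0.000215,
    "OLS, raw reads": 0.114,
}


def significant(p_value: float, alpha: float) -> bool:
    """Return True when the p-value falls below the chosen threshold."""
    return p_value < alpha


for alpha in (0.05, 0.5):
    print(f"alpha = {alpha}")
    for label, p in slope_p_values.items():
        verdict = "significant" if significant(p, alpha) else "not significant"
        print(f"  {label}: p = {p} -> {verdict}")
```

Only the pooled OLS fit, which ignores Mix_Group, falls below 0.05. A threshold of 0.5 would also flag the mixed-model and raw-read slopes, so a "not correlated" reading of those two fits only holds at the stricter cutoff.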
#Good MAG counts were correlated with read counts of trimmed and decontaminated reads (mixed model, p = 0.001)
#Good MAG counts were correlated with read counts of raw reads at alpha = 0.05 (OLS, p = 0.004)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_good_bins ~ tMreads", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_good_bins ~ tMreads', data=df).fit()
model2 = ols('r_good_bins ~ tMreads', data=df2).fit()
#adjusted R^2 = squared Pearson product-moment correlation coefficient (r^2), adjusted for the number of predictors
#for model2 (raw reads): adjusted R^2 = 0.297, so r = sqrt(0.297)
#adjusted Pearson's r = 0.545
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'tMreads', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'tMreads', fig=fig2)
           Mixed Linear Model Regression Results
==========================================================
Model:            MixedLM Dependent Variable: td_good_bins
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              3.4594
Min. group size:  16      Likelihood:         -204.1157
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
----------------------------------------------------------
           Coef.  Std.Err.    z    P>|z|  [0.025  0.975]
----------------------------------------------------------
Intercept  4.835     4.017  1.204  0.229  -3.037  12.708
tMreads    0.399     0.122  3.276  0.001   0.160   0.638
Group Var  3.774     1.555
==========================================================

                            OLS Regression Results
==============================================================================
Dep. Variable:            td_good_bins   R-squared:                     0.020
Model:                             OLS   Adj. R-squared:                0.009
Method:                  Least Squares   F-statistic:                   1.902
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):            0.171
Time:                         22:29:30   Log-Likelihood:              -215.55
No. Observations:                   96   AIC:                           435.1
Df Residuals:                       94   BIC:                           440.2
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
               coef    std err        t    P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
Intercept   14.7670      2.153    6.858    0.000     10.492     19.042
tMreads      0.0914      0.066    1.379    0.171     -0.040      0.223
==============================================================================
Omnibus:           0.619   Durbin-Watson:        1.508
Prob(Omnibus):     0.734   Jarque-Bera (JB):     0.550
Skew:              0.182   Prob(JB):             0.760
Kurtosis:          2.926   Cond. No.              297.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

                            OLS Regression Results
==============================================================================
Dep. Variable:             r_good_bins   R-squared:                     0.329
Model:                             OLS   Adj. R-squared:                0.297
Method:                  Least Squares   F-statistic:                   10.31
Date:                 Sun, 28 Mar 2021   Prob (F-statistic):          0.00419
Time:                         22:29:30   Log-Likelihood:              -68.680
No. Observations:                   23   AIC:                           141.4
Df Residuals:                       21   BIC:                           143.6
Df Model:                            1
Covariance Type:             nonrobust
==============================================================================
               coef    std err        t    P>|t|     [0.025     0.975]
------------------------------------------------------------------------------
Intercept   -0.3598      5.532   -0.065    0.949    -11.865     11.146
tMreads      0.5293      0.165    3.211    0.004      0.187      0.872
==============================================================================
Omnibus:           0.823   Durbin-Watson:        2.044
Prob(Omnibus):     0.663   Jarque-Bera (JB):     0.827
Skew:              0.372   Prob(JB):             0.661
Kurtosis:          2.444   Cond. No.              178.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
<Figure size 864x576 with 0 Axes>
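The mixed model and the pooled OLS fit above report quite different tMreads slopes for td_good_bins (0.399 versus 0.0914), which suggests that much of the variation sits between libraries rather than within them. One rough way to quantify this is the intraclass correlation. The sketch below assumes that "Group Var" in the MixedLM summary is the random-intercept variance and "Scale" is the residual variance; the function name `intraclass_correlation` is illustrative and not part of the original notebook.

```python
def intraclass_correlation(group_var, residual_var):
    """Share of total variance attributable to between-group differences.

    Assumes a random-intercept model: total variance = group_var + residual_var.
    """
    return group_var / (group_var + residual_var)


# Values copied from the MixedLM summary above (td_good_bins ~ tMreads)
icc = intraclass_correlation(group_var=3.774, residual_var=3.4594)
print(f"ICC = {icc:.3f}")
```

An ICC near 0.5 would mean that roughly half of the variance in good MAG counts is explained by which library (Mix_Group) a read set came from, which is consistent with the pooled OLS slope being much weaker than the within-group slope.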
#Good MAG counts were correlated with base counts of trimmed and decontaminated reads
#Good MAG counts were correlated with base counts of raw reads at alpha = 0.05
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#NOTE: r_tMreads and r_Bbases appear to follow a different library order than r_read_files and the r_*bins lists
#(e.g. 36.0129894 is the 10158.6_raw value in td_tMreads, but it is first in r_tMreads, where r_read_files lists 9117.5_raw);
#verify row alignment before interpreting model2
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_good_bins ~ Bbases", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_good_bins ~ Bbases', data=df).fit()
model2 = ols('r_good_bins ~ Bbases', data=df2).fit()
#adj R^2 = coefficient of determination adjusted for the number of predictors (model2, raw reads: 0.296)
#... its square root gives an adjusted analogue of Pearson's r: r = sqrt(0.296)
#adjusted Pearson's r = 0.544
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'Bbases', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'Bbases', fig=fig2)
Mixed Linear Model Regression Results
==========================================================
Model:            MixedLM Dependent Variable: td_good_bins
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              2.9316
Min. group size:  16      Likelihood:         -197.6067
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
----------------------------------------------------------
            Coef.  Std.Err.    z    P>|z|  [0.025  0.975]
----------------------------------------------------------
Intercept   3.079     2.863  1.076  0.282  -2.531   8.690
Bbases      0.309     0.058  5.359  0.000   0.196   0.421
Group Var   4.218     1.763
==========================================================
OLS Regression Results
==============================================================================
Dep. Variable:           td_good_bins   R-squared:                       0.059
Model:                            OLS   Adj. R-squared:                  0.049
Method:                 Least Squares   F-statistic:                     5.911
Date:                Sun, 28 Mar 2021   Prob (F-statistic):             0.0169
Time:                        22:29:45   Log-Likelihood:                -213.59
No. Observations:                  96   AIC:                             431.2
Df Residuals:                      94   BIC:                             436.3
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     12.9052      1.993      6.474      0.000       8.948      16.863
Bbases         0.1014      0.042      2.431      0.017       0.019       0.184
==============================================================================
Omnibus:                        0.552   Durbin-Watson:                   1.429
Prob(Omnibus):                  0.759   Jarque-Bera (JB):                0.576
Skew:                           0.175   Prob(JB):                        0.750
Kurtosis:                       2.854   Cond. No.                         413.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
OLS Regression Results
==============================================================================
Dep. Variable:            r_good_bins   R-squared:                       0.328
Model:                            OLS   Adj. R-squared:                  0.296
Method:                 Least Squares   F-statistic:                     10.26
Date:                Sun, 28 Mar 2021   Prob (F-statistic):            0.00428
Time:                        22:29:45   Log-Likelihood:                -68.700
No. Observations:                  23   AIC:                             141.4
Df Residuals:                      21   BIC:                             143.7
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     -0.3125      5.533     -0.056      0.955     -11.818      11.193
Bbases         0.3497      0.109      3.203      0.004       0.123       0.577
==============================================================================
Omnibus:                        0.824   Durbin-Watson:                   2.042
Prob(Omnibus):                  0.662   Jarque-Bera (JB):                0.831
Skew:                           0.369   Prob(JB):                        0.660
Kurtosis:                       2.432   Cond. No.                         268.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
<Figure size 864x576 with 0 Axes>
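The cell above derives an "adjusted Pearson's r" by taking the square root of the adjusted R-squared of the raw-read model. As a check on that arithmetic, the standard adjustment formula can be applied to the reported R-squared (0.328, with 23 observations and one predictor). This is a minimal sketch; the helper name `adjusted_r2` is illustrative and does not come from the notebook.

```python
import math


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values copied from the r_good_bins ~ Bbases OLS summary above
adj = adjusted_r2(r2=0.328, n_obs=23, n_predictors=1)
r_adj = math.sqrt(adj)  # only meaningful when adj is non-negative
print(f"adjusted R^2 = {adj:.3f}, sqrt = {r_adj:.3f}")
```

Note that the square root drops the sign of the relationship; here the slope is positive, so the positive root is the appropriate one.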
#Medium MAG counts were correlated with read counts of trimmed and decontaminated reads
#Medium MAG counts were correlated with read counts of raw reads at alpha = 0.05
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
td_medium_bins = [32, 33, 32, 24, 26, 35, 33, 26, 29, 18, 35, 27, 18, 33, 32, 28, 22, 20, 20, 23, 23, 21, 25, 22, 21, 22, 22, 23, 22, 22, 26, 21, 28, 27, 30, 26, 25, 27, 29, 23, 24, 21, 28, 23, 20, 26, 26, 24, 36, 34, 31, 26, 29, 36, 33, 29, 27, 21, 39, 24, 22, 37, 32, 28, 28, 28, 31, 26, 32, 30, 32, 29, 28, 29, 31, 28, 30, 32, 30, 27, 22, 23, 28, 26, 24, 23, 24, 25, 27, 23, 25, 27, 25, 25, 25, 25]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_medium_bins = [24, 9, 37, 28, 22, 36, 41, 14, 34, 26, 28, 27, 33, 22, 16, 36, 22, 32, 28, 31, 14, 16, 20]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#NOTE: r_tMreads and r_Bbases appear to follow a different library order than r_read_files and the r_*bins lists
#(e.g. 36.0129894 is the 10158.6_raw value in td_tMreads, but it is first in r_tMreads, where r_read_files lists 9117.5_raw);
#verify row alignment before interpreting model2
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_medium_bins': td_medium_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_medium_bins': r_medium_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_medium_bins ~ tMreads", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_medium_bins ~ tMreads', data=df).fit()
model2 = ols('r_medium_bins ~ tMreads', data=df2).fit()
#adj R^2 = coefficient of determination adjusted for the number of predictors (model2, raw reads: 0.207)
#... its square root gives an adjusted analogue of Pearson's r: r = sqrt(0.207)
#adjusted Pearson's r = 0.455
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'tMreads', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'tMreads', fig=fig2)
Mixed Linear Model Regression Results
============================================================
Model:            MixedLM Dependent Variable: td_medium_bins
No. Observations: 96      Method:             REML
No. Groups:       6       Scale:              10.0592
Min. group size:  16      Likelihood:         -254.6484
Max. group size:  16      Converged:          Yes
Mean group size:  16.0
------------------------------------------------------------
            Coef.   Std.Err.    z     P>|z|   [0.025  0.975]
------------------------------------------------------------
Intercept   -1.702     6.623  -0.257  0.797  -14.682  11.279
tMreads      0.883     0.200   4.422  0.000    0.492   1.275
Group Var   12.867     2.927
============================================================
OLS Regression Results
==============================================================================
Dep. Variable:         td_medium_bins   R-squared:                       0.119
Model:                            OLS   Adj. R-squared:                  0.110
Method:                 Least Squares   F-statistic:                     12.73
Date:                Sun, 11 Apr 2021   Prob (F-statistic):           0.000568
Time:                        12:21:36   Log-Likelihood:                -274.12
No. Observations:                  96   AIC:                             552.2
Df Residuals:                      94   BIC:                             557.4
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     12.7682      3.963      3.222      0.002       4.900      20.636
tMreads        0.4352      0.122      3.568      0.001       0.193       0.677
==============================================================================
Omnibus:                        3.754   Durbin-Watson:                   1.392
Prob(Omnibus):                  0.153   Jarque-Bera (JB):                3.763
Skew:                           0.463   Prob(JB):                        0.152
Kurtosis:                       2.711   Cond. No.                         297.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
OLS Regression Results
==============================================================================
Dep. Variable:          r_medium_bins   R-squared:                       0.243
Model:                            OLS   Adj. R-squared:                  0.207
Method:                 Least Squares   F-statistic:                     6.745
Date:                Sun, 11 Apr 2021   Prob (F-statistic):             0.0168
Time:                        12:21:36   Log-Likelihood:                -78.203
No. Observations:                  23   AIC:                             160.4
Df Residuals:                      21   BIC:                             162.7
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      4.5665      8.370      0.546      0.591     -12.840      21.974
tMreads        0.6476      0.249      2.597      0.017       0.129       1.166
==============================================================================
Omnibus:                        1.144   Durbin-Watson:                   2.336
Prob(Omnibus):                  0.564   Jarque-Bera (JB):                1.006
Skew:                           0.461   Prob(JB):                        0.605
Kurtosis:                       2.554   Cond. No.                         178.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
/kb/runtime/lib/python3.6/site-packages/statsmodels/graphics/regressionplots.py:221: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
  ax = fig.add_subplot(2, 2, 1)
/kb/runtime/lib/python3.6/site-packages/statsmodels/graphics/regressionplots.py:231: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
  ax = fig.add_subplot(2, 2, 2)
/kb/runtime/lib/python3.6/site-packages/statsmodels/graphics/regressionplots.py:238: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
  ax = fig.add_subplot(2, 2, 3)
/kb/runtime/lib/python3.6/site-packages/statsmodels/graphics/regressionplots.py:251: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
  ax = fig.add_subplot(2, 2, 4)
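The mixed-model summary above reports a group variance and a residual scale, which can be combined into an intraclass correlation (ICC) to gauge how much of the variation in medium bin counts lies between read sets. The sketch below is only a worked calculation on the printed numbers. It assumes that "Group Var" is the random-intercept variance and "Scale" is the residual variance; the function name `intraclass_correlation` is introduced here for illustration and is not part of statsmodels.

```python
# Values read off the MixedLM summary above (td_medium_bins ~ tMreads).
group_var = 12.867   # "Group Var": assumed to be the random-intercept variance
scale = 10.0592      # "Scale": assumed to be the residual variance


def intraclass_correlation(group_variance, residual_variance):
    """Share of total variance attributable to between-group differences."""
    return group_variance / (group_variance + residual_variance)


icc = intraclass_correlation(group_var, scale)
print(f"ICC = {icc:.3f}")
```

Under those assumptions, an ICC above one half would suggest that observations within a Mix_Group are far from independent, which is one reason the pooled OLS slope and the mixed-model slope can differ.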
#Medium MAG counts were correlated with base counts of trimmed and decontaminated reads
#Medium MAG counts were correlated with base counts of raw reads at alpha = 0.05
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.formula.api import ols
#data
Mix_Group = ['10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '10158.6', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9117.8', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9108.2', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '9117.7', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '11306.3', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4', '9117.4']
td_read_files = ['10158.6_raw', '10158.6_qc', '10158.6_trim150', '10158.6_ftrim', '10158.6_ktrim', '10158.6_atrim', '10158.6_aqbtrim', '10158.6_aqtrim', '10158.6_qbtrim', '10158.6_qtrim', '10158.6_bb1', '10158.6_bb2', '10158.6_bb3', '10158.6_bb4', '10158.6_bb5', '10158.6_bb6', '9117.8_raw', '9117.8_qc', '9117.8_trim150', '9117.8_ftrim', '9117.8_ktrim', '9117.8_atrim', '9117.8_aqbtrim', '9117.8_aqtrim', '9117.8_qbtrim', '9117.8_qtrim', '9117.8_bb1', '9117.8_bb2', '9117.8_bb3', '9117.8_bb4', '9117.8_bb5', '9117.8_bb6', '9108.2_raw', '9108.2_qc', '9108.2_trim150', '9108.2_ftrim', '9108.2_ktrim', '9108.2_atrim', '9108.2_aqbtrim', '9108.2_aqtrim', '9108.2_qbtrim', '9108.2_qtrim', '9108.2_bb1', '9108.2_bb2', '9108.2_bb3', '9108.2_bb4', '9108.2_bb5', '9108.2_bb6', '9117.7_raw', '9117.7_qc', '9117.7_trim150', '9117.7_ftrimmed', '9117.7_ktrimmed', '9117.7_atrimmed', '9117.7_aqbtrimmed', '9117.7_aqtrimmed', '9117.7_qbtrimmed', '9117.7_qtrimmed', '9117.7_bb1', '9117.7_bb2', '9117.7_bb3', '9117.7_bb4', '9117.7_bb5', '9117.7_bb6', '11306.3_raw', '11306.3_qc', '11306.3_trim150', '11306.3_ftrimmed', '11306.3_ktrimmed', '11306.3_atrimmed', '11306.3_aqbtrimmed', '11306.3_aqtrimmed', '11306.3_qbtrimmed', '11306.3_qtrimmed', '11306.3_bb1', '11306.3_bb2', '11306.3_bb3', '11306.3_bb4', '11306.3_bb5', '11306.3_bb6', '9117.4_raw', '9117.4_qc', '9117.4_trim150', '9117.4_ftrimmed', '9117.4_ktrimmed', '9117.4_atrimmed', '9117.4_aqbtrimmed', '9117.4_aqtrimmed', '9117.4_qbtrimmed', '9117.4_qtrimmed', '9117.4_bb1', '9117.4_bb2', '9117.4_bb3', '9117.4_bb4', '9117.4_bb5', '9117.4_bb6']
td_tMreads = [36.0129894, 35.2337896, 36.0129894, 36.0129894, 35.8983284, 35.933862, 34.552143, 31.2682706, 34.3282984, 30.6449696, 35.8442254, 31.2345964, 30.615651, 35.3058416, 34.4731868, 34.2527702, 28.2058246, 26.5787996, 28.2058246, 28.2058246, 27.66738, 27.6666568, 27.2469874, 25.5238858, 27.2469874, 25.2340502, 26.886397, 24.8035162, 24.519662, 26.7855648, 26.4733418, 26.3827484, 39.9148934, 37.3791976, 39.9148934, 39.9148934, 38.7122998, 38.708094, 37.7905962, 34.6650054, 37.6107554, 34.1456456, 37.8858836, 33.921016, 33.4111558, 37.7734844, 36.9821596, 36.8056446, 33.4711394, 32.6004118, 33.4711394, 33.4711394, 33.04149, 33.0383456, 30.3924568, 32.5038064, 30.0542808, 32.3956334, 32.9696938, 30.3601654, 30.025172, 32.8646394, 32.4398904, 32.3335082, 31.8428354, 31.3942576, 31.8428354, 31.8428354, 31.637986, 31.6358996, 29.2361886, 31.0176556, 28.9929926, 30.9364282, 31.6358006, 29.2361776, 28.9929822, 31.5146638, 31.0175998, 30.9363772, 34.9441596, 32.3312534, 34.9441596, 34.9441596, 33.4723724, 33.4683444, 30.8153706, 32.9387816, 30.4748512, 32.8289102, 32.6738296, 30.0963168, 29.7659662, 32.561628, 32.1535464, 32.0471204]
td_Bbases = [54.379613994, 52.641056249, 54.0194841, 51.498574842, 50.780358538, 53.669962132, 51.549471661, 45.964065176, 48.624647571, 42.845175292, 53.535374086, 45.919226343, 42.80751312, 53.311820816, 51.43795566, 48.522733084, 42.590795146, 40.133987396, 42.3087369, 40.334329178, 39.44260618, 41.667343504, 40.776782835, 37.725547193, 40.776782835, 35.427126679, 40.490156646, 36.660246851, 34.422663647, 40.446202848, 39.621053097, 37.463971015, 60.271489034, 56.442588376, 59.8723401, 57.078297562, 55.202040634, 58.310667004, 56.435790641, 51.019814812, 53.310062558, 47.762707137, 57.069909524, 49.92030233, 46.728770205, 57.037961444, 55.229109813, 52.166971365, 50.541420494, 49.226621818, 50.2067091, 47.863729342, 47.116772654, 49.7656626, 44.974095956, 48.667868199, 42.241274113, 46.022382453, 49.6630349, 44.930324539, 42.203344352, 49.625605494, 48.576089344, 45.937352987, 48.082681454, 47.024019995, 47.7642531, 45.535254622, 45.103175218, 47.640851402, 43.590661125, 46.584185484, 41.025529046, 44.063678425, 47.640703902, 43.590646205, 41.025515524, 47.587142338, 46.584108342, 44.063610873, 52.765680996, 48.820192634, 52.4162394, 49.970148228, 47.71891569, 50.406042552, 45.62052235, 49.326011714, 42.846059312, 46.640424087, 49.207539094, 44.559568777, 41.852323333, 49.16805828, 48.15227162, 45.531321684]
td_bins = [78, 85, 82, 83, 85, 83, 82, 78, 79, 63, 90, 72, 67, 78, 85, 83, 65, 62, 64, 52, 55, 59, 59, 56, 54, 50, 62, 55, 50, 60, 61, 53, 69, 76, 74, 75, 75, 71, 73, 72, 68, 65, 74, 71, 66, 74, 74, 65, 95, 103, 107, 96, 97, 103, 88, 98, 79, 81, 104, 90, 76, 101, 101, 90, 99, 100, 96, 97, 94, 97, 96, 97, 95, 82, 101, 96, 80, 97, 96, 91, 70, 68, 70, 65, 68, 69, 64, 61, 65, 64, 77, 62, 64, 78, 65, 62]
td_Mean_Completeness = [53.86, 53.28, 56.31, 57.37, 54.68, 50.1, 58.05, 54.04, 51.16, 54.76, 51.12, 52.23, 57.52, 52.15, 58.83, 54.29, 55.06, 54.27, 53.37, 53.16, 53.77, 57.81, 54.55, 54.88, 54.17, 52.29, 54.94, 52.04, 51.41, 58.36, 54.69, 56.81, 56.66, 53.72, 57.38, 56.22, 52.7, 52.65, 55.11, 54.88, 57.48, 54.64, 53.6, 54.34, 58.18, 52.11, 56.26, 58.28, 55.64, 57.45, 58.55, 54.54, 57.45, 58.65, 55.18, 55.5, 58.25, 60.02, 55.82, 58.17, 60.34, 56.8, 57.0, 59.14, 54.4, 51.04, 58.28, 56.73, 48.21, 51.68, 52.66, 56.77, 55.24, 57.44, 58.56, 57.33, 54.41, 56.39, 53.78, 58.04, 57.17, 59.53, 56.07, 51.82, 58.95, 60.91, 56.23, 56.59, 60.48, 59.56, 56.87, 61.29, 54.39, 60.71, 53.56, 53.89]
td_Mean_Contamination = [64.8, 56.57, 63.75, 56.14, 63.41, 65.77, 68.71, 59.0, 51.31, 53.39, 53.35, 56.24, 59.82, 71.72, 69.63, 66.78, 53.25, 46.8, 58.27, 54.69, 48.23, 50.47, 57.38, 53.78, 50.02, 45.22, 47.86, 47.19, 54.22, 48.53, 54.71, 53.82, 83.63, 71.19, 88.26, 89.13, 73.4, 65.59, 87.02, 85.4, 79.15, 67.55, 71.37, 72.84, 82.49, 64.29, 92.78, 85.56, 107.95, 99.99, 99.71, 85.91, 99.47, 78.31, 82.71, 107.35, 97.95, 92.98, 87.49, 93.41, 102.6, 77.96, 97.57, 99.91, 83.7, 63.09, 68.23, 69.77, 73.61, 78.21, 70.56, 69.82, 69.56, 64.5, 63.16, 82.06, 72.7, 69.26, 75.71, 62.79, 62.01, 54.02, 69.1, 58.79, 55.27, 51.99, 60.61, 57.98, 59.85, 65.24, 56.41, 58.49, 58.44, 53.79, 57.06, 54.33]
td_good_bins = [21, 19, 19, 14, 15, 20, 20, 17, 17, 12, 19, 16, 12, 22, 20, 17, 16, 16, 17, 17, 15, 17, 19, 15, 15, 14, 16, 16, 15, 18, 19, 16, 18, 17, 17, 16, 16, 16, 17, 17, 14, 15, 17, 15, 15, 19, 15, 17, 22, 21, 18, 19, 21, 22, 17, 21, 21, 16, 23, 16, 17, 24, 20, 21, 19, 21, 21, 15, 18, 19, 20, 18, 15, 17, 19, 18, 18, 18, 19, 16, 17, 17, 19, 18, 18, 17, 19, 17, 19, 18, 18, 16, 18, 21, 20, 17]
td_good_Mean_Completeness = [85.98, 87.17, 86.9, 87.66, 86.64, 86.04, 85.18, 86.79, 86.35, 86.03, 88.47, 85.58, 89.46, 86.17, 83.85, 86.86, 87.87, 87.38, 87.61, 86.94, 87.35, 86.96, 88.23, 88.04, 88.62, 90.16, 88.26, 89.11, 86.1, 88.26, 87.55, 87.21, 87.51, 86.6, 87.87, 87.83, 86.62, 87.87, 87.12, 87.7, 87.06, 87.94, 87.37, 85.69, 87.28, 85.92, 88.11, 86.67, 87.68, 87.33, 87.89, 88.48, 89.21, 88.4, 86.1, 86.69, 87.99, 88.53, 89.18, 87.33, 86.83, 88.58, 87.12, 87.34, 88.54, 86.69, 87.03, 86.06, 88.99, 86.81, 86.17, 86.12, 87.78, 85.64, 86.41, 87.08, 85.98, 88.56, 87.42, 87.18, 85.99, 87.07, 86.97, 86.86, 86.75, 89.43, 86.51, 86.19, 86.17, 85.87, 86.68, 87.63, 86.26, 89.38, 87.22, 87.03]
td_good_Mean_Contamination = [5.06, 4.78, 4.28, 4.58, 3.83, 4.25, 4.66, 4.68, 4.57, 4.01, 4.01, 4.34, 4.43, 4.96, 4.23, 5.17, 3.79, 3.63, 3.76, 3.61, 3.99, 3.66, 4.04, 3.17, 3.67, 3.37, 4.3, 3.39, 3.89, 3.63, 4.05, 4.51, 3.87, 3.93, 3.27, 3.48, 4.16, 4.88, 3.89, 3.16, 4.65, 4.07, 4.41, 4.22, 4.07, 3.78, 3.85, 4.16, 3.92, 3.06, 4.2, 4.41, 3.21, 3.81, 4.31, 3.64, 3.84, 4.09, 3.84, 3.64, 3.48, 3.71, 3.4, 4.06, 3.76, 4.32, 4.61, 4.74, 3.56, 4.54, 4.1, 4.05, 4.26, 4.31, 4.84, 4.06, 3.88, 4.59, 4.63, 3.76, 3.57, 3.88, 3.56, 3.5, 3.68, 4.71, 3.46, 3.84, 3.94, 3.61, 3.5, 3.36, 3.4, 3.57, 3.91, 4.21]
td_medium_bins = [32, 33, 32, 24, 26, 35, 33, 26, 29, 18, 35, 27, 18, 33, 32, 28, 22, 20, 20, 23, 23, 21, 25, 22, 21, 22, 22, 23, 22, 22, 26, 21, 28, 27, 30, 26, 25, 27, 29, 23, 24, 21, 28, 23, 20, 26, 26, 24, 36, 34, 31, 26, 29, 36, 33, 29, 27, 21, 39, 24, 22, 37, 32, 28, 28, 28, 31, 26, 32, 30, 32, 29, 28, 29, 31, 28, 30, 32, 30, 27, 22, 23, 28, 26, 24, 23, 24, 25, 27, 23, 25, 27, 25, 25, 25, 25]
r_read_files = ['9117.5_raw', '10158.8_raw', '11263.1_raw', '11306.3_raw', '11306.1_raw', '11260.6_raw', '11260.5_raw', '9108.1_raw', '9053.2_raw', '9672.8_raw', '9108.2_raw', '9053.4_raw', '9053.3_raw', '9117.4_raw', '9117.6_raw', '9117.7_raw', '9117.8_raw', '10158.6_raw', '10186.3_raw', '10186.4_raw', '7331.1_raw', '9053.5_raw', '9041.8_raw']
r_tMreads = [36.0129894, 17.6218972, 38.2800142, 34.9076424, 35.3037194, 37.1504476, 40.3613864, 20.7773948, 31.8428354, 30.1166938, 27.718318, 40.8492618, 39.7169858, 34.4581152, 26.9696492, 21.3309852, 39.9148934, 34.9441596, 35.690255, 35.5019026, 33.4711394, 28.2058246, 36.96984]
r_Bbases = [54.379613994, 26.609064772, 57.802821442, 52.710540024, 53.308616294, 56.097175876, 60.945693464, 31.373866148, 48.082681454, 45.1750407, 41.85466018, 61.682385318, 59.972648558, 52.031753952, 40.724170292, 32.209787652, 60.271489034, 52.765680996, 53.89228505, 53.607872926, 50.541420494, 42.590795146, 55.8244584]
#NOTE: r_tMreads and r_Bbases appear to be in a different sample order than r_read_files and the r_* bin lists
#(e.g. their first values match td_tMreads/td_Bbases for 10158.6_raw, which is the 18th entry of r_read_files);
#verify the ordering before interpreting model2
r_bins = [65, 47, 139, 99, 55, 90, 115, 38, 86, 87, 69, 71, 95, 70, 62, 95, 65, 78, 85, 109, 45, 49, 52]
r_Mean_Completeness = [45.96, 58.83, 51.28, 56.54, 56.55, 63.26, 52.23, 58.47, 54.69, 53.7, 58.18, 60.32, 62.41, 65.52, 52.14, 50.97, 56.26, 57.0, 65.39, 54.89, 53.78, 53.56, 52.81]
r_Mean_Contamination = [24.03, 69.63, 67.13, 60.4, 53.21, 81.43, 55.06, 35.14, 54.71, 74.62, 54.7, 76.81, 66.86, 78.94, 67.85, 46.5, 92.78, 97.57, 121.14, 103.47, 75.71, 57.06, 65.71]
r_good_bins = [15, 4, 23, 19, 14, 25, 29, 9, 20, 20, 18, 18, 23, 17, 10, 22, 16, 21, 18, 20, 9, 9, 14]
r_medium_bins = [24, 9, 37, 28, 22, 36, 41, 14, 34, 26, 28, 27, 33, 22, 16, 36, 22, 32, 28, 31, 14, 16, 20]
r_good_Mean_Completeness = [84.79, 83.85, 89.19, 87.14, 90.06, 87.41, 90.7, 86.6, 87.55, 85.4, 87.31, 86.68, 86.18, 84.94, 87.02, 88.92, 88.11, 87.12, 87.24, 89.3, 87.42, 87.22, 86.64]
r_good_Mean_Contamination = [3.37, 4.23, 3.57, 4.4, 3.62, 4.4, 4.56, 3.0, 4.05, 4.32, 4.53, 3.9, 4.48, 4.17, 2.86, 2.87, 3.85, 3.4, 3.63, 4.37, 4.63, 3.91, 4.25]
#create dataset
df = pd.DataFrame({'td_tMreads': td_tMreads,
'td_Bbases': td_Bbases,
'td_bins': td_bins,
'td_Mean_Completeness': td_Mean_Completeness,
'td_Mean_Contamination': td_Mean_Contamination,
'td_good_bins': td_good_bins,
'td_medium_bins': td_medium_bins,
'td_good_Mean_Completeness': td_good_Mean_Completeness,
'td_good_Mean_Contamination': td_good_Mean_Contamination,
'Mix_Group': Mix_Group})
df.rename(columns={'td_tMreads': 'tMreads', 'td_Bbases': 'Bbases'}, inplace=True)
df2 = pd.DataFrame({'r_tMreads': r_tMreads,
'r_Bbases': r_Bbases,
'r_bins': r_bins,
'r_Mean_Completeness': r_Mean_Completeness,
'r_Mean_Contamination': r_Mean_Contamination,
'r_good_bins': r_good_bins,
'r_medium_bins': r_medium_bins,
'r_good_Mean_Completeness': r_good_Mean_Completeness,
'r_good_Mean_Contamination': r_good_Mean_Contamination})
df2.rename(columns={'r_tMreads': 'tMreads', 'r_Bbases': 'Bbases'}, inplace=True)
#view dataset
#print(df)
#fit regression model
model = smf.mixedlm("td_medium_bins ~ Bbases", data=df, groups=df["Mix_Group"])
modelf = model.fit()
model1 = ols('td_medium_bins ~ Bbases', data=df).fit()
model2 = ols('r_medium_bins ~ Bbases', data=df2).fit()
#adj R^2 = squared Pearson product-moment correlation coefficient (r^2), adjusted for the number of predictors
#... r = sqrt(0.296)
#adjusted Pearson's r = 0.544
#mdf = md.fit()
#print(mdf.summary())
#view model summary
print(modelf.summary())
print(model1.summary())
print(model2.summary())
#define figure size
fig = plt.figure(figsize=(12,8))
fig2 = plt.figure(figsize=(12,8))
#produce regression plots
fig = sm.graphics.plot_regress_exog(model1, 'Bbases', fig=fig)
fig2 = sm.graphics.plot_regress_exog(model2, 'Bbases', fig=fig2)
Mixed Linear Model Regression Results
============================================================
Model:            MixedLM  Dependent Variable: td_medium_bins
No. Observations: 96       Method:             REML
No. Groups:       6        Scale:              8.4560
Min. group size:  16       Likelihood:         -247.5058
Max. group size:  16       Converged:          Yes
Mean group size:  16.0
-------------------------------------------------------------
             Coef.  Std.Err.      z  P>|z|   [0.025  0.975]
-------------------------------------------------------------
Intercept   -2.118     4.767 -0.444  0.657  -11.462   7.226
Bbases       0.610     0.095  6.390  0.000    0.423   0.797
Group Var   12.750     3.052
============================================================

OLS Regression Results
==============================================================================
Dep. Variable:     td_medium_bins     R-squared:           0.178
Model:             OLS                Adj. R-squared:      0.169
Method:            Least Squares      F-statistic:         20.29
Date:              Sun, 11 Apr 2021   Prob (F-statistic):  1.91e-05
Time:              18:21:14           Log-Likelihood:      -270.83
No. Observations:  96                 AIC:                 545.7
Df Residuals:      94                 BIC:                 550.8
Df Model:          1
Covariance Type:   nonrobust
==============================================================================
               coef  std err      t  P>|t|  [0.025  0.975]
------------------------------------------------------------------------------
Intercept   10.6339    3.619  2.939  0.004   3.449  17.819
Bbases       0.3412    0.076  4.504  0.000   0.191   0.492
==============================================================================
Omnibus:        3.934   Durbin-Watson:     1.321
Prob(Omnibus):  0.140   Jarque-Bera (JB):  3.838
Skew:           0.441   Prob(JB):          0.147
Kurtosis:       2.575   Cond. No.          413.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

OLS Regression Results
==============================================================================
Dep. Variable:     r_medium_bins      R-squared:           0.243
Model:             OLS                Adj. R-squared:      0.207
Method:            Least Squares      F-statistic:         6.733
Date:              Sun, 11 Apr 2021   Prob (F-statistic):  0.0169
Time:              18:21:14           Log-Likelihood:      -78.208
No. Observations:  23                 AIC:                 160.4
Df Residuals:      21                 BIC:                 162.7
Df Model:          1
Covariance Type:   nonrobust
==============================================================================
               coef  std err      t  P>|t|   [0.025  0.975]
------------------------------------------------------------------------------
Intercept    4.5998    8.365  0.550  0.588  -12.796  21.995
Bbases       0.4283    0.165  2.595  0.017    0.085   0.772
==============================================================================
Omnibus:        1.130   Durbin-Watson:     2.333
Prob(Omnibus):  0.568   Jarque-Bera (JB):  1.000
Skew:           0.458   Prob(JB):          0.607
Kurtosis:       2.548   Cond. No.          268.
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
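For a simple regression with a single predictor, R-squared equals the square of Pearson's r, and the adjusted value follows from the number of observations and predictors. The sketch below works this through for the two OLS fits on Bbases, using only the R-squared values and observation counts printed in the summaries above. The helper `adjusted_r2` and the dictionary `reported` are hypothetical names introduced for illustration.

```python
import math


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# (R-squared, number of observations) as printed in the OLS summaries above.
reported = {
    "td_medium_bins ~ Bbases": (0.178, 96),
    "r_medium_bins ~ Bbases": (0.243, 23),
}

results = {}
for label, (r2, n_obs) in reported.items():
    abs_r = math.sqrt(r2)            # magnitude of Pearson's r; the sign follows the slope
    adj = adjusted_r2(r2, n_obs, 1)  # one predictor (Bbases)
    results[label] = (abs_r, adj)
    print(f"{label}: |r| = {abs_r:.3f}, adjusted R^2 = {adj:.3f}")
```

Because the inputs are rounded to three decimals, the recomputed adjusted values should agree with the summaries only to about that precision.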